Stencil computation is widely used in scientific computing, and many accelerators based on multicore CPUs and GPUs have been proposed. Because stencil computation has a low operational intensity, high performance usually requires a large external memory bandwidth. FPGAs have the potential to solve this problem by using their large internal memory efficiently. However, implementing an FPGA architecture successfully requires a great deal of design, testing, and debugging time. To solve this problem, we propose an FPGA platform programmed in Open Computing Language (OpenCL), a C-like programming language. We also propose an optimization methodology to find the optimal architecture for a given application using the proposed FPGA platform. According to the experimental results, we achieved 119 to 237 Gflop/s of processing power and a higher processing speed than conventional GPU and multicore CPU implementations.
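To make the point about low operational intensity concrete, the following is a minimal sketch in plain C, not the OpenCL kernels of the paper. It assumes a 2D 5-point averaging stencil on single-precision data; the function names `stencil5_step` and `operational_intensity`, the 0.2 weight, and the flop and byte counts in the note below are illustrative assumptions rather than figures from the paper.

```c
#include <assert.h>
#include <stddef.h>

/* One sweep of a 5-point averaging stencil over the interior of an
 * nx-by-ny grid stored row-major. Boundary cells of `out` are left
 * untouched. The stencil shape and the 0.2 weight are illustrative. */
void stencil5_step(const float *in, float *out, size_t nx, size_t ny)
{
    assert(nx >= 3 && ny >= 3);
    for (size_t y = 1; y + 1 < ny; ++y) {
        for (size_t x = 1; x + 1 < nx; ++x) {
            size_t i = y * nx + x;
            /* 4 additions + 1 multiplication = 5 flops per cell */
            out[i] = 0.2f * (in[i] + in[i - 1] + in[i + 1]
                             + in[i - nx] + in[i + nx]);
        }
    }
}

/* Operational intensity: flops performed per byte moved to or from
 * external memory for one cell update. */
double operational_intensity(double flops_per_cell, double bytes_per_cell)
{
    return flops_per_cell / bytes_per_cell;
}
```

Under these assumptions, each cell costs 5 flops. If every neighbor is fetched from external memory, the update moves 5 reads plus 1 write of 4 bytes each, or 24 bytes, and `operational_intensity(5.0, 24.0)` is roughly 0.21 flop/byte. If on-chip memory holds the neighboring rows so that each value is read from external memory only once, the traffic drops to 8 bytes per cell and `operational_intensity(5.0, 8.0)` gives 0.625 flop/byte. That kind of on-chip reuse is what a large internal memory makes possible.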
|Journal||IEEE Transactions on Parallel and Distributed Systems|
|Publication status||Published - May 1, 2017|