TY - JOUR
T1 - OpenCL-based FPGA-platform for stencil computation and its optimization methodology
AU - Waidyasooriya, Hasitha Muthumala
AU - Takei, Yasuhiro
AU - Tatsumi, Shunsuke
AU - Hariyama, Masanori
N1 - Publisher Copyright: © 1990-2012 IEEE.
PY - 2017/5/1
Y1 - 2017/5/1
N2 - Stencil computation is widely used in scientific computations, and many accelerators based on multicore CPUs and GPUs have been proposed. Stencil computation has a low operational intensity, so a large external memory bandwidth is usually required for high performance. FPGAs have the potential to solve this problem by utilizing large internal memory efficiently. However, a very long design, testing, and debugging time is required to implement an FPGA architecture successfully. To solve this problem, we propose an FPGA-platform using a C-like programming language called Open Computing Language (OpenCL). We also propose an optimization methodology to find the optimal architecture for a given application using the proposed FPGA-platform. According to the experimental results, we achieved 119 to 237 Gflop/s of processing power and higher processing speed compared to conventional GPU and multicore CPU implementations.
AB - Stencil computation is widely used in scientific computations, and many accelerators based on multicore CPUs and GPUs have been proposed. Stencil computation has a low operational intensity, so a large external memory bandwidth is usually required for high performance. FPGAs have the potential to solve this problem by utilizing large internal memory efficiently. However, a very long design, testing, and debugging time is required to implement an FPGA architecture successfully. To solve this problem, we propose an FPGA-platform using a C-like programming language called Open Computing Language (OpenCL). We also propose an optimization methodology to find the optimal architecture for a given application using the proposed FPGA-platform. According to the experimental results, we achieved 119 to 237 Gflop/s of processing power and higher processing speed compared to conventional GPU and multicore CPU implementations.
KW - FDTD
KW - OpenCL for FPGA
KW - high performance computing
KW - stencil computation
UR - http://www.scopus.com/inward/record.url?scp=85018165790&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85018165790&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2016.2614981
DO - 10.1109/TPDS.2016.2614981
M3 - Article
AN - SCOPUS:85018165790
VL - 28
SP - 1390
EP - 1402
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
SN - 1045-9219
IS - 5
M1 - 7582502
ER -