Multi-FPGA Accelerator Architecture for Stencil Computation Exploiting Spacial and Temporal Scalability

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)


Since the introduction of OpenCL-based FPGA accelerator design, FPGAs have become increasingly popular in high-performance computing. The key to achieving high performance on FPGAs is to design pipelined accelerators. The pipeline depth can be extended beyond the boundary of a single FPGA by connecting multiple FPGAs through high-speed QSFP (quad small form-factor pluggable) connectors. Such a deeply pipelined multi-FPGA accelerator works similarly to a single very large FPGA. In this paper, we propose a multi-FPGA accelerator architecture for stencil computation that scales in both the spatial and temporal dimensions. In our experiments, we achieved up to 950 GFLOP/s using one FPGA and nearly doubled that performance using two FPGAs. The accelerator achieves high power efficiency while delivering performance competitive with high-end GPUs.
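The temporal-scaling idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the 1-D 3-point averaging stencil and the names `stencil_step`, `run_pipeline`, and `run_two_devices` are assumptions chosen for illustration. Each pipeline stage is modeled as one time step, and splitting the stages across two devices is modeled as feeding the output of the first chain into the second.

```python
def stencil_step(grid):
    """One time step of a 1-D 3-point averaging stencil (assumed for illustration).

    Boundary cells are held fixed; each interior cell becomes the mean of
    itself and its two neighbours.
    """
    out = list(grid)
    for i in range(1, len(grid) - 1):
        out[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return out


def run_pipeline(grid, stages):
    """Chain `stages` time steps, as a pipeline with `stages` stages would."""
    for _ in range(stages):
        grid = stencil_step(grid)
    return grid


def run_two_devices(grid, stages_per_device):
    """Model two chained devices: the first streams its output to the second."""
    intermediate = run_pipeline(grid, stages_per_device)   # hypothetical device 0
    return run_pipeline(intermediate, stages_per_device)   # hypothetical device 1


initial = [0.0] * 4 + [9.0] + [0.0] * 4

# A 4-stage pipeline on one device and two chained 2-stage pipelines
# apply the same operations in the same order, so the results match.
single = run_pipeline(initial, 4)
split = run_two_devices(initial, 2)
assert single == split
```

In this model, doubling the number of devices doubles the number of time steps computed per pass over the data, which is consistent with the near-doubling of performance reported for two FPGAs. Spatial scaling (processing several cells of the same time step in parallel) is not modeled here.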

Original language: English
Article number: 8689014
Pages (from-to): 53188-53201
Number of pages: 14
Journal: IEEE Access
Publication status: Published - 2019


Keywords

  • OpenCL for FPGA
  • high performance computing
  • multi-FPGA acceleration
  • stencil computation

ASJC Scopus subject areas

  • Computer Science (all)
  • Materials Science (all)
  • Engineering (all)


