Scalable streaming-array of simple soft-processors for stencil computations with constant memory-bandwidth

Kentaro Sano, Yoshiaki Hatsuda, Satoru Yamamoto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

23 Citations (Scopus)

Abstract

Stencil computation is one of the important kernels in scientific computations; however, its sustained performance is limited by memory bandwidth, especially on multi-core microprocessors and GPGPUs, due to its small operational intensity. In this paper, we propose a scalable streaming-array (SSA) of simple soft-processors for high-performance stencil computation on multiple FPGAs. The SSA architecture allows a multi-device system to scale its computing performance linearly by deep pipelining while keeping the external-memory bandwidth constant. We present an array structure of programmable cores optimized for stencil computations and formulate a performance model of pipelined execution on the array. For Jacobi computations, SSA implemented on nine Stratix III FPGAs with a memory bandwidth of only 2 GB/s achieves 260 GFlop/s, corresponding to 87.4% of its peak performance, at 1.3 GFlop/s/W. We demonstrate that SSA provides almost linear speedup for computations of medium size and larger, as expected from the performance model. This high utilization and scalability show the great potential of custom computing on reconfigurable devices as a power-efficient, high-performance computing platform.
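
The figures quoted in the abstract imply a few further quantities that it does not state directly. The following is a minimal sketch of that arithmetic, not the paper's performance model. It assumes the 87.4 figure is a percentage of peak, that the power efficiency refers to the same 260 GFlop/s run, and that the 2 GB/s of external-memory bandwidth is fully used. The function and key names are illustrative only.

```python
# Figures quoted in the abstract.
SUSTAINED_GFLOPS = 260.0        # sustained performance on nine FPGAs
PEAK_FRACTION = 0.874           # assumption: "87.4" means 87.4% of peak
GFLOPS_PER_WATT = 1.3           # quoted power efficiency
MEM_BANDWIDTH_GBS = 2.0         # external-memory bandwidth in GB/s
NUM_FPGAS = 9                   # Stratix III devices


def derived_figures(sustained, peak_fraction, gflops_per_watt, bandwidth, devices):
    """Return quantities implied by the quoted figures (hypothetical helper)."""
    return {
        # Peak performance implied by the utilization figure.
        "implied_peak_gflops": sustained / peak_fraction,
        # Power implied if the efficiency figure covers the same run.
        "implied_power_watts": sustained / gflops_per_watt,
        # Sustained flops per byte of external-memory traffic,
        # assuming the quoted bandwidth is fully used.
        "flops_per_external_byte": sustained / bandwidth,
        # Average sustained performance per device.
        "sustained_gflops_per_fpga": sustained / devices,
    }


figures = derived_figures(
    SUSTAINED_GFLOPS, PEAK_FRACTION, GFLOPS_PER_WATT, MEM_BANDWIDTH_GBS, NUM_FPGAS
)
for name, value in figures.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions, each byte read from external memory supports roughly 130 floating-point operations, far more than a single stencil sweep provides. This is consistent with the abstract's point that deep pipelining across devices lets performance scale while memory bandwidth stays constant.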

Original language: English
Title of host publication: Proceedings - IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011
Pages: 234-241
Number of pages: 8
Publication status: Published - 2011
Event: 19th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011 - Salt Lake City, UT, United States
Duration: 2011 May 1 to 2011 May 3

Publication series

Name: Proceedings - IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011

Other

Other: 19th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011
Country: United States
City: Salt Lake City, UT
Period: 11/5/1 to 11/5/3

Keywords

  • FPGA
  • High-performance computation
  • stencil computation
  • scalable streaming-array

ASJC Scopus subject areas

  • Hardware and Architecture
