TY - GEN
T1 - Benchmarks for FPGA-Targeted High-Level-Synthesis
AU - Waidyasooriya, Hasitha Muthumala
AU - Iimura, Yasuaki
AU - Hariyama, Masanori
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
N2 - Recently, high-level synthesis (HLS) tools such as 'Intel FPGA SDK for OpenCL' and 'Xilinx SDAccel' have been introduced for designing FPGA accelerators. These HLS tools use a C-language-based environment to significantly reduce design time. However, it is also important to know how much performance can be achieved using HLS tools. An FPGA is highly reconfigurable hardware, and its performance varies greatly depending on the architecture. Performance also depends on the FPGA board, the HLS software, and firmware such as the board support package (BSP). Therefore, benchmarking FPGAs is an extremely challenging task. This paper proposes a method to design benchmarks for FPGA-targeted HLS. The benchmarks are highly scalable and can be used with different FPGAs and compilers to obtain most of the potential performance. Evaluating four different FPGAs, we found that single-precision floating-point performance varies from 17 GFLOPS to 3,955 GFLOPS depending on the operation and the FPGA. We obtained 64% and 43% of the peak single-precision performance for Arria 10 and Stratix 10 FPGAs, respectively. Fixed-point performance depends heavily on the bit width and varies from 8 GOPS to 12,800 GOPS.
KW - FPGA accelerator
KW - High performance computing
KW - High-level synthesis
KW - Performance tuning
UR - http://www.scopus.com/inward/record.url?scp=85078941936&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078941936&partnerID=8YFLogxK
U2 - 10.1109/CANDAR.2019.00038
DO - 10.1109/CANDAR.2019.00038
M3 - Conference contribution
AN - SCOPUS:85078941936
T3 - Proceedings - 2019 7th International Symposium on Computing and Networking, CANDAR 2019
SP - 232
EP - 238
BT - Proceedings - 2019 7th International Symposium on Computing and Networking, CANDAR 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Symposium on Computing and Networking, CANDAR 2019
Y2 - 26 November 2019 through 29 November 2019
ER -