TY - GEN
T1 - An OpenCL-Like Offload Programming Framework for SX-Aurora TSUBASA
AU - Takizawa, Hiroyuki
AU - Shiotsuki, Shinji
AU - Ebata, Naoki
AU - Egawa, Ryusuke
N1 - Funding Information:
This work is partially supported by the MEXT Next Generation High-Performance Computing Infrastructures and Applications R&D Program “R&D of A Quantum-Annealing-Assisted Next Generation HPC Infrastructure and its Applications” and by Grant-in-Aid for Scientific Research (B) #16H02822 and #17H01706.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/12
Y1 - 2019/12
N2 - This paper presents an OpenCL-like offload programming framework for NEC SX-Aurora TSUBASA (SX-Aurora). Unlike traditional vector systems, one node of an SX-Aurora system consists of a host processor and one or more vector processors on PCI-Express cards, called the vector host and vector engines, respectively. Since the standard OpenCL execution model does not naturally fit the vector engine, this paper discusses how to adapt the OpenCL specification to SX-Aurora while considering the trade-off between performance and code portability. Performance evaluation results demonstrate that, with moderate programming effort, the proposed framework can express the collaboration between a vector host and a vector engine so as to make good use of both processors. By delegating the right task to the right processor, an OpenCL-like program can fully exploit the performance of SX-Aurora.
KW - Offload programming
KW - OpenCL
KW - SX-Aurora TSUBASA
UR - http://www.scopus.com/inward/record.url?scp=85083191772&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85083191772&partnerID=8YFLogxK
U2 - 10.1109/PDCAT46702.2019.00059
DO - 10.1109/PDCAT46702.2019.00059
M3 - Conference contribution
AN - SCOPUS:85083191772
T3 - Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
SP - 282
EP - 288
BT - Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
A2 - Tian, Hui
A2 - Shen, Hong
A2 - Tan, Wee Lum
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
Y2 - 5 December 2019 through 7 December 2019
ER -