OpenCL is a new programming specification whose current implementations are mostly used for high-performance computing with graphics processing units (GPUs), so-called GPU computing. However, the OpenCL specification itself is not specific to GPU computing. In this research project, therefore, we propose using the OpenCL specification to describe the collaborative work of scalar systems and an NEC SX vector supercomputing system. Since there is no OpenCL implementation for the SX systems, we translate the part of an OpenCL program that is written in OpenCL C into standard C++ code. The generated code is then compiled with a native SX C++ compiler to produce an executable program that runs on the SX system. This paper presents a prototype implementation of an OpenCL-to-C translator and uses it to evaluate the potential of the SX system for accelerating OpenCL applications. The evaluation results indicate that an SMP node can outperform a single GPU once the vectorization ratio is improved, even though the benchmark programs are completely optimized for GPUs. In addition, because data parallelism is described explicitly in OpenCL C code, the performance of the code generated by the OpenCL-to-C translator scales with the number of SX processors. Accordingly, the SMP node can be used as a very powerful accelerator with a huge memory space.
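The translation step described in the abstract can be illustrated with a minimal hand-written sketch. It assumes a simple vector-add kernel and shows how the implicit index space of an OpenCL C kernel might become an explicit C++ loop that a vectorizing compiler can process. The kernel and the names `vec_add_translated` and `run_vec_add` are hypothetical illustrations, not output of the authors' translator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical OpenCL C kernel, shown for reference only:
//
//   __kernel void vec_add(__global const float* a,
//                         __global const float* b,
//                         __global float* c) {
//       size_t i = get_global_id(0);
//       c[i] = a[i] + b[i];
//   }

// Hand-written C++ counterpart (an assumption about what translated code
// could look like): each work-item becomes one iteration of an explicit
// loop, and the loop counter plays the role of get_global_id(0).
void vec_add_translated(const float* a, const float* b, float* c,
                        std::size_t global_size) {
    for (std::size_t gid = 0; gid < global_size; ++gid) {
        c[gid] = a[gid] + b[gid];  // iterations are independent of each other
    }
}

// Small host-side helper standing in for the kernel launch.
std::vector<float> run_vec_add(const std::vector<float>& a,
                               const std::vector<float>& b) {
    assert(a.size() == b.size());  // both inputs must cover the index space
    std::vector<float> c(a.size());
    vec_add_translated(a.data(), b.data(), c.data(), c.size());
    return c;
}
```

Because the loop iterations do not depend on one another, a loop of this shape is a natural candidate for vectorization and for distribution across processors, which is the property the abstract attributes to the explicit data parallelism of OpenCL C.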
Number of pages: 10
Publication status: Published, 2012 Jan 1
Event: 2011 14th Teraflop Workshop, Stuttgart, Germany
Duration: 2011 Dec 5 → 2011 Dec 6