Automatic tuning of CUDA execution parameters for stencil processing

Research output: Chapter

8 Citations (Scopus)

Abstract

Recently, Compute Unified Device Architecture (CUDA) has enabled Graphics Processing Units (GPUs) to accelerate various applications. However, to fully exploit the GPU's computing power, a programmer has to carefully adjust several CUDA execution parameters, even for simple stencil processing kernels. Hence, this paper develops an automatic parameter tuning mechanism that uses profiling to predict the optimal execution parameters. The paper first discusses the scope of the parameter exploration space, which is determined by the GPU's architectural restrictions. To find the optimal execution parameters, performance models are created by profiling the execution time of the kernel under each promising parameter configuration. The execution parameters are then determined using those performance models. The paper evaluates the performance improvement due to the proposed mechanism using two benchmark programs. The evaluation results show that the proposed mechanism can appropriately select a suboptimal Cooperative Thread Array (CTA) configuration whose performance is comparable to that of the optimal one.
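The search described in the abstract can be illustrated with a minimal sketch. It is not the chapter's mechanism: the chapter builds performance models from profiles, whereas this sketch simply picks the fastest profiled candidate. The names `candidate_ctas` and `pick_cta`, the restriction to power-of-two CTA dimensions, and the default limit of 512 threads per CTA (typical of GPUs from around 2010) are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs

Shape = Tuple[int, int]  # (threads in x, threads in y) of one CTA


def candidate_ctas(max_threads: int = 512, max_dim: int = 512) -> List[Shape]:
    """Enumerate power-of-two 2D CTA shapes allowed by the assumed limits.

    A shape is kept only if its thread count is a multiple of the warp size
    and does not exceed max_threads (an assumed architectural restriction).
    """
    shapes = []
    x = 1
    while x <= max_dim:
        y = 1
        while y <= max_dim:
            threads = x * y
            if threads <= max_threads and threads % WARP_SIZE == 0:
                shapes.append((x, y))
            y *= 2
        x *= 2
    return shapes


def pick_cta(shapes: List[Shape], measure: Callable[[Shape], float]) -> Shape:
    """Return the shape with the smallest measured execution time.

    `measure` is a hypothetical callback that profiles the kernel with the
    given CTA shape and returns its execution time.
    """
    return min(shapes, key=measure)
```

In practice `measure` would launch the stencil kernel with the given CTA shape and time it; restricting the candidates first keeps the number of profiling runs small.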

Original language: English
Title of host publication: Software Automatic Tuning
Subtitle of host publication: From Concepts to State-of-the-Art Results
Publisher: Springer New York
Pages: 209-228
Number of pages: 20
ISBN (Print): 9781441969347
DOI
Publication status: Published - Dec 1, 2010

ASJC Scopus subject areas

  • General Engineering
