Griffon – GPU Programming APIs for Scientific and General Purpose Computing (Extended Version)

Pisit Makpaisit, Worawan Marurngsith


Applications can run up to hundreds of times faster by offloading part of their computation from the CPU to graphics processing units (GPUs). This technique is known as general-purpose computation on graphics processing units (GPGPU). Recent research on accelerating various applications with GPGPU using NVIDIA's programming model, Compute Unified Device Architecture (CUDA), has shown significant performance improvements. However, writing an efficient CUDA program requires an in-depth understanding of the GPU architecture in order to develop a suitable data-parallel strategy and to express it in low-level code. CUDA programming is therefore still considered complex and error-prone. This paper proposes a new set of application programming interfaces (APIs), called Griffon, and a compiler framework for automatically translating C programs into CUDA-based programs. The Griffon APIs allow programmers to exploit the performance of multicore machines using OpenMP and to offload computations to GPUs using Griffon directives. The compiler framework uses a new graph algorithm to exploit data locality efficiently. Experimental results on a 16-core NVIDIA GeForce 8400M GS using six workloads show that Griffon-based programs run from 1.5 to 89 times faster than their sequential implementations running on the CPU.


CUDA, Accelerating Computing, GPU, Automatic Translation, Parallel Programming.
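As a minimal sketch of the directive-based style the abstract refers to, the following C fragment shows a sequential loop and the same loop annotated with a standard OpenMP work-sharing directive. The function names `saxpy_seq` and `saxpy_omp` are hypothetical and chosen only for illustration. The Griffon directive syntax is not given in this section, so the sketch only marks where such a directive would presumably be placed and does not attempt to reproduce it.

```c
#include <stddef.h>

/* Sequential baseline: y[i] = a * x[i] + y[i] for every element. */
void saxpy_seq(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Same computation, with the loop iterations shared among CPU threads
 * by a standard OpenMP directive. Iterations are independent, so no
 * synchronization is needed. If the compiler is run without OpenMP
 * support, the pragma is ignored and the loop runs sequentially. */
void saxpy_omp(size_t n, float a, const float *x, float *y) {
    /* Assumption: a Griffon offload directive would annotate a
     * data-parallel loop like this one. Its exact spelling is not
     * stated in this section, so it is deliberately left out. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Both functions should produce identical results for the same inputs. The point of the directive-based approach is that the loop body stays ordinary C while the annotation alone selects how it is executed.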

