Application based empirical evaluation of multi-core NVIDIA graphics and IBM Cell BE processors
The IBM Cell processor and NVIDIA CUDA GPUs have attracted wide interest for application acceleration because of the high performance their specialized architectural features make possible. However, obtaining high performance on these platforms requires significant programming effort, and not all algorithms benefit equally from their use. In this work, we present the results of parallelizing three applications: Markov Random Field (MRF)-based liver segmentation, HMMER's Viterbi algorithm, and a classical molecular dynamics simulation. We relate our experiences in porting all three applications to the GPU and the Cell processor, and describe the techniques and optimizations that proved most beneficial. While certain applications obtained exceptional speedups, others performed only moderately better than traditional CPU implementations. We examine the characteristics that facilitate or hinder efficient use of these platforms through implementations on an NVIDIA 8800 GTX Ultra, using the CUDA programming environment, and on a PlayStation 3. Because the architectures of the Cell processor and CUDA GPUs are quite different, they lead to different parallelization strategies. We discuss the differences between the approaches as well as the architectural features that lead to better speedups.