| In recent years, the performance of the modern Graphics Processing Unit (GPU) has improved so rapidly that it has outpaced Moore's Law. Because of their inherent parallelism and increasing programmability, applying GPUs to general-purpose parallel computation has attracted considerable attention. In this thesis, we first introduced the basic theory and architecture of GPUs and discussed OpenGL and DirectX, the two popular graphics interfaces. We also summarized the procedure for general-purpose parallel computation on GPUs. As a typical high-level graphics programming language, Brook GPU overcomes the difficulty of programming GPUs directly, which often requires specialized knowledge of graphics hardware. Secondly, we investigated the design principles and architecture of Brook GPU, as well as how a Brook program is compiled and run. Because data mapping is the prerequisite for Brook programming, we analyzed the data mapping methods in particular detail and validated their effectiveness experimentally. Then, using the FFT as an example, we studied in depth how to design GPU-based parallel algorithms with Brook. Current implementations of the parallel FFT algorithm do not match the single-instruction, multiple-data (SIMD) character of GPUs and cannot exchange data between parallel units. Based on a close study of the FFT, we presented a parallel FFT algorithm tailored for GPUs. Our algorithm, along with its design philosophy, provides both theoretical principles and practical guidelines for applying GPUs to general-purpose parallel computing. Lastly, we evaluated the parallel computing ability of GPUs using Brook GPU. The experimental results show little improvement in computational efficiency for problems with small data sizes that require frequent, time-consuming data transfers.
However, for computation-intensive problems with large data sets, our algorithms run far more efficiently on the GPU than their counterparts on the CPU. |