Research And Implementation Of Generalized Inverse Algorithm With Heterogeneous Computing

Posted on: 2015-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: X S Fan
GTID: 2180330473953143
Subject: Communication and Information System
Abstract/Summary:
The generalized inverse is a fundamental tool in mathematics and is widely applied in economics, information processing, automatic control, communications, cryptography, statistics, and other fields. Improving the performance of generalized inverse computation is therefore likely to have substantial practical value. However, traditional CPU architectures, which focus on serial computing, can barely cope with it.
Recently, heterogeneous computing, such as computing in OpenCL, has developed rapidly, especially in areas such as wide processing and cryptography, where it accelerates processing. Generalized inverse computation can likewise be accelerated with OpenCL.
This thesis presents an implementation of the generalized inverse using the OpenCL programming architecture. It first gives a brief overview of the OpenCL specification and then analyses two platforms, the GPU and the FPGA, to work out how the generalized inverse can be implemented on each with OpenCL. Because the hardware and execution mechanisms of the GPU and the FPGA differ considerably, different optimization strategies are presented for each to achieve designs that run well.
The numbers of additions and multiplications are used as the index for estimating the computational complexity of three common generalized inverse algorithms. The computational complexity of the solving equations method is slightly higher than that of the other two algorithms. However, an analysis of realization complexity under heterogeneous computing shows that the solving equations method is significantly better than the other two on indicators such as the minimum number of tasks, control flow, and computing resources.
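The abstract does not spell out the solving equations method. As a minimal sketch of the general idea, assuming a matrix A with full column rank, the Moore-Penrose inverse can be obtained by solving the linear system (A^T A) X = A^T for X instead of forming an explicit inverse. The function names below (`transpose`, `matmul`, `solve`, `pinv_full_column_rank`) are hypothetical, and the Gauss-Jordan solver is a stand-in, not necessarily the thesis's scheme.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain triple-loop matrix product a @ b."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(m, rhs):
    """Solve m @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augmented matrix [m | rhs]; copying keeps the inputs untouched.
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot candidate for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Each row update below is independent of the others, so this loop
        # is the kind of work that could be spread over parallel work-items.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pinv_full_column_rank(a):
    """Pseudoinverse of a full-column-rank matrix via (A^T A) X = A^T."""
    at = transpose(a)
    return solve(matmul(at, a), at)


# Illustrative 3x2 input (not taken from the thesis).
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
A_pinv = pinv_full_column_rank(A)
```

For a full-column-rank input, `matmul(A_pinv, A)` should be close to the identity matrix. Rank-deficient inputs need a different formulation, which this sketch does not cover.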
Combining the results of the computational complexity and realization complexity analyses, the solving equations method can achieve better performance with heterogeneous computing. Therefore, this thesis designs a realization scheme based on solving equations, runs multiple modules of the scheme in parallel, and places synchronization points to ensure data consistency. Based on the OpenCL operating mechanism, the thesis optimizes data processing and memory access to improve computing performance, and verifies the scheme using MATLAB.
The scheme is implemented on the GPU and the FPGA respectively, with different optimization strategies and testing schemes applied according to the architecture of each platform. The test results show that, in terms of calculation error, the maximum error on the GPU is as low as the 10^-7 level because of its internal high-precision floating-point arithmetic units. The error on the FPGA platform is larger, with a maximum at the 10^-3 level, because its internal multipliers have only 18 bits. Compared with MATLAB, the GPU reached a speed-up ratio of 1909, benefiting from sophisticated development and a large amount of computing resources. The FPGA achieved a speed-up of only 34 times because its design methods are still immature, but there is considerable room for improvement.
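To illustrate why narrow multipliers can increase the calculation error, the sketch below compares a dot product computed in ordinary floating point with one whose operands and products are rounded to an 18-bit fixed-point grid. The Q1.17 layout (1 sign bit, 17 fractional bits) and the names `quantize` and `fixed_dot` are assumptions made for illustration. The abstract does not state the number format used on the FPGA, and this sketch does not attempt to reproduce the reported 10^-3 figure.

```python
FRAC_BITS = 17  # assumed Q1.17 layout: 1 sign bit + 17 fractional bits


def quantize(x, frac_bits=FRAC_BITS):
    """Round x to the nearest multiple of 2**-frac_bits (no saturation)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def fixed_dot(u, v, frac_bits=FRAC_BITS):
    """Dot product with operands and each product rounded to the grid."""
    acc = 0.0
    for a, b in zip(u, v):
        product = quantize(a, frac_bits) * quantize(b, frac_bits)
        acc += quantize(product, frac_bits)
    return acc


# Illustrative operands inside (-1, 1), so overflow is not an issue here.
u = [0.1, 0.2, 0.3, 0.4]
v = [0.7, -0.6, 0.5, -0.4]

exact = sum(a * b for a, b in zip(u, v))
approx = fixed_dot(u, v)
error = abs(approx - exact)
print(f"exact={exact:.10f} fixed={approx:.10f} error={error:.2e}")
```

Each term contributes a rounding error below 2^-16 in this model, so the error grows with the number of multiply-accumulate steps. Longer elimination chains on larger matrices would accumulate correspondingly more.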
Keywords/Search Tags: Heterogeneous Computing, Generalized Inverse Matrix, OpenCL, GPU, FPGA