| In information security, high-throughput computation of hash functions has important practical value in many applications. Building on a study of the principles of four byte hash functions, this paper investigates how to implement and optimize their acceleration on the Intel MIC architecture, exploiting its high computing speed to effectively raise hash throughput. First, the paper introduces the two related topics: the Intel MIC architecture and the principles of byte hash functions. The MIC (Many Integrated Core) coprocessor is a new type of computing accelerator launched by Intel in 2012; it contains about 60 cores that support the x86 instruction set, each with a 512-bit vector processing unit (VPU). The MIC coprocessor also supports a variety of parallel programming models. The paper then analyzes in detail four common byte hash algorithms, RC4, Domino, UNIXDES, and Oracle710, and identifies two characteristics that distinguish them from compute-intensive 32-bit hash algorithms such as MD5: 1) their basic computation operates on bytes; 2) they contain a large number of table look-up operations, resulting in a low ratio of compute instructions to memory instructions. Based on this analysis, the paper proposes combining thread-level and data-level parallelism to raise the computing throughput of byte hash functions. OpenMP is used to implement the thread-level parallelism, and VPU intrinsics are used to implement the data-level parallelism. 
Given the particular nature of byte computation, the paper designs and compares three storage formats on the VPU using an example algorithm, and selects the best one. It also optimizes the table look-up operations with the gather memory intrinsic, designs a new method for converting between two storage formats, and designs a table storage optimization strategy for the UNIXDES algorithm. Finally, the paper tests the correctness and performance of the four hash algorithms. The results show that the implementation on MIC can be about 100 times faster than the classic CPU implementation (John the Ripper), which meets the expected goal. |
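
To make the data-level idea concrete, the following is a minimal sketch in portable C of how RC4 might be laid out for a 16-lane vector unit. It is not the paper's implementation: the names `rc4_ref`, `rc4_batch`, and `LANES` are hypothetical, and the "one byte per 32-bit element, interleaved by candidate" layout is only one plausible storage format, not necessarily the one the paper selects. The inner loops over `l` stand in for what a single vector instruction would do; the per-lane indexed accesses `S[j[l]][l]` are the table look-ups that a gather intrinsic would serve.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* assumption: sixteen 32-bit elements fill one 512-bit vector */

/* Scalar reference: RC4 key scheduling followed by n keystream bytes. */
static void rc4_ref(const uint8_t *key, size_t key_len, uint8_t *out, size_t n)
{
    uint8_t S[256];
    assert(key_len > 0);
    for (int i = 0; i < 256; i++) S[i] = (uint8_t)i;
    uint8_t j = 0;
    for (int i = 0; i < 256; i++) {
        j = (uint8_t)(j + S[i] + key[i % key_len]);
        uint8_t t = S[i]; S[i] = S[j]; S[j] = t;
    }
    uint8_t a = 0, b = 0;
    for (size_t k = 0; k < n; k++) {
        a = (uint8_t)(a + 1);
        b = (uint8_t)(b + S[a]);
        uint8_t t = S[a]; S[a] = S[b]; S[b] = t;
        out[k] = S[(uint8_t)(S[a] + S[b])];
    }
}

/* Batched variant (hypothetical layout): S[i][l] is table entry i of
 * candidate l, stored as one byte value per 32-bit element.
 * out[k * LANES + l] receives keystream byte k of candidate l. */
static void rc4_batch(const uint8_t *keys[LANES], const size_t key_len[LANES],
                      uint8_t *out, size_t n)
{
    uint32_t S[256][LANES];
    uint32_t j[LANES] = {0};
    uint32_t b[LANES] = {0};

    for (int i = 0; i < 256; i++)
        for (int l = 0; l < LANES; l++) S[i][l] = (uint32_t)i;

    /* Key scheduling: row S[i] is a contiguous vector load,
     * while S[j[l]][l] is a per-lane indexed (gather-style) access. */
    for (int i = 0; i < 256; i++) {
        for (int l = 0; l < LANES; l++) {
            j[l] = (j[l] + S[i][l] + keys[l][i % key_len[l]]) & 0xFF;
            uint32_t t = S[i][l];
            S[i][l] = S[j[l]][l];
            S[j[l]][l] = t;
        }
    }

    /* Keystream generation: index a is identical in every lane, b is per lane. */
    for (size_t k = 0; k < n; k++) {
        uint32_t a = (uint32_t)((k + 1) & 0xFF);
        for (int l = 0; l < LANES; l++) {
            b[l] = (b[l] + S[a][l]) & 0xFF;
            uint32_t t = S[a][l];
            S[a][l] = S[b[l]][l];
            S[b[l]][l] = t;
            out[k * LANES + l] = (uint8_t)S[(S[a][l] + S[b[l]][l]) & 0xFF][l];
        }
    }
}
```

Because each batch of `LANES` candidates is independent, an outer loop over batches is a natural place for thread-level parallelism, for example with an OpenMP `#pragma omp parallel for`. Spending 32 bits on each byte wastes vector capacity, which is the kind of trade-off a comparison of storage formats would have to weigh.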