Research On Vectorization Method For SIMD Super-long Vector Acceleration Components

Posted on: 2022-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: H H Liu
GTID: 2518306755960819
Subject: Computer Science and Technology
Abstract/Summary:
SIMD extensions are widely used in modern high-performance microprocessors as an important way to exploit data-level parallelism. To meet the demand for higher computing performance from applications in different domains, mainstream general-purpose processors have begun to adopt longer vectors to strengthen cross-domain general-purpose computing power and large-scale SIMD parallelism. Research on automatic vectorization techniques for very long SIMD vector extensions is therefore of great significance for exploiting data-level parallelism, improving program performance, and enhancing cooperation between software and hardware. Based on the GCC compiler, and starting from vectorization methods that support long vector processing, this thesis studies SIMD automatic vectorization techniques as follows:

1. A non-full vectorization method for basic blocks, ISLP (Insufficient SLP), is introduced to address the problem that programs offer too little parallelism for long vector lengths. By filling and replacing redundant data in vector registers, programs with insufficient superword level parallelism can be vectorized. After an in-depth analysis of the SLP vectorization framework of GCC, the design and implementation of ISLP in GCC are described from three aspects: parallelism detection, cost model, and code generation.

2. To address node redundancy in parallelism detection, a graph-oriented parallelism detection framework is proposed. Redundant nodes are eliminated by reusing SLP node construction records. To address the problems of profitability analysis that targets global vectorization, a new method based on subgraph division is proposed. It analyzes and schedules at the granularity of SLP subgraphs, so that profitable SLP instances can still be vectorized when vectorization is not profitable as a whole. The scheduling framework is also fixed to eliminate redundant code generation.

3. Interleaving information can assist dependence analysis, but it is used only within a single interleaving chain. A cross-chain dependence analysis method based on interleaving information is proposed for non-affine accesses on interleaving chains with different read/write types. By simply testing the data reference objects, the dependence relationship can be resolved quickly.

Based on the improved dependence analysis and SLP vectorization framework, the validity of ISLP was verified on benchmarks. The average speedup of the selected test cases after vectorization reached 1.14, and performance improved by 11.8% compared with the conventional SLP method.
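
As a rough illustration of the padding idea behind ISLP described in item 1, the sketch below hand-vectorizes a basic block that has only three isomorphic additions, one short of a 4-lane vector. It is not taken from the thesis and does not show GCC's internal implementation. The function names `add3_scalar` and `add3_padded`, the 4-lane width, and the choice of zero as the filler value are all assumptions made for illustration. The vector type relies on the GCC/Clang `vector_size` extension.

```c
#include <assert.h>
#include <string.h>

/* 4-lane float vector via the GCC/Clang vector extension. */
typedef float v4sf __attribute__((vector_size(16)));

/* Scalar basic block: three isomorphic statements, which is
   one statement short of filling a 4-lane vector. */
void add3_scalar(const float *a, const float *b, float *c) {
    c[0] = a[0] + b[0];
    c[1] = a[1] + b[1];
    c[2] = a[2] + b[2];
}

/* Hypothetical padded form: the unused fourth lane is filled
   with a dummy value (zero, an assumption) so that one vector
   add covers all three statements. */
void add3_padded(const float *a, const float *b, float *c) {
    assert(a && b && c);

    v4sf va = {0.0f, 0.0f, 0.0f, 0.0f};
    v4sf vb = {0.0f, 0.0f, 0.0f, 0.0f};

    /* Load only the three live elements; lane 3 keeps the filler. */
    memcpy(&va, a, 3 * sizeof(float));
    memcpy(&vb, b, 3 * sizeof(float));

    v4sf vc = va + vb;  /* one vector add across all four lanes */

    /* Store only the three live lanes; the filler lane is dropped. */
    memcpy(c, &vc, 3 * sizeof(float));
}
```

The partial loads and stores keep the filler lane from touching memory outside the three-element arrays, which is the property any such padding scheme has to preserve. Whether the padded form is actually faster than the scalar one is a question for the cost model.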
Keywords/Search Tags: very long vector extension, SIMD, vectorization, superword level parallelism, dependence analysis