| With advances in semiconductor technology, both the transistor count and the performance of a chip have grown in step with Moore's law, but rising power consumption has become an obstacle to further processor development. An effective way to improve computing efficiency and reduce power consumption for a given application is to design a dedicated hardware accelerator. The Fast Fourier Transform (FFT) is one of the key and most time-consuming algorithms in Digital Signal Processing (DSP). It is widely used in fields such as speech, graphics, radar, and wireless communication signal processing, so the computing performance of the FFT has a large impact on overall system performance. This paper studies the design and verification of the FFT accelerator unit in a DSP chip, X-DSP. The details are as follows: 1. Based on the radix-2 FFT and the Cooley-Tukey algorithm, an FFT accelerator structure for X-DSP is proposed. The accelerator mainly consists of a control module, a data bus controller, and a computing array with two FFT-PEs. Each FFT-PE can independently complete a radix-2 FFT below 1K points (the minimum scale), and a large-scale FFT (from 1K to 1M points) is completed with two rounds of minimum-scale FFTs plus a matrix transpose. 2. A butterfly unit that supports complex multiplication is designed and optimized. A multiplexed circuit structure implements the butterfly computation and supports complex multiplication in the IEEE 754 floating-point format. It reduces the delay of the normalization operation, lowers hardware cost, and improves computing accuracy. 3. To address the slow generation of twiddle factors in the FFT accelerator, a new low-latency CORDIC structure based on a rotation prediction method and carry-save adders (CSA) is implemented. Experiments show that, compared with the traditional CORDIC algorithm at the same accuracy, the new method increases area by 5% and reduces the number of pipeline cycles from 49 to 18. 4. A hierarchical verification method is used for the FFT accelerator. First, golden models are built for the butterfly and CORDIC units, and the RTL simulation results are compared against them. Then an automated platform for system-level verification is built, which reduces manual work, avoids mistakes caused by manual operation, and improves efficiency. Finally, the performance of the FFT accelerator is evaluated on this platform. The experiments show that the proposed structure achieves a 3.82 to 4.38 times speedup over TI's DSP. Compared with a software FFT implementation on an Intel Xeon CPU, the FFT accelerator improves performance by two orders of magnitude. |
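
The decomposition of a large FFT into two rounds of small FFTs plus a matrix transpose, as described in item 1, can be illustrated in software. The sketch below is a minimal Python model, assuming the standard Cooley-Tukey factorization N = N1 × N2 with an inter-stage twiddle multiplication; this is one plausible reading of the abstract, not the X-DSP hardware design. The names `fft_radix2`, `fft_large`, `transpose`, and `naive_dft` are hypothetical and chosen only for this example.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: one complex multiply by the twiddle factor, then add/subtract.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def fft_large(x, n1, n2):
    """Length n1*n2 FFT built from two rounds of small FFTs and transposes."""
    n = n1 * n2
    assert len(x) == n
    # View the input as an n1 x n2 matrix: a[r][c] = x[n2*r + c].
    a = [x[r * n2:(r + 1) * n2] for r in range(n1)]
    # Round 1: length-n1 FFT of every column (transpose so columns become rows).
    cols = [fft_radix2(c) for c in transpose(a)]  # cols[c][k1]
    # Inter-stage twiddle factors W_n^(c*k1).
    for c in range(n2):
        for k1 in range(n1):
            cols[c][k1] *= cmath.exp(-2j * cmath.pi * c * k1 / n)
    # Transpose back so each row holds one k1 and all columns c.
    b = transpose(cols)
    # Round 2: length-n2 FFT of every row.
    rows = [fft_radix2(r) for r in b]  # rows[k1][k2]
    # Output index is k1 + n1*k2, i.e. a transposed read-out.
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[k1 + n1 * k2] = rows[k1][k2]
    return out


def naive_dft(x):
    """O(n^2) reference DFT used only to check the result."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


if __name__ == "__main__":
    data = [complex(i % 7, -(i % 3)) for i in range(64)]
    err = max(abs(p - q) for p, q in zip(fft_large(data, 8, 8), naive_dft(data)))
    print("max abs error vs. naive DFT:", err)
```

In this model each small FFT corresponds to a task that one FFT-PE could handle on its own, while the transposes and twiddle multiplications stand in for the data movement between the two rounds. The sizes used here (64 = 8 × 8) are far below the 1K to 1M range in the text and are chosen only to keep the check fast.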