Mapping And Optimization Of YOLO Networks On Vector Accelerators

Posted on:2022-12-10

Degree:Master

Type:Thesis

Country:China

Candidate:Y K Zhao


GTID:2558307169983219

Subject:Engineering

Abstract/Summary:


Deep neural networks have achieved great success in computer vision tasks and have become a preferred solution for image inspection applications. As convolutional neural networks grow in size and depth, the parameter count and computation time of the network model grow with them, and traditional general-purpose processor platforms are increasingly unable to meet the demands of real-time detection. The urgent need to accelerate neural networks has drawn widespread attention to high-performance processors at home and abroad, and designing new hardware structures for complex problems has become a research focus.

The YOLOv4 network is an object detection network that combines a large number of advanced techniques and achieves excellent performance in both speed and accuracy. Its model consists mainly of convolutional layers, which makes it a typical high-performing deep neural network of considerable research value. Based on the current state of research on convolutional neural network acceleration at home and abroad, this thesis analyzes the advantages and disadvantages of software acceleration and hardware acceleration, and adopts a multi-core vector accelerator as the mapping platform for the YOLO network. Convolutional layers are computed as multiply and add operations on data; the vector accelerator supports efficient vector computation and is therefore well suited to mapping convolutions. The main contributions are as follows:

· This thesis analyzes the YOLOv4-tiny network algorithm and, in combination with the M-DSP architecture, proposes a mapping scheme for it. Based on the characteristics of the multi-core vector accelerator architecture, a data storage scheme for convolutional neural network computation is designed, and parallel computation of the YOLOv4-tiny network on multiple DSP cores is realized.

· This thesis designs and implements mapping methods for the convolutional, pooling and sampling layers of the algorithm, and realizes the mapping and multi-core parallel computation of the YOLOv4-tiny network on M-DSP. Starting from the overall network model, a strategy for fusing multiple network layers is proposed, which reduces data input and output and shortens the total execution time.

· The design is verified in the M-DSP test environment. The experimental results show that the scheme can effectively map a convolutional neural network onto the vector accelerator platform and achieve a measurable parallel speedup. On an eight-core vector accelerator running at 1.8 GHz, the mapping scheme reaches a computational efficiency of 29.83%. Compared with TensorRT, the convolution acceleration library on the graphics processor platform, it achieves a performance improvement of 31.75%.
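The abstract does not state how the convolution work is divided among the DSP cores, so the following is only a minimal sketch of one common choice: splitting a layer by output channel so that each core computes a contiguous block of channels from the same input. All names here (`conv_output_channel`, `split_channels`, `conv_layer_parallel`, `NUM_CORES`) are hypothetical and not taken from the thesis or from any M-DSP toolchain, and Python threads merely stand in for cores to show the partitioning; they are not a performance model.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_CORES = 8  # assumption: mirrors the eight-core configuration in the abstract


def conv_output_channel(inp, kernel):
    """Stride-1, no-padding convolution producing one output channel.

    inp:    [C][H][W] nested lists
    kernel: [C][K][K] nested lists
    """
    channels, height, width = len(inp), len(inp[0]), len(inp[0][0])
    k = len(kernel[0])
    out = [[0.0] * (width - k + 1) for _ in range(height - k + 1)]
    for y in range(height - k + 1):
        for x in range(width - k + 1):
            acc = 0.0
            # Multiply-accumulate over the receptive field; on a vector
            # accelerator this inner work is what would be vectorized.
            for c in range(channels):
                for dy in range(k):
                    for dx in range(k):
                        acc += inp[c][y + dy][x + dx] * kernel[c][dy][dx]
            out[y][x] = acc
    return out


def split_channels(num_out_channels, num_cores):
    """Assign contiguous output-channel ranges to cores as evenly as possible."""
    base, extra = divmod(num_out_channels, num_cores)
    ranges, start = [], 0
    for core in range(num_cores):
        size = base + (1 if core < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def conv_layer_parallel(inp, kernels, num_cores=NUM_CORES):
    """Compute every output channel, one block of channels per (simulated) core."""

    def run_core(channel_range):
        first, last = channel_range
        return [conv_output_channel(inp, kernels[oc]) for oc in range(first, last)]

    with ThreadPoolExecutor(max_workers=num_cores) as pool:
        parts = list(pool.map(run_core, split_channels(len(kernels), num_cores)))
    # pool.map preserves order, so channels come back in their original order.
    return [channel for part in parts for channel in part]
```

Because every output channel depends only on the shared input and its own kernel, the blocks need no communication between cores during the layer, which is one reason this split is a frequent starting point; whether the thesis uses it, or a spatial split instead, is not stated in the abstract.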

Keywords/Search Tags:

Multi-core vector accelerator, YOLOv4-tiny, Parallel computation, Algorithm mapping


Related items

1	Design And Implementation Of VGG Convolution Network On Multi-core Vector Processor
2	Mask Detection Algorithm Based On Improved YOLOv4-tiny
3	Research On YOLOv4-tiny Objection Detection Network Integrated With Transformer
4	Research And Application Of Hierarchical Parallel Algorithm Based On The MPI Environment
5	On Task Mapping Algorithm For Co-optimizing Computation And Communication Performance In Networks-on-chip
6	Research On Parallel Program Performance Tuning On Multi-Core Computing Platform
7	Hardware Accelerator Engine Design For Key Algorithm In Heterogeneous Multi-Core Systems
8	Research On The Design And Implementation Techniques Of Customizing Application Specific Instruction Set Processors
9	The Implementation Of High Performance Hardware Accelerator
10	Design And Implementation Of SpMV Accelerator Based On PIM Architecture