
Research On Convolution Neural Network Accelerator For Edge Computing

Posted on: 2024-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Mu
GTID: 2568307061481824
Subject: Computer technology
Abstract/Summary:
With the arrival of the intelligent Internet of Things era, edge computing has been applied as a new computing paradigm in many fields because of its ultra-low latency and ultra-high reliability. However, traditional CPU and GPU platforms cannot meet the real-time and power-consumption requirements of edge computing, so an efficient, low-power platform is urgently needed to accelerate these applications. This thesis therefore designs and implements an accelerator on a heterogeneous multi-core FPGA platform, and explores and optimizes its design space to meet these requirements. The main research work of this thesis is as follows:

1. Read-write data dependencies are common in convolution operations and prevent the execution process from being fully parallel, while serial task invocation makes the accelerator inefficient. To solve this problem, a convolutional neural network accelerator architecture is designed to eliminate the performance impact of read-write dependencies and serial invocation. The architecture combines task-level and operation-level parallelism, overcomes the shortcomings of a single hardware parallel architecture, and parallelizes the whole processing flow. It consists of a processing system responsible for task scheduling and memory management, and an FPGA acceleration core for multi-channel convolution. For task-level parallelism, the task flow is distributed across the cores of a multi-core CPU using multi-process techniques, so that tasks run in parallel. For operation-level parallelism, a processing unit based on a systolic array is designed to pipeline the computation. Loop unrolling accelerates the convolution by exploiting its intra-layer and inter-layer parallelism. In addition, the storage unit adopts a multi-level storage structure to increase data reuse within the sliding window.

2. A convolutional neural network accelerator on a heterogeneous platform has a complex system structure, many design parameters, and strict constraints, which makes its design space complex. To solve this problem, this thesis proposes a design space exploration and optimization method based on the execution time of the deep learning model to obtain the optimal design parameters. First, the execution process of the deep learning model is analyzed and modeled. The mathematical model takes the total execution time of the deep learning model as the objective function and takes the values of the decision variables and the available resources as constraints. To improve both the optimization result and the convergence speed of the design parameters, an adaptive differential evolution algorithm based on a weighted mutation strategy is proposed. A weighted fusion of mutation strategies, combined with adaptive parameter adjustment, dynamically balances global and local search. This addresses the tendency of the differential evolution algorithm to fall into local optima and yields a better optimization result.

3. To verify the performance of the accelerator in edge computing scenarios, an embedded intelligent target detection system is designed. The video stream acquisition module and the video output module of the system are implemented on the ARM CPU, and the deep learning inference stage is implemented on the FPGA. Memory management and task scheduling on the software side are developed on the Linux operating system. The function of the accelerator is verified with image classification and target detection tasks. The accelerator completes image classification and moving target detection quickly, meets the system design requirements, and achieves real-time, low-power processing in edge computing scenarios.
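The abstract gives no formulas for the adaptive differential evolution algorithm in item 2, so the following is only a minimal sketch of the general idea, not the thesis's method. It assumes that the weighted mutation is a linear blend of the standard DE/rand/1 and DE/best/1 mutations, that the blend weight `w` grows linearly with the generation count, and that the scale factor `f` and crossover rate `cr` follow simple linear schedules. The function names `weighted_de` and `toy_time_model`, and the objective itself, are hypothetical stand-ins for the execution-time model.

```python
import random


def toy_time_model(x):
    """Hypothetical stand-in for an execution-time model (minimum at x = 1)."""
    return sum((v - 1.0) ** 2 for v in x)


def weighted_de(objective, bounds, pop_size=20, generations=100, seed=0):
    """Minimize `objective` inside box `bounds` with a weighted-mutation DE sketch."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]

    for g in range(generations):
        # Assumed schedule: w goes from 0 (global search) to 1 (local search).
        w = g / max(1, generations - 1)
        # Assumed adaptive parameters: F shrinks and CR grows as the search narrows.
        f = 0.9 - 0.5 * w
        cr = 0.5 + 0.4 * w
        best = pop[min(range(pop_size), key=fit.__getitem__)]

        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.sample(others, 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = list(pop[i])
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    diff = f * (pop[r2][d] - pop[r3][d])
                    explore = pop[r1][d] + diff  # DE/rand/1 mutation
                    exploit = best[d] + diff     # DE/best/1 mutation
                    v = (1.0 - w) * explore + w * exploit
                    lo, hi = bounds[d]
                    trial[d] = min(max(v, lo), hi)  # clamp to the box constraint
            trial_fit = objective(trial)
            if trial_fit <= fit[i]:  # greedy selection keeps the better vector
                pop[i], fit[i] = trial, trial_fit

    k = min(range(pop_size), key=fit.__getitem__)
    return pop[k], fit[k]
```

Calling `weighted_de(toy_time_model, [(-5.0, 5.0)] * 3)` returns the best parameter vector found and its objective value. In a real design space exploration the objective would be the modeled total execution time, and resource limits would enter as additional constraints rather than the simple box clamp used here.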
Keywords/Search Tags:Convolutional neural network, Accelerator architecture, Design space exploration, Differential evolutionary algorithms