Design Of Reconfigurable Convolutional Neural Network Hardware Accelerator System For Hand Detection

Posted on: 2024-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: J H Ou
Full Text: PDF
GTID: 2558307067993619
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Convolutional neural networks have driven the widespread adoption of hand interaction in smart homes and VR games thanks to their excellent recognition accuracy and robustness. Because hand detection algorithms based on convolutional neural networks are computationally intensive, deploying them on terminal devices makes it difficult to achieve both high performance and low power consumption. In addition, complex application scenarios and variable model structures place higher demands on the compatibility and scalability of neural network accelerators. To address these issues, this thesis designs and implements a reconfigurable, low-latency neural network accelerator circuit for efficient real-time hand detection. The main work and innovations are as follows:

1. A lightweight design of the YOLOv3-tiny algorithm is completed for hand detection applications.
1) A method for selecting prediction boxes in advance is proposed to reduce the computation of the YOLO layers, and a channel pruning scheme is used to optimize the network structure. After optimization, the computation is reduced by nearly 56.1%, while the mAP decreases by only 4.5%.
2) Both post-training quantization and quantization-aware training are used to quantize the algorithm to 8-bit fixed-point. The results show that the mAP of the model reaches 0.607 and the data volume is compressed by 75%, which effectively reduces the parameter storage requirement.

2. A low-latency, reconfigurable hardware accelerator circuit is designed and implemented based on a single-engine architecture.
1) A hybrid-strategy data traversal scheme adapted to different computation modes is proposed. The computation efficiency reaches 92.51%, and the data interaction latency is effectively reduced.
2) A highly parallel reconfigurable systolic array is proposed, and an array control scheme for the conv1 and conv3 modes is designed, with computational parallelism of 576 and 512, respectively. The average PE utilization is 86.87% and the peak utilization is 96.58%, which enables efficient forward inference.

3. The reconfigurable functions, inference accuracy, and system performance of the proposed accelerator are verified and evaluated by designing a convolutional neural network simulation case, building a Matlab hardware model, and constructing an SoC test platform. The results show that the accelerator supports both the conv1 and conv3 computation modes, with a model inference time of 22 ms, a mean square error of 0.0111, an effective computing performance of 153.18 GOPS, and an energy efficiency of 29.42 GOPS/W.

In summary, the YOLOv3-tiny accelerator circuit designed and implemented in this thesis offers configurability, high energy efficiency, and high real-time performance, which makes it suitable for fast-moving hand detection scenarios.
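
The 8-bit fixed-point quantization step mentioned above can be illustrated with a minimal sketch. The abstract does not describe the quantization scheme in detail, so the symmetric per-tensor scaling used here is an assumption, and the names `quantize_int8`, `dequantize`, and `storage_saving` are hypothetical. The 75% figure matches what one would expect if 32-bit floating-point parameters are stored as 8-bit integers, which is also an assumption about the baseline.

```python
# Minimal sketch of symmetric per-tensor 8-bit quantization.
# Assumption: this is a generic scheme for illustration, not the thesis's method.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the signed 8-bit range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


def storage_saving(float_bits=32, quant_bits=8):
    """Fraction of parameter storage saved (assumes a 32-bit float baseline)."""
    return 1.0 - quant_bits / float_bits


# Hypothetical example weights, not taken from the thesis.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

With round-to-nearest, the reconstruction error of each value is bounded by half a quantization step (`scale / 2`). Post-training quantization applies this kind of mapping to an already trained model, while quantization-aware training simulates it during training so the weights can adapt to the rounding.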
Keywords/Search Tags: Hand detection, YOLOv3-tiny, Algorithm optimization, Hardware acceleration, FPGA