Research On Automatic Driving Target Detection Method Based On Fusion Attention Mechanism

Posted on: 2024-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: R X Chen
GTID: 2542307121488564
Subject: Electrical engineering
Abstract/Summary:
With advances in science and technology and the continuing development of intelligent systems, autonomous driving has attracted growing attention from government and industry. Object detection is a critical task for autonomous driving: it enables vehicles to perceive and understand their environment, detect potential hazards, and make accurate decisions. The goal of detection is to identify and localize objects of interest, such as pedestrians, vehicles, and traffic signs, in an image or video stream. In recent years, deep learning-based object detection methods have shown significant performance gains, making them well suited to many computer vision applications, including autonomous driving. These methods typically use deep convolutional neural networks to learn discriminative features from the input data and then use those features to classify target objects and predict their bounding boxes. Object detection for autonomous driving still faces many challenges, because these systems must run in real time, handle diverse and complex scenes, and cope with occlusion and with objects that occupy only a few pixels. Any misclassification may have catastrophic consequences, so the detector must be sufficiently accurate; at the same time, to meet deployment requirements, the model's parameter count and computational cost must be limited. To address these problems, this thesis focuses on the accuracy and lightweight design of object detection models for autonomous driving. We propose AF-YOLO, an object detection model that integrates an attention mechanism, and a model compression method; together they effectively improve model performance. The main work and contributions are as follows:

(1) We built and preprocessed the BDD100K dataset and constructed the base network of the detection model. We used the k-means algorithm to cluster the prior (anchor) boxes and applied Mixup and Mosaic data augmentation to provide more suitable training and validation data for the network. We then constructed the backbone network, the feature fusion network, and the detection network, and analyzed the input and output of the entire network.

(2) We propose AF-YOLO, an object detection model that integrates an attention mechanism. We integrate the parallel hybrid-domain attention mechanism P_CBAM into the model, which applies channel attention and spatial attention in parallel. For small object detection, we add a small-object detection layer with cross-level feature fusion. To avoid the loss of feature-map information caused by max pooling in the original spatial pyramid pooling module, we propose SPP_dil, a spatial pyramid pooling module based on dilated convolutions. For the loss function, we improve GIoU and design RIoU, which is based on the center distance and aspect ratio of the predicted box, so that optimization does not stall as it does with GIoU when one detection box contains the other. Finally, we conduct comparative experiments with different models and design ablation experiments to verify the contribution of each component. Experimental results show that average precision improves by 6.2% over the baseline model without sacrificing much detection speed.

(3) We propose a joint channel and layer pruning method to compress the model, and we complete an embedded deployment. For the sparse training stage of pruning, we design a dynamic regularization method that sparsifies the BN layers while preserving accuracy. Experimental results show that the joint pruning method reduces the parameters of the AF-YOLO model by about 65.91% and its GFLOPs by about 60.02%. Deployed on the Jetson Nano embedded edge device, the model achieves an average detection time of 35.8 milliseconds per image.
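The GIoU limitation mentioned in contribution (2) can be illustrated with a small numerical sketch. The code below is not the thesis's RIoU, whose formula is not given in this abstract; it only computes standard IoU and GIoU for axis-aligned boxes and shows that, when the target box fully contains the predicted box, GIoU collapses to IoU and takes the same value wherever the predicted box sits inside the target. The function names, the `(x1, y1, x2, y2)` box layout, and the example coordinates are assumptions made for illustration.

```python
def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (hull - union) / hull
    return iou, giou


def center_distance_sq(a, b):
    """Squared distance between box centers (illustrative helper)."""
    acx, acy = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bcx, bcy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return (acx - bcx) ** 2 + (acy - bcy) ** 2


if __name__ == "__main__":
    target = (0, 0, 10, 10)       # hypothetical ground-truth box
    centered = (4, 4, 6, 6)       # prediction at the target's center
    in_corner = (0, 0, 2, 2)      # same-size prediction in a corner

    for name, pred in (("centered", centered), ("in_corner", in_corner)):
        iou, giou = iou_and_giou(target, pred)
        dist = center_distance_sq(target, pred)
        print(f"{name}: IoU={iou:.4f} GIoU={giou:.4f} center_dist_sq={dist}")
```

Both predictions receive the same GIoU even though one is centered and the other is not, so GIoU alone gives no signal for moving the contained box toward the center. A quantity such as the center distance does distinguish the two cases, which is consistent with the abstract's motivation for basing RIoU on center distance and aspect ratio.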
Keywords/Search Tags:Autonomous Driving, Deep Learning, Target Detection, Attention Mechanism, Model Lightweight