Object Detection And Application Based On Dilated Convolution And Visual Attention

Posted on: 2021-05-21  Degree: Master  Type: Thesis
Country: China  Candidate: A Ping  Full Text: PDF
GTID: 2392330614465773  Subject: Electronic and communication engineering
Abstract/Summary:
As an important part of scene understanding, object detection has a wide range of applications in areas such as smart security, industrial manufacturing, and military detection. In recent years, growing computing power and the widespread use of neural networks have led to a steady stream of new algorithms, which has greatly advanced object detection. Real application scenarios, however, are complex and varied: targets may be small and dense, may vary widely in scale, or may heavily occlude one another. These factors degrade detection performance and lead to false detections and missed detections. To address these problems, this thesis takes the YOLOv3 network as the main detection framework, combines it with dilated convolution and a visual attention mechanism, and proposes improved solutions. The main contributions are as follows.

(1) To address the drop in detection performance caused by large variations in target scale, a YOLOv3 network based on dilated convolution and feature map fusion is proposed. First, because dilated convolution yields different receptive fields at different dilation rates, a parallel multi-branch dilated convolution module is constructed to improve the Darknet53 backbone so that the extracted features carry richer information. Second, the output features of the network are upsampled and concatenated, which increases the number of shallow feature maps and improves detection accuracy for small targets. Finally, comparative experiments are performed on the COCO dataset. The results show that the improved network detects more effectively and that detection accuracy improves significantly across target categories.

(2) To address the effect of feature imbalance in the feature maps on detection, a feature-balanced YOLOv3 network based on visual attention is proposed. First, the feature maps output by the feature pyramid are integrated through upsampling and downsampling. Second, a non-local module and a squeeze-and-excitation network are combined into a channel non-local unit, which captures long-range dependencies between features in both the spatial domain and the channel domain and enhances the context information of the target. Third, the feature maps are restored to their original sizes so that targets of different scales can be detected. Finally, experiments are performed on the COCO dataset. The results show that the detection performance of the improved network increases significantly.

(3) For the detection of pedestrians and vehicles in practical application scenarios, an M-YOLOv3 network based on dilated convolution and visual attention is proposed. First, taking full account of the impact of weather, scene, and illumination on detection, a pedestrian and vehicle dataset with urban characteristics, NJPV, is created. Second, to handle class imbalance in the dataset, class-balanced sampling is applied at the data level and a focal loss function is applied at the algorithm level. Finally, the detection model is applied to real-life scenarios. The results show that the M-YOLOv3 network effectively reduces the false detection rate and the missed detection rate for pedestrians and vehicles in complex scenarios and has high practical value.
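Two ideas from the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The first part shows how the dilation rate widens the span covered by a fixed-size kernel, using a 1-D signal for simplicity. The second part shows a binary focal loss, which down-weights easy examples. The function names, the dilation rates (1, 2, 3), and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration; the abstract does not state the values used in the thesis.

```python
import math


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def multi_branch(signal, kernel, dilations=(1, 2, 3)):
    """Apply the same kernel in parallel branches with different dilation rates.

    Hypothetical helper: a real module would pad each branch so the outputs
    share one size and could be concatenated; here outputs are kept per rate.
    """
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class (0 < p < 1),
    y is the label (1 or 0). gamma and alpha are assumed defaults.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for d in (1, 2, 3):
        print("dilation", d, "covers", effective_kernel_size(3, d), "samples")
    print(multi_branch([1, 2, 3, 4, 5, 6, 7], [1, 1, 1]))
    print(focal_loss(0.9, 1), focal_loss(0.6, 1))
```

With a kernel of size 3, each increase in dilation rate widens the covered span without adding weights, which is why parallel branches at different rates see context at different scales. In the focal loss, a confident correct prediction contributes much less than an uncertain one, so training focuses on hard examples.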
Keywords/Search Tags: YOLOv3, Dilated Convolution, Feature Map Fusion, Attention Mechanism, Pedestrian and Vehicle Detection