| In recent years, with the continuous development of deep neural networks, object detection algorithms have achieved high accuracy on large and medium-sized targets. However, small targets occupy a relatively small area of the image, contain few pixels, and offer the detection network only limited features, so their detection suffers from inaccurate classification and inaccurate localization. Object detection has become a core technology in many fields, such as autonomous driving, medical image detection, and aerial image recognition, and it is also the cornerstone of many computer vision tasks. At present, general object detection models applied to small-target scenarios have not achieved ideal results. To address these problems, this paper studies three aspects. (1) To enable the detection network to extract more feature information from small targets, this paper integrates a super-resolution reconstruction network based on generative adversarial networks into the object detection network to improve the resolution of small-target features. To this end, the ESRGAN super-resolution network is improved. On the basis of the RRDB dense block used by ESRGAN, an additional residual learning layer is added, and a residual connection is added every two layers within each dense block, in order to strengthen the model's learning ability without increasing complexity and thereby achieve better robustness. Gaussian random noise and a relativistic discriminator are also introduced, so that the network benefits from both real and generated images, enabling the generator to produce images with richer texture detail. (2) Convolutional neural network architectures share a common design flaw with respect to small targets: strided convolution and pooling operations can have a negative impact on convolutional layers at different depths. In the early layers of a convolutional neural network, objects are moderately sized, resolution is good, and there is a large amount of redundant pixel information, so strided convolution or pooling can skip part of it and the model can still learn features well. The negative effects of this design are therefore usually not apparent. However, in more difficult tasks with blurred images or small objects, the assumption of abundant redundant information no longer holds, and the design suffers from loss of fine-grained information and insufficiently learned features. To address this, this paper uses the SPD-Conv module to replace the strided convolution operations in the YOLOv5 object detection network. The module downsamples the feature map while preserving all information in the channel dimension, so no information is lost, which significantly improves the accuracy of small object detection. (3) Finally, a CA attention module is added to the YOLOv5 object detection network that integrates the super-resolution reconstruction network and the SPD-Conv module. The CA attention module is applied to the feature extraction module of the YOLOv5 network, where the positional information of each multi-scale feature output is used to recalibrate every channel of every feature map at all scales, thereby improving the network's ability to extract features of small targets. Experiments show that, compared with the original YOLOv5 network, mAP (%) on the VisDrone2019 dataset increases by 8.58 percentage points. Ablation experiments further show that each of the three modules used in this paper contributes to improved detection of small targets. |
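
The space-to-depth rearrangement behind the SPD-Conv module in point (2) can be illustrated with a minimal sketch. It is written in plain Python on nested lists and is not taken from the YOLOv5 code; the function name `space_to_depth`, the `scale` parameter, and the order in which the sub-maps are stacked are assumptions made for illustration. The non-strided convolution that follows this step in SPD-Conv is omitted.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange a C x H x W nested list into (C*scale^2) x (H/scale) x (W/scale).

    Every input value appears exactly once in the output, so the
    downsampling step itself discards nothing.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    out = []
    # One sub-map per (row offset, column offset) pair, stacked along channels.
    # The stacking order is an illustrative assumption.
    for dy in range(scale):
        for dx in range(scale):
            for channel in feature_map:
                out.append([row[dx::scale] for row in channel[dy::scale]])
    return out


# Toy feature map: 1 channel, 4 x 4, holding the values 0..15.
x = [[[r * 4 + c for c in range(4)] for r in range(4)]]
y = space_to_depth(x)

print(len(y), len(y[0]), len(y[0][0]))  # channels, height, width after rearrangement
```

Under these assumptions the spatial size is halved in each dimension while the channel count grows fourfold, which is how the feature map can be downsampled without dropping pixels, in contrast to a stride-2 convolution or pooling layer.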