Research On Small Target Detection Method Of Remote Sensing Image Based On Residual Convolutional Network And Transformer Fusion

Posted on:2024-01-30

Degree:Master

Type:Thesis

Country:China

Candidate:J L Wei

GTID:2542307142452324

Subject:Computer technology

Abstract/Summary:

Object detection is one of the most widely studied tasks in computer vision. It requires accurate semantic understanding of images so that they can be fully analyzed and used in military and civilian applications such as security, transportation, and rescue. Remote sensing images, however, are captured from a bird's-eye view rather than the front or side views common in everyday photographs, so the objects in them are smaller in scale and differ inherently in orientation. As a result, object detection algorithms designed for natural images often perform poorly when applied directly to satellite images, and small object detection in remote sensing images has become a focus of research in both industry and academia. This thesis addresses two problems of small objects in remote sensing images: the small proportion of image information they occupy and the difficulty of extracting their features. Building on YOLOX, it proposes an improved Attention Cross-stage Transformer network (ACSTNet) and a Bidirectional Attentional YOLOX network (BAM-YOLOX) for small object detection. The specific research contents are as follows:

(1) To address the insufficient feature information of small objects in remote sensing images, this thesis adds Patch Partition to the upper layer of the backbone network, which makes the patch map of each layer finer. Cross-window connections and the shifted window strategy improve efficiency and enlarge the receptive field, and the residual module passes more feature information about small objects to the subsequent layers, which lets the model perform denser latent semantic exchange and deepens the interaction of information across levels. In addition, a new 160×160 px feature output branch is added at the upper layer of the dark3 network to strengthen the low-level feature information, which is rich in small-object features, and this effectively improves the detection accuracy of small objects. Then, to address the complex environmental information in remote sensing images and the weak distinction between foreground and background, this thesis designs a parallel Swin Transformer structure that increases the depth of interaction between different kinds of feature extractors. It combines the convolutional neural network's sensitivity to local image features with the Transformer's sensitivity to relationships between pixels and its ability to extract global information. The fusion of the convolutional neural network and the self-attention mechanism designed in this thesis is shown to significantly improve the model's ability to distinguish foreground from background.

(2) Remote sensing object detection tasks involve large differences in the proportion of the annotation box that objects occupy, redundant background information, and small objects that mostly appear in dense clusters that are difficult to distinguish. To address these problems, this thesis proposes an efficient channel and space normalized fusion attention mechanism (ECSNFAM). It fuses spatial, channel, and batch-normalization empirical weights as soft attention and combines the feature-map information at the neck level to focus better on the features of the objects being detected. Because the ECSNFAM structure substantially increases the computational cost and lacks global information, this thesis also adds a double-ended attention module in the shallow layer of the neck network. The module uses the Transformer's ability to extract the pixel-to-pixel relationships of an object to compensate for the convolutional neural network's lack of global focus, and it uses only two attention branches, which reduces the number of parameters of the ECSNFAM pixel-level attention mechanism while improving the accuracy of the model.

Experimental evaluations were conducted on the DIOR and RSDO-DATA remote sensing datasets to assess the effectiveness of the proposed method. The results indicate that it outperforms the YOLOX model, improving mAP_0.5 by 1.2% and 1.4% on the two datasets, respectively.
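
For readers unfamiliar with the reported metric: mAP_0.5 counts a predicted box as correct when its intersection-over-union (IoU) with a ground-truth box of the same class is at least 0.5. The following is a minimal Python sketch of that matching criterion only, not the evaluation code used in the thesis. The function names `iou` and `is_true_positive`, the (x1, y1, x2, y2) corner format, and the example boxes are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corners in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: does the prediction match at the given IoU threshold?"""
    return iou(pred, truth) >= threshold


# The same 3-pixel localization error on a small and a large object.
small_truth, small_pred = (0, 0, 10, 10), (3, 3, 13, 13)
large_truth, large_pred = (0, 0, 100, 100), (3, 3, 103, 103)

print(is_true_positive(small_pred, small_truth))  # False
print(is_true_positive(large_pred, large_truth))  # True
```

The two example boxes illustrate why small objects are harder to score well on: an identical 3-pixel offset pushes the 10×10 box below the 0.5 threshold while the 100×100 box still matches comfortably.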

Keywords/Search Tags:

Small object detection, Self-attention mechanism, Residual convolution, Transformer, Convolutional attention mechanism

Related items

1	Research On Hyperspectral Image Classification Models Based On Hybrid Convolution And Attention Mechanism
2	Object Detection And Application Based On Dilated Convolution And Visual Attention
3	Bearing Fault Diagnosis Based On Hybrid Domain Attention Mechanism Convolutional Network And Residual Contraction Network
4	Research On Traffic Lights Detection Based On Convolutional Neural Network And Attention Mechanism
5	Small Object Detection Algorithm From The UAV Based On Attention Mechanism And YOLOv5 Network
6	Research And Application Of Driver Handheld Call Detection Algorithm Based On Convolutional Neural Network
7	Research On Object Detection Algorithm For Traffic Scenes Based On YOLOv4 Optimization
8	Research On Traffic Sign Recognition Based On Improved YOLOv3
9	Research On Remote Sensing Image Fusion Algorithm Based On Residual Network And Attention Mechanism
10	Research On Hyperspectral Image Classification Based On 3D Convolutional Neural Networks And Attention Mechanism