Research On Remote Sensing And Aerial Image Object Detection Based On Attention And Transformer Network

Posted on: 2024-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Guan
GTID: 2542307118983749
Subject: Software engineering
Abstract/Summary:
With the development of remote sensing technology, high-resolution remote sensing and aerial images are becoming more available, more accurate, and more detailed. Object detection in remote sensing and aerial images has become a new research field in computer vision. Traditional object detection algorithms have made great breakthroughs on ordinary image data. However, remote sensing and aerial images contain small targets, complex backgrounds, large scale variation, and densely distributed objects, so how to apply these traditional algorithms to such images has become a new research topic in deep learning. Most current methods are built on traditional horizontal-box object detection algorithms and fall into two kinds: single-stage methods such as CSL and R3Det, and multi-stage methods such as RoI Transformer and SCRDet. Based on the existing models, we apply Transformer and attention techniques to improve model performance on remote sensing and aerial images. The main research work is as follows:

1) To address the dense distribution, large scale variation, and large orientation variation of objects in remote sensing and aerial images, we propose a Single-stage Remote Sensing and Aerial Image Object Detection with Feature Enhancement using Hybrid Attention Network (HA-Net). First, we design a Transformer structure with both local and global attention in the backbone network to strengthen feature extraction for dense targets. The Transformer structure uses attention to suppress background noise and make the boundaries of dense targets clearer. Second, a Spatial Pyramid Pooling Block using consecutive average pooling and max pooling is adopted to enrich feature information and strengthen multi-scale target representation. Moreover, a Feature Reconstruction Module that mixes cross-scale spatial attention and non-local channel attention is designed to reconstruct the feature pyramid network. It effectively integrates the spatial and channel dependencies between any two positions in feature maps of different scales, which reduces interference from unnecessary information and facilitates multi-scale object detection. Finally, the Circular Smooth Label (CSL) is introduced to realize rotated object detection.

2) To address the large number of small targets and the complex backgrounds in remote sensing and aerial images, we propose a Multi-stage Remote Sensing and Aerial Image Object Detection based on Hierarchical Transformer Network (HT-Net). First, Swin Transformer is used to strengthen feature extraction and suppress background noise. Second, a Transformer-based Feature Fusion Network using cross attention is proposed to establish semantic mappings between different levels and to improve the fusion of small-target features by capturing context information across levels. Third, because the proposals extracted by the Region Proposal Network tend to introduce noise and raise the rates of missed and false detections, a Multi-field Channel Attention Network is used to compensate for the Transformer's neglect of channel information. Moreover, the Multi-field Channel Attention Network adaptively learns the importance of target features in different fields and enhances the boundary information of every target, so as to capture small-target features against complex backgrounds.

The proposed models are extensively compared with other state-of-the-art remote sensing and aerial image object detection models on the DOTA, HRSC2016, and RSDD datasets, and the results show a significant improvement over those models. On DOTA, HA-Net reached an mAP of 77.04% with single-scale testing and 78.28% with multi-scale testing, and HT-Net reached 77.87% and 79.25%, respectively. On HRSC2016, HA-Net and HT-Net reached an mAP of 89.95% and 90.12%, respectively. On RSDD, HA-Net and HT-Net reached an mAP of 89.38% and 90.00%, respectively. These improvements demonstrate the effectiveness of HA-Net and HT-Net in remote sensing and aerial image object detection.
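
The abstract mentions the Circular Smooth Label for rotated object detection without giving details. As a rough illustration only, the sketch below shows the general idea of that technique: the box angle is treated as a classification target over discrete bins, and the one-hot label is replaced by a window that wraps around the ends of the angle range, so that neighbouring angles (including those across the boundary) receive partial credit. The function names, the 180-bin layout, the Gaussian window, and the radius and sigma values are all assumptions made for this example and are not taken from the thesis.

```python
import math


def csl_encode(angle_deg, num_bins=180, angle_range=180.0, radius=6, sigma=6.0):
    """Encode an angle as a circular smooth label (illustrative sketch).

    All defaults are assumed values for demonstration, not thesis settings.
    """
    bin_width = angle_range / num_bins
    # Index of the bin holding the ground-truth angle, wrapped into range.
    center = int(round(angle_deg / bin_width)) % num_bins
    label = []
    for i in range(num_bins):
        # Circular distance: the last bin is a neighbour of the first bin.
        d = abs(i - center)
        d = min(d, num_bins - d)
        if d < radius:
            # Gaussian window centred on the ground-truth bin.
            label.append(math.exp(-(d * d) / (2.0 * sigma * sigma)))
        else:
            label.append(0.0)
    return label


def csl_decode(scores, angle_range=180.0):
    """Return the angle (in degrees) of the highest-scoring bin."""
    num_bins = len(scores)
    best = max(range(num_bins), key=lambda i: scores[i])
    return best * (angle_range / num_bins)


if __name__ == "__main__":
    label = csl_encode(179.0)
    # Bins 178 and 0 are both one step from bin 179, so they share a value.
    print(label[178] == label[0])
    print(csl_decode(label))
```

With a plain one-hot label, predicting 0 degrees for a 179-degree box would be penalised as heavily as predicting 90 degrees; the wrapped window is one way to make the loss tolerant of such near-boundary predictions.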
Keywords/Search Tags: Remote Sensing and Aerial Images, Rotation Object Detection, Transformer, Attention Mechanism