Object detection is an essential task in remote sensing image (RSI) analysis and understanding, and a prerequisite for tasks such as object segmentation, object tracking, and status monitoring. Over the past decade, with the rapid development of deep learning theory and technology, deep learning-based methods have greatly advanced object detection in RSIs. However, RSIs are characterized by a large field of view, complex backgrounds, coarse ground sample distance, and a bird’s-eye view, all of which make object detection difficult. First, because of the large field of view and coarse ground sample distance, many objects of interest occupy only a few pixels and appear as small objects. Second, because of the bird’s-eye view of the remote sensing imaging process, objects appear in arbitrary orientations. The traditional horizontal bounding box cannot enclose such objects tightly, which leads to missed detections. Finally, because of the limited revisit period of remote sensing platforms and the rarity of some objects of interest, it is difficult to collect enough object samples, and detectors designed to be trained on abundant labeled samples often perform poorly when only a few training samples are available. To this end, this dissertation addresses these problems and studies small object detection, oriented object detection, and few-shot object detection in RSIs. The work comprises the following three parts: (1) A feature super-resolution-based method is proposed for small object detection in RSIs. First, ResNet is adopted to extract deep semantic features and shallow features of the target, and the two are fused using a deformable convolution strategy. The fused features are fed to a center-based detector to detect suspected target areas. Second, the full-image feature map and the binary mask of the suspected target areas are used as input to the feature
super-resolution network to enhance the features of these areas. Finally, the detection head is applied to the super-resolved features to detect small objects. Experiments on three remote sensing datasets (UCAS-AOD, COWC, and VEDAI) show that the method achieves a mean average precision (mAP) of 96.5%, 94.4%, and 76.1%, respectively. (2) An object detection method based on an anchor-free region proposal network (RPN) is proposed for oriented object detection in RSIs. First, to suppress the semantic ambiguity of pixels in the feature maps, a cascaded criss-cross attention mechanism is adopted to mine global context information and enhance the feature representation. Second, a novel oriented bounding box representation based on the polar coordinate system is proposed to achieve accurate bounding box regression. Moreover, an anchor-free oriented region proposal network is designed to generate rotated proposals. Finally, an oriented detection head is adopted to obtain the final detection results. The proposed approach achieves an mAP of 76.57%, 67.15%, and 90.45% on the DOTA, DIOR-R, and HRSC2016 benchmarks, respectively. (3) A meta-learning-based method is proposed for few-shot object detection in RSIs. First, to extract the features shared by the few available samples, a knowledge learning network based on the information bottleneck is designed to obtain a common feature representation for each category. Its loss function is derived from information bottleneck theory. Second, a dual-attention-based RPN is proposed, in which a self-attention mechanism and knowledge-based external attention are used to enhance the RPN features. Finally, the proposal-region features are aggregated with the learned knowledge, and the enhanced features are fed to the object detection
head. The proposed method achieves an average mAP of 23.9% and 55.4% on the DIOR and NWPU VHR-10 datasets, respectively.