| In recent years, with the rapid development of optical imaging satellite technology, more and more satellites equipped with high-resolution sensors have been launched, and a large number of high-quality remote sensing images have been acquired. These images contain rich position information about ground objects such as airplanes, ships, and ground vehicles. Extracting this location information accurately is of great significance for both military and civilian applications. At the same time, remote sensing images taken by satellites tend to focus on ports, airports, and similar sites, where objects are densely arranged, vary widely in size, and appear in arbitrary orientations, all of which make target detection more difficult. As a result, target detection in remote sensing images has received extensive attention from researchers at home and abroad. The essence of object detection is to use image processing algorithms to extract image features and to determine the location of each target from those features. Current target detection algorithms usually also include the task of target recognition. Existing methods fall into two main groups: traditional object detection algorithms and the deep learning-based target detection algorithms that have performed well in recent years. Among current object detection methods, R2CNN++ was the first to add rotation angle prediction to bounding box regression in order to detect targets at arbitrary angles. However, R2CNN++ still leaves room for improvement in both detection speed and accuracy on remote sensing images, and this paper improves the R2CNN++ network on both counts. The network contains many redundant computations, which slow detection. This paper first identifies the most time-consuming part of the network, the backbone, and replaces the original ResNet101 with the simpler and faster DarkNet53; detection accuracy remains comparable while detection speed improves. Secondly, this paper designs a non-local spatial attention model and embeds it in DarkNet53 to expand the receptive field, add context information, and further improve detection accuracy. Finally, this paper analyzes the deficiencies of the R2CNN++ detection network and improves it in two respects. On the one hand, the IF-Net structure is improved by adding information fusion across different stages, which improves the detection of large-scale targets. On the other hand, the problems of the attention loss are analyzed, and a mask is added to the loss function to balance the loss across targets of different scales and improve detection accuracy. The experiments run on an RTX 2080 GPU platform and use the ImageNet and DOTA datasets to evaluate the model. The results show that, compared with the original R2CNN++ network, the final model is 53.8% faster. The experimental results and analysis also show that the detection of large-scale targets improves by 3.11%. In addition, embedding the non-local attention model proposed in this paper effectively improves detection accuracy. |
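To make the idea of adding a rotation angle to bounding box regression more concrete, the following is a minimal sketch of how a rotated box could be converted to its four corner points. The five-parameter form (center `cx`, `cy`, width `w`, height `h`, angle `theta_deg`) and the function name `rotated_box_corners` are assumptions chosen for illustration; the abstract does not state which parameterization or angle convention R2CNN++ or the improved network uses.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta_deg):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumed (hypothetical) parameterization: box center (cx, cy),
    side lengths w and h, and an angle theta_deg in degrees that
    rotates the box about its center.
    """
    theta = math.radians(theta_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0

    # Corner offsets of the axis-aligned box, relative to its center.
    offsets = [
        (-half_w, -half_h),
        (half_w, -half_h),
        (half_w, half_h),
        (-half_w, half_h),
    ]

    corners = []
    for dx, dy in offsets:
        # Apply a standard 2D rotation to each offset, then shift
        # the result back to the box center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


if __name__ == "__main__":
    # Example: a 20 x 10 box centered at (50, 40), rotated by 30 degrees.
    for x, y in rotated_box_corners(50.0, 40.0, 20.0, 10.0, 30.0):
        print(f"({x:.2f}, {y:.2f})")
```

With an angle of 0 the function returns an ordinary axis-aligned box, so a detector that regresses the extra angle term can still represent horizontal boxes as a special case.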