Remote sensing target detection technology plays an important role in national defense, military, and civilian economic fields. However, achieving high-precision, real-time detection of targets in optical remote sensing images still faces many challenges: how to mine the deep features of remote sensing targets at different scales in complex scenes; how to design a reasonable position loss function to solve the problem of inaccurate target localization; and how to reduce the complexity of the algorithm while keeping detection accuracy up to standard, so as to achieve fast detection. In view of these problems, this paper analyzes the characteristics of targets and backgrounds in remote sensing images and, building on the optical remote sensing target detection work completed by the research group, explores detection techniques with high accuracy and low complexity. The proposed algorithms are evaluated in terms of detection accuracy and complexity on multiple real remote sensing datasets. The innovations of this paper are summarized as follows:

(1) This paper proposes a multi-scale feature fusion remote sensing target detection algorithm named MFFD. A fully convolutional neural network, FCN-13, is designed to fully extract the spatial and semantic features of remote sensing targets; three independent detection layers are then designed to detect targets of different scales; finally, shallow and deep features are fused to improve the detection accuracy of small targets. The mAP of the MFFD algorithm on the NWPU VHR-10 and RSOD datasets is almost on par with the advanced YOLOv3 algorithm, while its detection speed is ahead of YOLOv3, reaching 160 FPS on the GPU and 4 FPS on the CPU.

(2) To further improve detection performance, GIoU is used instead of the L2 norm as the position loss function of the MFFD algorithm. GIoU is an improvement of IoU that better reflects the overlap of two bounding boxes and is scale-invariant, so it better constrains the MFFD algorithm toward the optimal solution during training. In tests on the NWPU VHR-10 dataset, the GIoU position loss function improves the mAP of the MFFD algorithm by 1.46% without reducing detection speed.

(3) To remove redundant parameters and reduce computation so that the algorithm can ultimately be applied in real engineering, depthwise separable convolution is introduced and a depthwise inverted residual structure is designed to replace the redundant 3 × 3 standard convolutions of the MFFD algorithm. This paper refers to the resulting algorithm as MFFD-DIR; its computational cost is reduced to 1.754 BFLOPs and its floating-point parameter count to 1,686,560. Binary quantization is then introduced to quantize the weights of some pointwise convolution layers of MFFD-DIR, further reducing the floating-point parameter count; the result is named the Light-MFFD algorithm, whose computational cost is reduced to 1.046 BFLOPs and floating-point parameter count to 932,928. On the NWPU VHR-10 dataset, the detection performance of Light-MFFD is ahead of the classic target detection algorithms COPD, SSD, and Faster R-CNN, with an mAP of 87.05%, which is 0.72% ahead of YOLOv3. On the RSOD dataset, Light-MFFD is not effective at detecting overpass targets, but the APs for the remaining target classes are all above 87.28%. The detection speed of Light-MFFD reaches 130 FPS on the GPU, 2.5 times that of YOLOv3, and 9 FPS on the CPU, 30 times that of YOLOv3.
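The GIoU position loss used in contribution (2) can be sketched as follows. This is a minimal reference implementation of the standard GIoU formula for axis-aligned boxes, not the thesis's actual training code; the (x1, y1, x2, y2) box format and the "1 − GIoU" loss form are common conventions assumed here.

```python
def giou(a, b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    GIoU = IoU - |C \\ (A U B)| / |C|, where C is the smallest box
    enclosing both A and B. Unlike plain IoU, it stays informative
    (negative) even when the boxes do not overlap, which helps the
    loss constrain non-overlapping predictions during training.
    """
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    # Intersection rectangle (clamped to zero when the boxes are disjoint)
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest enclosing box C
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    area_c = cw * ch
    return iou - (area_c - union) / area_c

def giou_loss(pred, target):
    """Position loss in the form commonly paired with GIoU: 1 - GIoU."""
    return 1.0 - giou(pred, target)
```

Because GIoU is a ratio of areas, scaling both boxes by the same factor leaves the loss unchanged, which is the scale invariance the abstract refers to.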
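The parameter savings behind contribution (3) can be illustrated with a short calculation. A standard k × k convolution uses k·k·Cin·Cout weights, while a depthwise separable convolution splits it into a k × k depthwise stage plus a 1 × 1 pointwise stage. The layer sizes below are hypothetical examples, not the actual FCN-13 layer dimensions.

```python
def conv_params(k, cin, cout):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * cin * cout

def dw_separable_params(k, cin, cout):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1 x 1 pointwise conv mixing the channels."""
    return k * k * cin + cin * cout

# Example: one 3 x 3 layer with 256 input and 256 output channels
std = conv_params(3, 256, 256)          # 589824 weights
sep = dw_separable_params(3, 256, 256)  # 67840 weights
print(std, sep, round(std / sep, 1))    # roughly an 8.7x reduction
```

The same per-layer arithmetic, applied across the 3 × 3 convolutions of MFFD, is what drives the drop to 1.754 BFLOPs and 1,686,560 floating-point parameters reported for MFFD-DIR; binarizing pointwise weights then removes them from the floating-point count entirely, giving Light-MFFD its further reduction.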