| Remote sensing images are increasingly valuable in fields such as land and resource management, environmental monitoring, meteorology, and military applications. They contain rich information, densely arranged targets, and large variations in scale, all of which make targets difficult to identify. Current mainstream target detection algorithms based on deep convolutional neural networks can be used to identify targets in remote sensing images. However, to extract deeper features from the data, these networks have grown progressively deeper and require more and more computing resources and storage. For remote sensing images with complex backgrounds and huge data volumes, how to make models lightweight and speed up training and detection has become a current research hotspot. This thesis focuses on making the target detection model lightweight while maintaining the accuracy of remote sensing image target detection. It mainly includes the following parts: (1) Construction of a lightweight target detection model, GM-YOLOv3. Existing target detection algorithms have many parameters, a large model size, slow detection speed, redundant feature maps produced during convolution, and poor detection performance on small targets. To reduce the parameters of the convolutional neural network and improve the detection of small targets in remote sensing images, the backbone network of YOLOv3 is redesigned by combining the hand-designed lightweight network GhostNet with the idea of residual structures, yielding the lightweight target detection algorithm GM-YOLOv3. First, GM-YOLOv3 uses a conventional convolution to obtain the intrinsic feature maps of the image, and then applies linear operations to these intrinsic feature maps to generate additional ghost feature maps. Finally, without changing the size of the feature map, deeper image features are
extracted. At the same time, to fit the remote sensing image data set and pass information more smoothly into the network, the k-means clustering algorithm is used to adjust the dimensions and number of candidate boxes in the prediction stage, and the Mish function is used as the activation function of the algorithm. Experiments show that GM-YOLOv3 has two-thirds as many parameters as the original algorithm, and its detection performance improves by 2.7% compared with the original. (2) In the GM-YOLOv3 algorithm, the receptive field of the feature map becomes smaller after downsampling and image feature information is lost, which lowers detection accuracy. Without increasing the number of model parameters, the GM-YOLOv3 model structure is improved based on the YOLOv3-tiny model. To better measure the accuracy of the predicted boxes, the GIoU regression loss is integrated into the improved model. Finally, borrowing the idea of the pyramid pooling layer in SPP-Net, a spatial pyramid pooling (SPP) layer is introduced before the upsampling operation of the three-scale prediction in the GM-YOLOv3-tiny algorithm. The SPP layer expands the receptive field, divides the feature map obtained after the convolution operation into levels ranging from fine to coarse, and aggregates the local features of the image through the Concat operation. This addresses the loss of feature information during downsampling in the GM-YOLOv3 algorithm and strengthens the learning ability of the network. Experiments show that the GM-YOLOv3-tiny algorithm reduces the number of parameters by a quarter and increases the accuracy rate by 2.5%, while its detection time and average detection accuracy remain almost the same as those of the GM-YOLOv3 algorithm. |
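The parameter saving that motivates part (1) can be illustrated with simple arithmetic. The sketch below compares a standard convolution with a GhostNet-style module, assuming a primary convolution that produces the intrinsic maps and one depthwise filter per ghost map. The names `ratio` and `cheap_kernel` and the layer sizes are illustrative assumptions, biases are ignored, and none of this is the configuration used in the thesis.

```python
import math


def conv_params(in_ch, out_ch, kernel):
    """Weight count of a standard convolution (bias ignored)."""
    return in_ch * out_ch * kernel * kernel


def ghost_params(in_ch, out_ch, kernel, ratio=2, cheap_kernel=3):
    """Weight count of a Ghost-style module (bias ignored).

    Assumed layout: a primary convolution produces ceil(out_ch / ratio)
    intrinsic maps, and one depthwise cheap_kernel x cheap_kernel filter
    per remaining output channel generates the ghost maps.
    """
    intrinsic = math.ceil(out_ch / ratio)
    ghost = out_ch - intrinsic
    primary = in_ch * intrinsic * kernel * kernel
    cheap = ghost * cheap_kernel * cheap_kernel
    return primary + cheap


# Hypothetical layer: 128 input channels, 256 output channels, 3x3 kernel.
standard = conv_params(128, 256, 3)
ghost = ghost_params(128, 256, 3)
print(standard, ghost, round(standard / ghost, 2))
```

With `ratio=2` the module needs roughly half the weights of the standard layer, because the cheap depthwise filters add very little on top of the halved primary convolution.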
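The k-means step for candidate boxes can be sketched as follows. This is a minimal version that assumes the YOLO convention of clustering (width, height) pairs with 1 - IoU as the distance; the abstract does not state which distance is used, and `sample_boxes` is made-up data standing in for the labels of a remote sensing data set.

```python
import random


def wh_iou(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs into k anchors using 1 - IoU as distance."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # The nearest anchor is the one with the highest IoU.
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:
                new_anchors.append(anchors[i])  # keep the anchor of an empty cluster
                continue
            mean_w = sum(w for w, _ in members) / len(members)
            mean_h = sum(h for _, h in members) / len(members)
            new_anchors.append((mean_w, mean_h))
        if new_anchors == anchors:
            break  # assignments are stable
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical box sizes in pixels.
sample_boxes = [(10, 12), (12, 10), (11, 11), (50, 60), (55, 58), (60, 52),
                (120, 200), (130, 190), (125, 210)]
anchors = kmeans_anchors(sample_boxes, k=3)
print(anchors)
```

Changing `k` adjusts the number of candidate boxes, and the returned cluster centres give their dimensions. The result depends on the random initialisation, so in practice the clustering is often repeated with several seeds.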
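For reference, the Mish activation is defined as x * tanh(softplus(x)) with softplus(x) = ln(1 + e^x). A scalar sketch in plain Python follows; the numerically stable form of softplus is an implementation choice made here, not something specified in the abstract.

```python
import math


def softplus(x):
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for value in (-2.0, 0.0, 1.0, 4.0):
    print(value, mish(value))
```

Unlike ReLU, Mish is smooth and lets small negative values through, which is the usual argument for information flowing more smoothly through the network.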
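The GIoU regression loss can be sketched for a single pair of boxes as below. It uses the standard definition, GIoU = IoU - (|C| - |A ∪ B|) / |C| with C the smallest enclosing box, and takes the loss as 1 - GIoU. The (x1, y1, x2, y2) box format is an assumption, and how the loss is weighted inside the thesis's model is not stated in the abstract.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # Area of the smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (enclose - union) / enclose


def giou_loss(pred, target):
    """Regression loss in [0, 2]; 0 means the boxes coincide."""
    return 1.0 - giou(pred, target)


print(giou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))
```

Unlike plain IoU, the loss keeps changing when the boxes do not overlap at all, so a prediction that misses the target still receives a useful signal.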
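One common way to build an SPP layer in YOLO-style detectors is to apply stride-1 max pooling with several window sizes and concatenate the results with the input along the channel axis. The sketch below does this on nested lists in plain Python. The kernel sizes 5, 9 and 13 are an assumption taken from that common variant, and the abstract does not say which sizes GM-YOLOv3-tiny uses.

```python
def max_pool_same(channel, kernel):
    """Stride-1 max pooling on a 2D list; windows are clipped at the borders."""
    height, width = len(channel), len(channel[0])
    radius = kernel // 2
    return [[max(channel[y][x]
                 for y in range(max(0, i - radius), min(height, i + radius + 1))
                 for x in range(max(0, j - radius), min(width, j + radius + 1)))
             for j in range(width)]
            for i in range(height)]


def spp(feature_map, kernels=(5, 9, 13)):
    """Concatenate the input channels with their pooled versions (channel axis)."""
    out = list(feature_map)
    for kernel in kernels:
        out.extend(max_pool_same(channel, kernel) for channel in feature_map)
    return out


# Hypothetical feature map: 1 channel, 4 x 4, laid out as [channel][row][column].
feature_map = [[[row * 4 + col for col in range(4)] for row in range(4)]]
pooled = spp(feature_map)
print(len(pooled), len(pooled[0]), len(pooled[0][0]))
```

The spatial size stays the same while the channel count grows by a factor of `len(kernels) + 1`, so each position sees context from progressively larger neighbourhoods.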