| As target detection and UAV technology have become more closely integrated, a large number of excellent target detection algorithms have emerged. The unique characteristics of UAV aerial image datasets have made detection algorithms that balance high precision and high speed a research hotspot. This thesis discusses and improves a target detection algorithm for UAV aerial images. The improvements, built on the YOLOv4 network, and the experimental results are as follows: 1. To address the problems of image edge detection and image padding, the thesis studies and adds adaptive image input and adaptive image scaling at the input side. The image input uses the Mosaic data augmentation method, which randomly scales, stretches, and crops four input images and combines them into one. Targets are detected more effectively, so recall and accuracy improve, but preprocessing every input reduces the detection speed. Image scaling uses an adaptive method that pads the image with the smallest possible black border, which significantly improves inference speed. 2. To address the problems of channel information interaction and parameter optimization within the network, the thesis studies and adds image slicing and depthwise separable convolution in the backbone network. Image slicing samples pixels at equal intervals and stacks the resulting slices along the channel dimension in order to increase information interaction among the input channels. Depthwise separable convolution first performs a channel-by-channel convolution without fusing information between channels; a 1×1 convolution kernel is then used for pointwise convolution and channel fusion, which greatly reduces the number of parameters. The former effectively improves detection accuracy, and the latter significantly improves detection speed. 3. To address the problems of multi-scale detection and occluded target detection, the thesis adds dilated convolution and repulsion loss on the prediction side. Since the three prediction outputs of the model are responsible for small, medium, and large anchor boxes, dilated convolutions with dilation rates of 1, 2, and 3 are used to enlarge the receptive field of the corresponding feature maps, which effectively improves the detection of targets that vary in scale. In the position regression part of the loss, the repulsion term of the repulsion loss is added to increase the distance between the predicted box and both the background and the ground-truth boxes of other categories, which improves the recall of occluded targets. However, the more complex loss function increases the amount of computation, so the detection speed decreases. Weighing detection accuracy against speed, the thesis combines all improvements except the repulsion loss to obtain the most effective model. Trained on the VisDrone2019 dataset, the model reaches an average detection accuracy of 43.75%, an average recall of 57.13%, and a detection speed of 151 FPS. Its recall and detection speed far exceed those of the best algorithm reported on the dataset. The thesis analyzes the descending loss curve during model training and thereby verifies the effectiveness of the improved algorithm. In addition, a test experiment was carried out on the test dataset, and the practicability of the improved algorithm was verified from the resulting detection output images. |
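
The adaptive image scaling mentioned in item 1 can be illustrated with a small calculation. The sketch below is not the thesis implementation; it assumes a hypothetical 416-pixel network input and a stride of 32, and pads the resized image only up to the next multiple of the stride instead of up to a full square. The function name `letterbox_shape` and both default values are assumptions made for illustration.

```python
def letterbox_shape(width, height, target=416, stride=32):
    """Return the resized size and the total padding per axis.

    Assumed values: target=416 and stride=32 are illustrative only.
    """
    # Scale so that the longer side fits the target while keeping aspect ratio.
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Padding to a full square would need (target - new_w, target - new_h).
    # Padding only to the next multiple of the stride leaves fewer border pixels.
    pad_w = (target - new_w) % stride
    pad_h = (target - new_h) % stride
    return (new_w, new_h), (pad_w, pad_h)


size, padding = letterbox_shape(800, 600)
print(size, padding)
```

For an 800×600 image under these assumptions, the resized image is 416×312 and only 8 rows of padding are needed, compared with 104 rows when padding to a full 416×416 square. Fewer padded pixels mean less wasted computation at inference time.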
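
The image slicing step in item 2 (sampling at equal intervals and stacking the slices on the channel dimension) can be sketched in plain Python. This is a minimal illustration assuming an interval of 2 and a single-channel image stored as a list of rows; the function name `slice_to_channels` is hypothetical and the thesis may use a different interval or layout.

```python
def slice_to_channels(image):
    """Split one H x W channel into four (H/2) x (W/2) channels.

    Assumption: pixels are sampled with an interval of 2 in each direction.
    """
    return [
        [row[dx::2] for row in image[dy::2]]
        for dy in (0, 1)   # vertical offset of the sampling grid
        for dx in (0, 1)   # horizontal offset of the sampling grid
    ]


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
channels = slice_to_channels(image)
print(channels[0])  # pixels at even rows and even columns
```

No pixel is discarded: the spatial resolution is halved while the channel count is multiplied by four, so neighbouring pixels end up in different channels at the same spatial position.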
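
The parameter saving from depthwise separable convolution described in item 2 follows from a simple count. The sketch below compares a standard convolution with its depthwise separable counterpart, ignoring bias terms; the layer sizes (128 input channels, 256 output channels, 3×3 kernel) are illustrative assumptions, not values taken from the thesis.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights of a depthwise k x k plus a pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel, no channel fusion
    pointwise = c_in * c_out   # 1 x 1 convolution that fuses the channels
    return depthwise + pointwise


standard = standard_conv_params(128, 256, 3)
separable = depthwise_separable_params(128, 256, 3)
print(standard, separable, round(standard / separable, 2))
```

With these assumed sizes the standard layer has 294,912 weights and the separable one 33,920, a reduction of roughly 8.7 times.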
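
The effect of the dilation rates 1, 2, and 3 in item 3 can be made concrete with the standard formula for the effective kernel size of a dilated convolution, k_eff = k + (k - 1)(d - 1). The sketch assumes a 3×3 kernel, which the abstract does not state.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Assumed 3 x 3 kernel; dilation rates taken from the abstract.
sizes = [effective_kernel_size(3, d) for d in (1, 2, 3)]
print(sizes)
```

Under this assumption the three branches cover 3×3, 5×5, and 7×7 regions with the same number of weights, which is how a larger dilation rate widens the receptive field for the branch that handles larger anchor boxes.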