| With the development of UAV remote sensing technology, object detection in UAV aerial images has gradually become a core technology in UAV applications, with important value in traffic planning, military reconnaissance, and environmental monitoring. In recent years, artificial intelligence has developed rapidly, and deep-learning-based methods have been widely used in object detection tasks with good results. However, targets in UAV images are small and densely distributed, and deployed models tend to have too many parameters and slow computation, making it difficult to achieve both accuracy and efficiency. To address the large number of small targets in aerial images and the limited memory and computing power of UAV devices, this paper proposes two improved object detection algorithms based on the YOLOv4 model, BLUR-YOLO and EM2-YOLO. The main contents of this paper are as follows: (1) To address the small size and dense distribution of targets in aerial images, BLUR-YOLO, an improved model based on multi-scale feature fusion, is proposed. First, h-swish activation functions are used in the backbone and neck networks to increase the expressive power of the model; second, a Coordinate Attention (Coord Attention) mechanism is added to the bottleneck layers of the backbone network, increasing the weight of effective information and suppressing background noise; finally, to strengthen the connection between low-level and high-level features, a feature pyramid network (Blur-PANet) is proposed to perform multi-scale feature fusion effectively. Experiments show that the proposed model achieves a 1.2% higher mAP than the original model on the VisDrone2019 dataset. (2) To facilitate deployment on UAVs, we propose EM2-YOLO, a lightweight improved model based on the attention mechanism. Because overly large models cannot be deployed on UAVs, the number of model parameters must be reduced, so we build a lightweight feature extraction network using depthwise separable convolution and an attention mechanism. First, we compare several lightweight feature extraction networks and choose MobileNetV2 to replace the backbone network, reducing the computational complexity of the model; second, we compare the advantages and disadvantages of several common attention mechanisms and choose the ECA attention mechanism for its interpretability and light weight; finally, we use the EIoU loss function to compute the bounding-box regression loss. Experiments show that the proposed model achieves an mAP of 81.1% on the PASCAL VOC dataset with fewer than half the parameters of the original model. |
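
Three of the building blocks named above have compact closed forms, and the sketch below illustrates them in plain Python: the h-swish activation, the parameter count of a depthwise separable convolution compared with a standard convolution, and the EIoU bounding-box regression loss. It follows the commonly published definitions of these techniques and is not taken from the paper's implementation. The function names, the `(x1, y1, x2, y2)` box format, the omission of bias terms, and the `eps` stabilizer are all assumptions made for illustration.

```python
def hswish(x: float) -> float:
    """h-swish activation: x * ReLU6(x + 3) / 6."""
    relu6 = min(max(x + 3.0, 0.0), 6.0)
    return x * relu6 / 6.0


def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def eiou_loss(box, gt, eps: float = 1e-9) -> float:
    """EIoU loss for two boxes given as (x1, y1, x2, y2).

    loss = 1 - IoU
           + center_distance^2 / enclosing_diagonal^2
           + (w - w_gt)^2 / enclosing_width^2
           + (h - h_gt)^2 / enclosing_height^2
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Widths and heights of the predicted and ground-truth boxes.
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    return (1.0 - iou
            + rho2 / (cw ** 2 + ch ** 2 + eps)
            + (w - gw) ** 2 / (cw ** 2 + eps)
            + (h - gh) ** 2 / (ch ** 2 + eps))
```

As a rough illustration of why depthwise separable convolution reduces model size, a 3 x 3 layer with 32 input and 64 output channels needs 18432 weights in standard form and 2336 in separable form under the assumptions above. The EIoU terms for width and height penalize shape mismatch directly, in addition to the overlap and center-distance terms.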