Unmanned aerial vehicles (UAVs) have been widely applied in recent years in fields such as traffic monitoring, security prevention and control, and photographic mapping, owing to their strong maneuverability and practicality. Real-time vehicle detection from UAV imagery is a key technology in applications such as vehicle tracking, road condition acquisition, and safety inspection, and it therefore carries both practical research significance and real application demand. However, because of uncertain shooting heights, diverse scenes, and external influences such as bad weather and occlusion by buildings, vehicles in UAV images are often occluded, blurred, small in size, and set against complex backgrounds, so fast and accurate UAV vehicle detection still faces many challenges. Starting from these practical difficulties, we exploit high-level semantic context, low-/mid-level features, and other rich feature information suited to UAV vehicles to construct corresponding deep learning detection frameworks, and we propose three real-time multi-scale feature detection algorithms that achieve fast and accurate vehicle detection in complex scenes. The main work of this paper covers the following three aspects.

First, from the UAV's perspective most vehicles are difficult to detect because of their small size, complex backgrounds, and the imbalance between hard and easy examples. We therefore use effective context to construct a real-time multi-scale feature fusion detection algorithm for weak and small vehicles. Different from classic multi-scale feature fusion built on a top-down architecture, we combine each shallow feature with its adjacent deep feature. The purpose is to supply small vehicles with suitable and effective contextual information from high-level semantic features without introducing too much background interference, while still forming a multi-scale feature pyramid that handles the diversity of vehicle scales. In addition, we design suitable anchors according to the distribution of the UAV datasets and the effective receptive field of vehicles in the convolutional neural network, which improves the recall rate. Finally, to address the imbalance between the large number of easy examples and the small number of hard examples during training, an alternate training strategy that switches between the cross-entropy loss and the focal loss is proposed to adaptively increase the training weight of hard examples and improve detection performance. Experimental results show that the proposed algorithm preliminarily solves the problems caused by small vehicle size and example imbalance, and achieves accurate, real-time detection of vehicles from UAV imagery.
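For reference, the two losses between which the alternate training strategy switches are the standard cross-entropy loss and the focal loss; the alternation schedule and the values of the weighting factor \(\alpha_t\) and the focusing parameter \(\gamma\) are design choices of the proposed method that this summary does not fix:

\[
\mathrm{CE}(p_t) = -\log(p_t), \qquad \mathrm{FL}(p_t) = -\alpha_t\,(1 - p_t)^{\gamma}\,\log(p_t),
\]

where \(p_t\) is the predicted probability of the ground-truth class. The modulating factor \((1 - p_t)^{\gamma}\) shrinks the loss of well-classified examples (\(p_t\) close to 1), so the focal-loss phases of training concentrate the gradient on hard examples.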
Second, detection algorithms that consider only the transmission of high-level semantic information have difficulty classifying and localizing vehicles in UAV imagery accurately. In this paper we use the low-/mid-level features that accurately describe the vehicle target together with the high-level semantic information that distinguishes it from other categories to realize bi-directional information transmission, and we propose a real-time anchor-based vehicle detection algorithm with feature-information-enhanced multi-scale feature fusion. We design a lightweight multi-scale feature network to extract low-/mid-level features and embed it in the backbone network for accurate localization. At the same time, high-level semantic information is extracted from the backbone, which helps differentiate the target vehicle from the background and from other vehicle categories. We then apply multi-rate dilated convolutions for feature extraction to obtain richer low-/mid-level information from the lightweight multi-scale feature network, and present an effective feature fusion module that integrates this information into the backbone to enhance discriminative vehicle features. Finally, a channel feature enhancement module is designed to suppress redundant features and further improve vehicle discrimination. Experimental results show that the proposed method better realizes real-time, accurate classification and localization of UAV vehicles.

Third, anchor-based UAV vehicle detectors struggle to detect vehicles because of the imbalance between positive and negative examples, complicated anchor settings, excessive computation, and large variations in object scale. We introduce an anchor-free mechanism that increases the number of positive examples to a certain extent, thereby enriching vehicle feature information, alleviating the positive-negative imbalance during training, and avoiding anchor-related settings altogether. We then design an effective backbone adapted to the anchor-free mechanism, which preserves the precise localization information provided by high-resolution context and offers receptive fields that match small vehicles. Meanwhile, a multi-scale semantic enhancement block is proposed that strengthens the discriminative feature representation of vehicles without reducing the resolution of the prediction feature layers. Experimental results show that the proposed method greatly improves the overall performance of UAV vehicle detection while achieving real-time detection speed; for smaller vehicles in particular, its accuracy is about 2% higher than that of other methods.

In summary, the proposed methods achieve accurate and fast vehicle detection from UAV images and deliver favorable detection results, laying a solid foundation for application in more complex traffic environments.
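As a concrete illustration of the multi-rate dilated convolution feature extraction described in the second method, the following minimal PyTorch sketch runs parallel 3x3 convolutions with different dilation rates and fuses them with a 1x1 convolution; the module name, the dilation rates (1, 2, 4), and the channel counts are illustrative assumptions rather than the exact configuration of the proposed network.

    import torch
    import torch.nn as nn

    class MultiRateDilatedBlock(nn.Module):
        """Hypothetical sketch: parallel 3x3 convolutions with different
        dilation rates gather context at several scales without lowering
        the spatial resolution of the feature map."""

        def __init__(self, in_channels, out_channels, rates=(1, 2, 4)):
            super().__init__()
            self.branches = nn.ModuleList([
                nn.Sequential(
                    # padding = rate keeps the output the same spatial size
                    nn.Conv2d(in_channels, out_channels, kernel_size=3,
                              padding=rate, dilation=rate, bias=False),
                    nn.BatchNorm2d(out_channels),
                    nn.ReLU(inplace=True),
                )
                for rate in rates
            ])
            # a 1x1 convolution fuses the concatenated multi-rate responses
            self.fuse = nn.Conv2d(out_channels * len(rates), out_channels,
                                  kernel_size=1, bias=False)

        def forward(self, x):
            return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))

    # Example: a 64-channel feature map keeps its 80x80 spatial resolution.
    features = torch.randn(1, 64, 80, 80)
    print(MultiRateDilatedBlock(64, 128)(features).shape)  # torch.Size([1, 128, 80, 80])

Parallel dilated branches of this kind enlarge the receptive field while preserving spatial resolution, which is the property that lets a lightweight multi-scale feature network supply richer low-/mid-level information without losing localization detail.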