| With the rapid development of cities, subway construction is crucial to easing traffic congestion. Rapid and accurate detection of pedestrians in subway stations and real-time monitoring of passenger flow have therefore become important to the safe operation of the subway. At present, computer vision is driving the emergence of new technologies. Pedestrian detection can borrow ideas from general object detection, but it faces its own problems: pedestrians vary in posture, clothing, and scale, which makes detection more challenging. Because the subway environment is complex, pedestrian data sets collected by subway surveillance cameras are rarely free of defects. For example, moving pedestrians are captured with motion blur or deformation, and the lighting in underground stations produces uneven illumination. These defects degrade image quality and, to some extent, hinder the detection network's extraction of pedestrian features, which leads to missed or false detections. At the same time, running a detection model in practice consumes substantial computation and memory, while real-time requirements keep rising. This article studies the following points: 1. Subway pedestrian images with motion blur are deblurred with an improved DeblurGAN network, uneven illumination in the images is corrected, and the images are then assembled into five different data sets. Detection networks such as YOLOv3, YOLOv4, and Faster R-CNN are used to compare detection performance on the original data set and on the deblurred and illumination-corrected
images. The results show that models trained and tested on data sets processed by deblurring and illumination correction perform better, and the best results are obtained when both treatments are applied. 2. In view of the particular characteristics of subway pedestrians, the anchor box sizes and the training strategy of the detection network are optimized. New anchor box sizes are obtained by clustering, so that they better match the head-shoulder sizes in the subway pedestrian data set. The SWA (stochastic weight averaging) training strategy is applied: after the usual training, the model is trained for 12 more epochs with a cosine learning rate, and the averaged weights are taken as the final weights, which improves the mAP (mean average precision) of the detection model. 3. To improve the accuracy and processing efficiency of the detection model, the feature extraction backbone and the convolution modules of YOLOv4 are optimized, drawing on the lightweight network structure of the MobileNet series. The PANet in the neck of YOLOv4 is modified by adding concatenation connections between adjacent layers to improve prediction accuracy. In addition, the standard convolutions in PANet and in the YOLOv4 detection head (yolo-head) are modified along the lines of depthwise separable convolution, which further reduces the number of model parameters. Finally, the overall architecture of the detection model is designed on the basis of YOLOv4 so that the backbone outputs features at three scales, allowing features of subway pedestrians at different scales to be extracted. The detection networks before and after the improvement are trained and tested on the same data set. Multiple sets of comparative experiments show that the above optimizations improve the mAP of the detection model. Combining the optimal data set and the improved
network, the mAP of the final detection model reaches 91.31% and the detection speed exceeds 30 FPS, which meets the demands of real-time processing. |
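
The parameter saving attributed above to depthwise separable convolution can be checked with a short calculation. The sketch below compares the weight count of a standard k x k convolution with that of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution. The function names and the layer sizes (256 input channels, 512 output channels, 3 x 3 kernel) are assumptions chosen for illustration and are not taken from the article; bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, used only to illustrate the ratio.
c_in, c_out, k = 256, 512, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.4f}")
```

The ratio of the two counts equals 1/c_out + 1/k², so for 3 x 3 kernels and wide layers the separable form needs roughly one ninth of the weights. How much of that saving carries over to the full model depends on which layers are actually replaced.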