| With population growth and urban development, the number of large public places is increasing rapidly. Analyzing video of public places with computer vision yields pedestrian information, which supports pedestrian flow analysis and in turn strengthens the management and security of these scenes. An analysis of the YOLO v5 algorithm shows that its network architecture is built mainly on convolution, and convolution-based networks have limited ability to extract global features; meanwhile, the image encoding of Vision Transformer is redundant for vision tasks, which can cause pedestrian information to be missed. In addition, the background of a public scene usually interferes with pedestrian detection to some extent, and pedestrians move irregularly, so the same target is easily counted more than once, which introduces errors into the final pedestrian count. To address these problems, this paper covers three aspects: end-to-end object detection network design, multi-target pedestrian tracking, and pedestrian flow analysis. The main contributions are as follows: (1) The YOLO v5 backbone is improved by combining it with Vision Transformer. The Vision Transformer token is improved and combined with a token-based attention mechanism so that the backbone network learns useful features more effectively. (2) To make the pedestrian count more accurate and reduce repeated counting, a counting area is set in part of the image and the movement direction of pedestrians is distinguished, so that the flow direction can be analyzed while counting. Counting lines are also added to eliminate interference from irrelevant background. Experiments show that the improved YOLO v5-based pedestrian detection algorithm records pedestrian flow by direction of movement and eliminates interference from irrelevant background. Training and testing on the public pedestrian detection datasets Caltech and COP show that the system is robust. |
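
The direction-aware counting described in contribution (2) can be illustrated with a minimal sketch. This is not the paper's implementation: the class `LineCounter`, the helper `side_of_line`, and the assumption that an upstream tracker supplies a stable `track_id` and a box center for each pedestrian are all hypothetical. The sketch counts a track once, when its center moves from one side of a counting line to the other, and records the direction of the crossing. For simplicity the counting line is treated as infinite rather than as a bounded segment.

```python
from dataclasses import dataclass, field


def side_of_line(point, line_start, line_end):
    """Cross-product sign: positive on one side of the line, negative on the other."""
    (px, py), (ax, ay), (bx, by) = point, line_start, line_end
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


@dataclass
class LineCounter:
    """Hypothetical counter: one count per track ID, split by crossing direction."""
    line_start: tuple
    line_end: tuple
    forward: int = 0
    backward: int = 0
    last_side: dict = field(default_factory=dict)  # track_id -> last nonzero side
    counted: set = field(default_factory=set)      # track IDs already counted

    def update(self, track_id, center):
        side = side_of_line(center, self.line_start, self.line_end)
        if side == 0:
            return  # exactly on the line: decide on a later frame
        previous = self.last_side.get(track_id)
        self.last_side[track_id] = side
        if previous is None or track_id in self.counted:
            return  # first sighting, or this pedestrian was already counted
        if previous < 0 < side:
            self.forward += 1
            self.counted.add(track_id)
        elif side < 0 < previous:
            self.backward += 1
            self.counted.add(track_id)


# Usage with made-up track centers (image coordinates, y grows downward).
counter = LineCounter(line_start=(0, 100), line_end=(200, 100))
observations = [
    (1, (50, 80)), (1, (50, 95)), (1, (50, 120)),  # track 1 crosses downward
    (1, (50, 90)),                                 # track 1 returns: not recounted
    (2, (120, 130)), (2, (120, 90)),               # track 2 crosses upward
    (3, (10, 20)), (3, (10, 40)),                  # track 3 never reaches the line
]
for track_id, center in observations:
    counter.update(track_id, center)
print(counter.forward, counter.backward)
```

Keeping a set of already-counted track IDs is one simple way to avoid counting the same pedestrian twice when movement is irregular; pedestrians who never reach the line, like background detections away from it, do not affect the totals.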