| With continued population growth and urbanization, accidents such as stampedes caused by overcrowding occur frequently, posing serious safety risks to society. Crowd counting aims to predict the total number of people in a scene and to present their spatial distribution, and it plays an important role in security surveillance and traffic monitoring. In particular, with COVID-19 still raging around the world, crowd flow warning and control have become crucial, and crowd counting has been widely studied in recent years as the most basic crowd analysis method. In this paper, we address the problems of uneven crowd density, severe occlusion, and background misprediction in complex scenes. We investigate the network structure and the supervision mechanism respectively, significantly improving model performance at very small spatial cost. The main research contributions and innovations of this paper are as follows. (1) A spatial-aware crowd counting algorithm based on feature enhancement is proposed to address the lack of head information in complex scenes caused by variations in lighting and weather, camera viewpoint changes, and occlusion between people. The expressiveness of crowd features is enhanced by improving the quality of spatial coding and exploring high-response features in the channel and spatial dimensions, which in turn improves the representational ability of the network. To improve the focus on key features and hard-example pixels, the key areas loss is proposed to supervise under-prediction in the foreground and errors in easy-to-misjudge regions, further improving model performance. (2) A foreground-enhanced crowd counting algorithm based on a multi-scale-aware mechanism is proposed to address the variation in head scale caused by camera viewpoint changes, as well as the foreground under-prediction and background misjudgment caused by various external factors. The foreground enhancement branch is trained to adaptively find the key region 
during feature extraction and to reduce under-prediction and misprediction by emphasizing the features in that region. Meanwhile, to help the network adapt to the multi-scale morphology of heads, the size of the test image is changed as a form of data augmentation while the resized images are trained with the same count label, which improves the scale perception of the network. (3) A density-aware crowd counting algorithm based on a regional supervision mechanism is proposed to adapt to uneven crowd density distributions and to decrease background errors in the crowd counting task. To reduce the impact of uneven densities on model performance, this algorithm improves the density awareness of the network by evaluating the importance of different features in space and adaptively enhances dense regions during training, which alleviates the poor predictions caused by severe occlusion and thus improves the density estimation and counting ability of the model. To alleviate the impact of background misprediction on the quality of the estimated density map and on the counting results, this algorithm helps the network recognize human head features by improving foreground counting accuracy and penalizing background errors, which significantly reduces background misprediction and further improves the density estimation and counting ability of the algorithm. In this paper, rich and detailed experiments are conducted on four commonly used crowd datasets. These experiments not only verify the validity of the proposed methods but also examine many related details, including but not limited to parameter selection, efficiency analysis, and deficiency analysis. Extensive experimental results verify the effectiveness, generality, and robustness of the proposed methods. |
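
The regional supervision idea in (3), improving foreground counting accuracy while penalizing background errors, can be illustrated with a minimal sketch. It assumes that the predicted count is the sum of a density map and that the foreground is wherever the ground-truth density is positive. The function names `count` and `regional_loss` and the weights `fg_weight` and `bg_weight` are hypothetical, chosen only for illustration; this is not the loss actually used in the paper.

```python
def count(density):
    """Estimated head count: the sum of all values in a 2D density map."""
    return sum(sum(row) for row in density)


def regional_loss(pred, gt, fg_weight=1.0, bg_weight=1.0):
    """Illustrative region-wise loss (an assumption, not the paper's formula).

    pred, gt: 2D lists of equal shape holding density values.
    Foreground = pixels where the ground-truth density is positive.
    """
    fg_pred = fg_gt = bg_err = 0.0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g > 0:
                # Accumulate predicted and true mass inside the foreground.
                fg_pred += p
                fg_gt += g
            else:
                # Any density predicted on background is treated as an error.
                bg_err += abs(p)
    # Foreground term: counting error restricted to the head region.
    fg_term = abs(fg_pred - fg_gt)
    return fg_weight * fg_term + bg_weight * bg_err


gt = [[0.0, 0.0], [0.5, 0.5]]      # one person, spread over two pixels
pred = [[0.25, 0.0], [0.5, 0.25]]  # under-predicts the head, leaks onto background
print(count(gt), regional_loss(pred, gt))
```

Separating the two terms lets the background penalty be weighted independently of the foreground counting error, which is one simple way to discourage a model from offsetting missed heads with spurious background density.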