| With the continuous improvement of living standards, large gatherings of all kinds have become increasingly common, and dense crowds form frequently. Density estimation for dense crowds in complex scenes still faces many challenges, such as crowd scale variation, background interference, and occlusion. Crowd counting uses images and videos captured by cameras as its data source. The distance between the crowd and the acquisition device changes the apparent scale of the people in the scene, while buildings, vehicles, plants, and billboards cause occlusion and interference, which seriously affects counting accuracy. In addition, when a convolutional neural network processes crowd images, the pooling layers discard image feature information, which further reduces counting accuracy. This paper therefore studies crowd density estimation algorithms for complex scenes, addressing four factors that degrade crowd counting: crowd scale variation, background interference, object occlusion, and pooling operations. To address crowd scale variation, background interference, and occlusion in the image, this paper designs a crowd density estimation algorithm that combines attention with hybrid dilated (atrous) convolution. The network uses an SE (Squeeze-and-Excitation) attention module to apply channel-wise weights that emphasize head-position features and suppress non-head features, and it enlarges the receptive field through hybrid dilated convolution. Because the dilation rates are mixed, the hybrid dilated convolution also avoids the "grid effect". The front-end network is built from the first 10 layers of the VGG-16 network and is followed by the hybrid dilated convolutional back-end network. Through the combined effect of SE attention and hybrid dilated convolution, a crowd density map of higher quality and accuracy is obtained. To address the problem that the
pooling operation of a convolutional neural network discards feature information, this paper designs a crowd density estimation algorithm with upsampling. In the feature extraction module, the first 10 layers of the VGG-16 network form the front-end network, and the back-end network consists of 6 hybrid dilated convolution layers. The feature extraction module outputs a feature map. The dense upsampling operation of the upsampling module then converts this feature map into multiple feature sub-maps according to the network's pooling factor and stacks them along the channel dimension. The resulting feature map has the same size as the original image fed into the network, which compensates for the feature information lost through pooling. To verify its feasibility, the proposed algorithm was trained and tested on two public datasets, ShanghaiTech and Mall. Experimental results show that the proposed algorithm is suitable for crowd counting in complex scenes. Comparison with current mainstream crowd density estimation algorithms shows that the proposed algorithm outperforms some of them and improves crowd counting accuracy. |
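The channel weighting performed by the SE attention module can be illustrated with a minimal pure-Python sketch. It follows the usual Squeeze-and-Excitation pattern (global average pooling, two fully connected layers, a sigmoid gate); the function names, the weight matrices `w1` and `w2`, and the tiny input are hypothetical and are not taken from the paper's network.

```python
import math


def se_channel_weights(feature_map, w1, w2):
    """Return one gate value in (0, 1) per channel.

    feature_map: list of C channels, each an H x W list of lists.
    w1: hidden x C weight matrix (assumed, no bias for brevity).
    w2: C x hidden weight matrix (assumed, no bias for brevity).
    """
    # Squeeze: global average pooling reduces each channel to one number.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: fully connected layer followed by a sigmoid gate.
    return [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]


def se_reweight(feature_map, w1, w2):
    """Scale every channel by its gate, keeping the H x W shape unchanged."""
    gates = se_channel_weights(feature_map, w1, w2)
    return [
        [[value * g for value in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]


# Illustrative input: channel 0 stands in for a "head" response,
# channel 1 for background. The weights are made up for the example.
features = [[[3.0, 3.0], [3.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]]]
w1 = [[1.0, -1.0]]
w2 = [[2.0], [-2.0]]
gates = se_channel_weights(features, w1, w2)
reweighted = se_reweight(features, w1, w2)
```

With these made-up weights the first channel keeps almost all of its response while the second is scaled close to zero, which is the kind of suppression of non-head features the abstract describes. In a trained network the weights are learned rather than hand-picked.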
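The "grid effect" mentioned in the abstract can be made concrete with a small one-dimensional sketch. It computes which input offsets can influence one output position after a stack of 3-tap dilated convolutions. The dilation rates `[2, 2, 2]` and `[1, 2, 3]` are assumptions chosen for illustration; the abstract does not state the rates used in the paper.

```python
def covered_offsets(dilation_rates, kernel_size=3):
    """Input offsets that reach one output position through stacked dilated convs."""
    half = kernel_size // 2
    offsets = {0}
    for d in dilation_rates:
        # A kernel with dilation d samples positions -half*d, ..., 0, ..., +half*d.
        taps = [k * d for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return sorted(offsets)


def coverage_ratio(dilation_rates, kernel_size=3):
    """Fraction of positions inside the receptive field that are actually used."""
    offsets = covered_offsets(dilation_rates, kernel_size)
    extent = offsets[-1] - offsets[0] + 1
    return len(offsets) / extent


uniform = covered_offsets([2, 2, 2])  # same rate in every layer
mixed = covered_offsets([1, 2, 3])    # mixed rates, as in hybrid dilated convolution
```

Both stacks span the same 13 positions, but the uniform stack only touches every second position, leaving holes in the receptive field, while the mixed stack touches all 13. This is a 1D simplification; in 2D the same holes appear along both axes.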
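The dense upsampling step can be sketched as a channel-to-space rearrangement: `r * r` low-resolution sub-maps are interleaved into one map that is `r` times larger along each axis, where `r` stands for the pooling factor. This is one plausible reading of the abstract's description, not the paper's confirmed implementation. The helper names and the choice `r = 2` are assumptions, and summing the density map to obtain a count is the usual convention in density-based counting rather than something the abstract states.

```python
def dense_upsample(sub_maps, r):
    """Interleave r*r sub-maps of size H x W into one (H*r) x (W*r) map."""
    if len(sub_maps) != r * r:
        raise ValueError("expected r * r sub-maps")
    h, w = len(sub_maps[0]), len(sub_maps[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for c, sub in enumerate(sub_maps):
        # Sub-map c fills the pixel at offset (dy, dx) inside every r x r cell.
        dy, dx = divmod(c, r)
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = sub[y][x]
    return out


def crowd_count(density_map):
    """Estimated count: the sum of all density values."""
    return sum(sum(row) for row in density_map)


# Four 1 x 2 sub-maps with r = 2 give a single 2 x 4 map.
sub_maps = [[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]], [[7.0, 8.0]]]
full_map = dense_upsample(sub_maps, 2)
total = crowd_count(full_map)
```

Because the rearrangement only moves values and never averages or discards them, the total over the upsampled map equals the total over the sub-maps.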