| With rapid economic development and a growing population, large-scale crowd gatherings are becoming increasingly frequent, which poses serious risks to public safety. Intelligent video surveillance systems help safeguard public safety and have high application value in the security field. Crowd counting, an important component of such systems, aims to estimate the total number and the spatial distribution of people in different scenes. Supported by advances in artificial intelligence and computing hardware, research on crowd counting has steadily deepened and made notable progress. However, existing counting algorithms still face many challenges, such as background noise in complex scenes being mistaken for crowds and scale variation caused by perspective effects. To further improve counting performance, this dissertation studies these two problems. The specific work and contributions are as follows: 1. To address the misjudgment of crowds caused by background noise in complex scenes, a crowd counting algorithm based on foreground enhancement and hierarchical connection is proposed, which alleviates background interference at both the feature-map level and the density-map level. First, a channel-attentive receptive field module is designed, which strengthens feature representation by connecting dilated (atrous) convolutions with different receptive fields in parallel. Second, a foreground enhancement module is built that supervises the deep features so that they become aware of crowd regions, and generates attention to adjust the shallow features, thereby alleviating background interference at the feature level. Then, the deep features and the attention-adjusted shallow features are fused through hierarchical connections, which improves the model's ability to perceive detail. Afterwards, a dual-branch module is constructed in which the semantic segmentation branch provides an attention map for the density regression branch, alleviating background interference at the density-map level. Finally, experiments are carried out on the ShanghaiTech, UCF-QNRF, JHU-CROWD++ and NWPU-Crowd datasets. The results show that the proposed algorithm achieves competitive counting performance, and ablation experiments verify its effectiveness. 2. To address the scale variation caused by perspective effects, a crowd counting algorithm based on Transformer and semantic enhancement is proposed. First, the algorithm models global context information with a Transformer, obtaining a global receptive field to cope with continuous and drastic scale variation. Afterwards, a semantic enhancement module is built and embedded in the feature pyramid structure. It establishes positional correspondences between adjacent feature levels to deal with feature misalignment, and learns the importance of corresponding positions in the deep and shallow features to perform weighted fusion, yielding multi-level, fine-grained features with rich semantic information. Then, a feature selection module is constructed that aggregates the semantically strong multi-level features and generates per-level weights to dynamically select effective features, so as to output high-quality crowd density estimation maps. Finally, experiments are carried out on several standard crowd counting datasets. The results show that the proposed algorithm achieves competitive counting performance, and ablation experiments verify its effectiveness. |
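The abstract does not give implementation details for the density-map-level step of the first contribution, so the following is only a minimal sketch of the general idea. It assumes that the attention map from the segmentation branch is applied to the regressed density map by element-wise multiplication, and that the crowd count is the sum of the resulting map. The function name `apply_foreground_attention` and the toy values are hypothetical and are not taken from the dissertation.

```python
def apply_foreground_attention(density, attention):
    """Multiply a density map element-wise by a foreground attention map.

    Both arguments are 2-D lists of floats with the same shape. Attention
    values are assumed to lie in [0, 1], with 0 marking background.
    Returns the refined map and its sum, which is the estimated count.
    """
    if len(density) != len(attention) or any(
        len(d_row) != len(a_row) for d_row, a_row in zip(density, attention)
    ):
        raise ValueError("density and attention maps must have the same shape")
    refined = [
        [d * a for d, a in zip(d_row, a_row)]
        for d_row, a_row in zip(density, attention)
    ]
    count = sum(sum(row) for row in refined)
    return refined, count


# Toy 3x3 example (assumed values): the bottom-right response is a false
# detection on background, which the attention map marks with 0.
density = [
    [0.5, 0.5, 0.0],
    [0.25, 0.75, 0.0],
    [0.0, 0.0, 0.5],
]
attention = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]

raw_count = sum(sum(row) for row in density)  # 2.5, includes the false response
refined, count = apply_foreground_attention(density, attention)  # count == 2.0
```

In this toy case the background response inflates the raw count from 2.0 to 2.5, and masking with the attention map removes it. A learned attention map would normally be soft (values between 0 and 1) rather than binary.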