| Visual target tracking is widely used in fields such as intelligent surveillance, autonomous driving, and security robots, and has long been an important research topic in computer vision. Focusing on target tracking, this paper uses deep neural networks to study and improve three areas, namely face recognition, pedestrian detection, and target tracking, and realizes visual target detection and tracking in complex scenes. The main research contents are as follows. First, in face detection and recognition, targets in complex outdoor environments are difficult to detect, are often missed, and cannot be reliably identified, so an improved MTCNN face detection algorithm is proposed. Data augmentation is first applied to the data set to reduce the fitted model's dependence on a specific data set. The MTCNN-based face recognition algorithm is then improved in four respects: (1) illumination preprocessing of the input image, using an adaptive brightness adjustment algorithm based on Retinex theory to reduce the influence of lighting on the image; (2) optimization of the face classification loss, using the Focal Loss function as the face classification loss to improve face detection accuracy; (3) weight self-learning on the output of the R-Net network, so that this layer of the network pays more attention to the facial key points; and (4) in the FaceNet network, optimization of the Triplet Loss function and its combination with the ArcFace loss to improve the discriminability of face features. Experimental results show that the improved MTCNN algorithm improves the detection and recognition of face targets in complex scenes. Second, to address the problems of existing pedestrian detection algorithms in complex scenes, such as missed small targets and difficulty detecting occluded targets, a pedestrian detection algorithm based on an improved PP-YOLO is proposed. To make full use of the information
of each layer in the feature pyramid, this paper replaces the FPN in PP-YOLO with adaptive spatial feature fusion (ASFF) so that the feature information of all layers is fused, and uses the Mish activation function in the feature pyramid part of the PP-YOLO network structure to improve detection accuracy. In addition, the DIoU loss function is introduced to improve the accuracy of target localization. Ablation and qualitative experiments show that the improved PP-YOLO algorithm improves the detection of pedestrian targets in complex scenes. Finally, building on the face detection and pedestrian detection algorithms proposed in this paper, a human tracking model is modified to realize visual tracking of specific targets by associating faces with pedestrians. First, the improved PP-YOLO pedestrian detection algorithm is introduced into the Deep SORT pedestrian tracking model to improve the performance of its detection stage. Then, exploiting the characteristics of video sequences, the rate of change of the center position is used to associate motion features, which improves tracking speed. Experiments show that the deep-learning-based visual target tracking algorithm proposed in this paper improves the accuracy of target tracking. |
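
As an illustration of the two loss functions named above, the following is a minimal sketch of the standard published definitions of the binary Focal Loss and the DIoU loss, written in plain Python. It is not the implementation used in this paper: the function names, the box format `(x1, y1, x2, y2)`, and the default values `alpha=0.25` and `gamma=2.0` are assumptions chosen for illustration, and the exact variants and hyperparameters used in the improved MTCNN and PP-YOLO models may differ.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class (e.g. "face"), 0 < p < 1
    y: ground-truth label, 1 (positive) or 0 (negative)
    alpha, gamma: illustrative defaults, not values taken from the paper
    """
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma down-weights examples that are already well classified
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def diou_loss(box_a, box_b):
    """DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    penalty = d2 / c2 if c2 > 0 else 0.0

    return 1.0 - iou + penalty


if __name__ == "__main__":
    # A confident correct prediction contributes far less than a hard one.
    print(focal_loss(0.9, 1), focal_loss(0.1, 1))
    # Partially overlapping boxes: IoU term plus center-distance penalty.
    print(diou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

In this sketch the focal term shrinks the contribution of easy examples, which is the usual motivation for using it in a classification loss, while the DIoU penalty still gives a useful signal when the predicted and ground-truth boxes do not overlap, since the center distance remains nonzero.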