| The development of surveillance systems in urban areas and their improved detection efficiency have brought new solutions to criminal cases, while video detection technology has also been continuously developed and widely deployed. Nevertheless, in practical investigation work, public security personnel often need to watch a large volume of surveillance video around a crime scene to search for a suspected target. At present, officers search and analyze the images and trajectories of people in the video mainly by eye before finally determining the suspect, which costs considerable labor and material resources. With the continuous development of artificial intelligence technology and the stringent requirements of investigation in public security scenarios, pedestrian recognition has attracted growing attention, especially in the public security field. Person re-identification, formally defined as the problem of associating a person acquired by one camera with the same person previously acquired by any other camera in the network, at any location and at any time, can help public security personnel quickly lock onto, screen and track suspects in investigation tasks, and provides support during the golden period of crime detection.

Discriminative pedestrian features provide a more reliable basis for subsequent distance metrics and ranking optimization. Therefore, focusing on pedestrian appearance and studying how to extract more robust appearance features to represent pedestrians remain significant research problems. Deep features have become the dominant way to express pedestrian appearance. In addition, since an effective target segmentation region can exclude the interference of non-target features, it has a critical positive effect on target retrieval (a minimal sketch of this masking idea follows the list of bottlenecks below), and deep features are widely considered the most expressive representation. Consequently, this paper focuses on improving pedestrian recognition based on discriminative pedestrian part regions extracted with deep learning methods. However, the high capacity of deep learning steadily increases the required scale of training data, and the surveillance data in practical investigation environments are far more complicated. Pedestrian recognition based on the appearance of segmented regions therefore still faces the following three bottlenecks:

(1) Insufficient labeling. In order to obtain the appearance features of a pedestrian in a specific region, a pedestrian dataset must be trained to obtain high-confidence pedestrian parsing parts. However, in most practical environments, pixel-level labeled samples are scarce. If the model is learned directly from a public dataset, a bias in the feature expression may occur.

(2) Inaccurate parsing. Existing deep parsing models are constrained by pooling, strided convolution and other downsampling-like operations in the network, so object edges cannot be parsed accurately. Ambiguous edges introduce interference when the pedestrian parts are extracted, which makes the feature expression less robust.

(3) Low discrimination of extracted features. Complex factors such as viewpoint changes and motion blur across cameras lead to large variations in pedestrian appearance. Even if background interference is removed, recognizing pedestrians from appearance features alone is still inaccurate.
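The following is a minimal, illustrative Python/PyTorch sketch (not the thesis implementation; all names and shapes are assumptions) of the masking idea referenced above: pooling a deep feature map only inside a parsed body-part region, so that background activations do not contribute to the pedestrian descriptor.

```python
import torch
import torch.nn.functional as F

def masked_part_descriptor(feat_map, part_mask):
    """feat_map: (C, H, W) deep features; part_mask: (h, w) binary mask of one parsed part."""
    _, fh, fw = feat_map.shape
    # Resize the parsing mask to the feature-map resolution.
    mask = F.interpolate(part_mask[None, None].float(), size=(fh, fw), mode="nearest")[0, 0]
    # Average-pool the features only over pixels that belong to the part,
    # so non-target (background) responses are excluded from the descriptor.
    denom = mask.sum().clamp(min=1.0)
    return (feat_map * mask).flatten(1).sum(dim=1) / denom

# One descriptor per parsed part (e.g. head, torso, legs) can then be
# concatenated into the final appearance feature.
```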
To address these issues, this paper focuses on three aspects, namely model transfer, parsing model optimization and feature combination optimization, and achieves the following innovative results:

(1) Domain adaptation in human parsing. We establish attribute associations between the fashion datasets and the surveillance dataset by exploiting the rich information in the fashion data, and obtain the fashion map by extracting and utilizing the target appearance features of the surveillance data. We then transfer the model trained in the fashion domain to the surveillance domain. Experimental results show that the transferred model can accurately parse pedestrian parts in the surveillance environment, is significantly better than a model trained directly on the fashion dataset, and clearly outperforms state-of-the-art approaches.

(2) Compensation of human parsing based on video sequences. Deep learning methods are used to estimate the confidence of each part over adjacent multi-frame sequences. Optical flow is then used to establish mappings of the same part between different frames, and the dynamic variation characteristics of pedestrians are used to constrain the parts. Finally, a video-sequence-based CRF (conditional random field) is designed for inference (a flow-propagation sketch is given below). Experimental results show that the proposed method effectively resolves ambiguous segments in pedestrian sequences, and that the corresponding results can be fed back to the parsing model to improve pedestrian parsing on single static images.

(3) Combination optimization with a multi-task deep model. By searching for the most discriminative human parts and combining them to replace the traditional global pedestrian feature, the feature distance of the same target is reduced and the feature distance between different targets is enlarged, which optimizes person re-identification performance (a ranking sketch is also given below). Experimental results show that our algorithm achieves a larger improvement than traditional person re-identification algorithms based on global features and local point (or patch) features.

In summary, this paper addresses the bottlenecks of video detection technology and completes theoretical research on cross-domain expression, dynamic feature feedback, and combination optimization of part features in person re-identification feature extraction. It provides new support, in both basic theory and key technologies, for real-time rapid video detection applications of person re-identification. |
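For contribution (2), the sketch below illustrates one ingredient of the described pipeline in Python with OpenCV: propagating a per-part parsing map from frame t to frame t+1 with dense optical flow, so that a low-confidence parsing in one frame can be cross-checked against the flow-warped result from a neighbouring frame. This is a hypothetical, simplified sketch (Farneback flow, nearest-neighbour remapping), not the thesis code.

```python
import cv2
import numpy as np

def warp_parsing_to_next_frame(parsing_t, gray_t, gray_t1):
    """parsing_t: (H, W) uint8 part labels for frame t; gray_t/gray_t1: grayscale frames."""
    h, w = gray_t.shape
    # Backward flow: for every pixel of frame t+1, where it came from in frame t.
    flow = cv2.calcOpticalFlowFarneback(gray_t1, gray_t, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (xs + flow[..., 0]).astype(np.float32)
    map_y = (ys + flow[..., 1]).astype(np.float32)
    # Nearest-neighbour remapping keeps the discrete part labels intact.
    return cv2.remap(parsing_t, map_x, map_y, interpolation=cv2.INTER_NEAREST)

# The warped labels can then act as temporal consistency terms when a
# sequence-level CRF refines the per-frame parsing results.
```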
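For contribution (3), the following minimal NumPy sketch (hypothetical names, not the thesis model) shows the basic ranking step once discriminative parts have been selected: the chosen part descriptors are concatenated into a combined feature, and gallery candidates are ranked by distance to the query.

```python
import numpy as np

def combine_parts(part_feats, selected):
    """part_feats: dict part_name -> (d,) vector; selected: list of chosen part names."""
    v = np.concatenate([part_feats[p] for p in selected])
    return v / (np.linalg.norm(v) + 1e-12)        # L2-normalise the combined feature

def rank_gallery(query_parts, gallery_parts_list, selected):
    q = combine_parts(query_parts, selected)
    dists = [np.linalg.norm(q - combine_parts(g, selected)) for g in gallery_parts_list]
    return np.argsort(dists)                       # smaller distance = better match
```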