
Research On Human Parsing Method Based On Deep Convolutional Neural Network

Posted on: 2021-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: P Ni
Full Text: PDF
GTID: 2428330620470572
Subject: Computer Science and Technology
Abstract/Summary:
Human parsing is a complex, fine-grained segmentation task in computer vision, and its wide range of applications has attracted growing attention from researchers. With the development of deep learning, convolutional neural networks have been applied successfully across computer vision, and human parsing based on convolutional neural networks has made major breakthroughs. However, when faced with varying human poses, complex scenes, and diverse clothing, the parsing results still show obvious problems: discontinuous regions, incorrect recognition, and inaccurate results. This thesis studies these problems with methods based on convolutional neural networks. The work includes the following.

First, considering the limited ability of a single network model to extract features, a multi-stage two-way human parsing network, MTCnet, is proposed. MTCnet combines encoding and decoding networks with atrous convolution. It has two feature extraction subnets that learn multi-scale feature information, so it can learn richer human semantic features than a single network. Unlike previous single-stage methods, it adopts multi-stage learning and is trained with intermediate supervision. Each stage improves on the human parsing result of the previous stage, and the final stage produces the best result. Experimental results show that the proposed method has stronger feature extraction capability and produces more accurate parsing results than some current advanced methods.

Second, the encoding and decoding network models Segnet and U-net are analyzed. Although Segnet and U-net learn global and local information well, they perform only simple downsampling and upsampling operations, without exploiting the differences between features through exchange learning. To address this, a human parsing method based on a multi-level deep feature exchange network, DFEnet, is proposed. DFEnet not only learns high-dimensional features at different resolutions but also supports feature exchange learning across those resolutions. After DFEnet has extracted human semantic features, atrous hourglass pooling performs multi-scale learning on the extracted features. Experimental results on the LIP dataset show that the proposed method achieves better parsing results.
Keywords/Search Tags: Human parsing, Atrous convolution, Intermediate supervision, Multi-stage, Feature exchange, Atrous hourglass pooling
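
The abstract relies on atrous convolution for multi-scale feature learning without describing the operation itself. As a minimal illustration only, the following sketch shows a one-dimensional atrous (dilated) convolution in plain Python. It is not the thesis's MTCnet or DFEnet code; the function names `atrous_conv1d` and `receptive_field` and the example inputs are assumptions made for this illustration. The point it shows is that increasing the dilation rate widens the span of input covered by each output without adding kernel weights.

```python
def receptive_field(kernel_size, rate):
    """Input span covered by one output of a dilated kernel (hypothetical helper)."""
    return (kernel_size - 1) * rate + 1


def atrous_conv1d(signal, kernel, rate=1):
    """'Valid' 1D atrous convolution in cross-correlation form (illustrative only).

    Kernel taps are placed `rate` samples apart, so rate=1 is an ordinary
    convolution and larger rates skip over intermediate samples.
    """
    span = receptive_field(len(kernel), rate)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Sample the input every `rate` steps and weight by the kernel taps.
        value = sum(w * signal[start + k * rate] for k, w in enumerate(kernel))
        outputs.append(value)
    return outputs


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)    # each output spans 3 samples
dilated = atrous_conv1d(signal, kernel, rate=2)  # each output spans 5 samples
```

Running the same kernel at several rates and combining the outputs is one common way to gather multi-scale context; whether the thesis combines rates in this way is not stated in the abstract.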