3D Human Pose And Shape Estimation Based On Deep Learning

Posted on: 2022-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Gui
Full Text: PDF
GTID: 2558307070952439
Subject: Computer application technology
Abstract/Summary:
3D human pose and shape estimation is an important computer vision task with significant application value in video surveillance, human-computer interaction, and 3D special effects production. With the rise of deep learning, 3D human pose and shape estimation has developed rapidly, but many problems remain, such as the large domain gap among different datasets, the poor rendering quality of estimation results for unconventional poses, and low estimation accuracy under occlusion. This thesis proposes a targeted solution to each of these three problems, as follows:

(1) A 3D human pose and shape estimation method based on adaptive multi-domain learning. Ground-truth labels for 3D human pose and shape estimation are extremely difficult to obtain in outdoor scenes. To work around this, existing methods usually combine multiple existing datasets to train the model. However, because of the large domain gaps among these datasets, the resulting models have low estimation accuracy in outdoor scenes. To this end, a 3D human pose and shape estimation method based on adaptive cascaded multi-domain learning is proposed. First, the image is passed through the backbone network to generate features. The features are then sent to a cascaded multi-domain learning module, so that domain-related features can be extracted adaptively from coarse to fine. These features are more conducive to predicting the 3D human body model parameters. Using a hybrid training scheme, the proposed model outperforms state-of-the-art methods on standard outdoor datasets. The proposed method effectively addresses the low prediction accuracy in outdoor scenes caused by domain differences between datasets.

(2) A 3D human pose and shape estimation method based on multi-candidate prediction. With most human pose and shape estimation methods, the predicted 3D model commonly renders poorly on the image when unconventional human poses appear. This is caused by inaccurate prediction of the 3D human body model parameters and camera parameters. To solve this problem, a 3D human pose and shape estimation method based on multiple candidate predictions is proposed. First, a conditional variational autoencoder is used to predict multiple candidate combinations of 3D human body model parameters and camera parameters. Then, the candidate parameter sets are ranked based on 2D information. Finally, the best-ranked parameter set is selected as the final prediction. Experimental results show that the proposed method improves on the baseline method on indoor and outdoor datasets and achieves performance comparable to state-of-the-art methods. The proposed method effectively addresses the problem that predictions for unconventional human poses render poorly on the image.

(3) A 3D human pose and shape estimation method based on texture map completion in occluded scenes. Most methods do not specifically handle the estimation of human pose and shape under occlusion. However, occluded scenes are among the most common scenes in visual data, and the estimation accuracy of these methods is generally low when occlusion occurs, so dedicated research on 3D human pose and shape estimation under occlusion is necessary. To solve this problem, a 3D human pose and shape estimation method based on texture map completion is proposed. A texture completion branch is used to assist the position information prediction branch, improving prediction accuracy in occluded scenes. Further, multi-stage training and a variety of weakly supervised loss functions are used, allowing the method to estimate an accurate 3D human body model in occluded scenes. Experimental results show that the proposed method outperforms the state-of-the-art method by nearly 15 points on the occlusion dataset 3DOH50K, demonstrating that the method effectively addresses 3D human pose and shape estimation in occluded scenes.
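
The candidate-selection step of method (2) can be illustrated with a minimal sketch. The abstract only states that candidates are ranked "based on the 2D information"; the sketch below assumes, purely for illustration, that this means the mean 2D reprojection error against detected keypoints under a weak-perspective camera. The names `project`, `reprojection_error`, and `rank_candidates`, the camera layout `(scale, tx, ty)`, and the toy data are all hypothetical and are not taken from the thesis.

```python
import math
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]
Camera = Tuple[float, float, float]  # assumed weak-perspective camera: (scale, tx, ty)
Candidate = Tuple[Sequence[Point3D], Camera]  # (3D joints, camera parameters)


def project(joints: Sequence[Point3D], cam: Camera) -> List[Point2D]:
    """Weak-perspective projection: drop depth, then translate and scale."""
    s, tx, ty = cam
    return [(s * (x + tx), s * (y + ty)) for x, y, _ in joints]


def reprojection_error(joints: Sequence[Point3D], cam: Camera,
                       keypoints_2d: Sequence[Point2D]) -> float:
    """Mean Euclidean distance between projected joints and 2D keypoints."""
    projected = project(joints, cam)
    dists = [math.hypot(px - kx, py - ky)
             for (px, py), (kx, ky) in zip(projected, keypoints_2d)]
    return sum(dists) / len(dists)


def rank_candidates(candidates: Sequence[Candidate],
                    keypoints_2d: Sequence[Point2D]) -> List[Candidate]:
    """Sort candidates from lowest to highest reprojection error."""
    return sorted(candidates,
                  key=lambda c: reprojection_error(c[0], c[1], keypoints_2d))


# Toy data (hypothetical): two candidates for a two-joint "skeleton".
keypoints = [(0.0, 0.0), (1.0, 1.0)]
candidates = [
    ([(0.5, 0.5, 2.0), (1.5, 1.5, 2.0)], (1.0, 0.0, 0.0)),  # shifted from the keypoints
    ([(0.0, 0.0, 2.0), (1.0, 1.0, 2.0)], (1.0, 0.0, 0.0)),  # matches the keypoints
]
best = rank_candidates(candidates, keypoints)[0]
```

In this sketch the candidates would come from sampling the decoder of the conditional variational autoencoder several times; here they are hard-coded so that only the ranking logic is shown.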
Keywords/Search Tags: 3D human pose and shape estimation, multi-domain learning, conditional variational autoencoder, texture map completion