
Research On Multi-view 3D Human Pose Estimation Based On Deep Learning

Posted on: 2023-04-25  Degree: Master  Type: Thesis
Country: China  Candidate: C A Zhang  Full Text: PDF
GTID: 2558306914971289  Subject: Electronic and communication engineering
Abstract/Summary:
3D human pose estimation is the basis of human tracking and human action recognition in computer vision. Its goal is to locate the coordinates of human joints in three-dimensional space from images or image sequences and to connect them into a plausible skeleton that describes the three-dimensional posture of the human body in the image. It has broad application prospects in human-computer interaction, virtual reality, autonomous driving, and action guidance. Because of complex factors such as occlusion and illumination, existing deep learning-based 3D human pose estimation algorithms still suffer from large estimation errors. This thesis improves existing multi-view 3D human pose estimation network models to better exploit the effective information in feature maps and to raise accuracy under occlusion. The main work and results are as follows:

1. To address the problem that existing convolutional neural network-based methods do not make full use of global spatial ordering and human position information during feature extraction, a multi-view 3D human pose estimation model based on position learning is built. In the 2D detection stage, position encoding is used to model the relations between pixels in the image, and an attention mechanism is introduced into the backbone network to model feature dependencies along both the channel and spatial dimensions. In addition, features from adjacent views are used to enhance the spatial representation of the feature maps. The model uses ResNet-50 as the backbone network. The mean per joint position error (MPJPE) is reduced to 25.2 mm on the Human3.6M dataset and to 30.3 mm on the CMU Panoptic dataset.

2. To address the insufficient use of the internal relations among corresponding human joints across views in existing multi-view 3D human pose estimation algorithms, this thesis builds a 3D human pose estimation model based on multi-view feature fusion. Epipolar geometry is used to establish the point-line correspondence between two views, which makes better use of the effective information in adjacent views. At the same time, a feature fusion network is designed that effectively improves the spatial representation of feature maps in the 2D detection stage. The MPJPE of the model is reduced to 20.6 mm on the Human3.6M dataset.

3. To address the insufficient use of images from multiple viewpoints in multi-view datasets, and to improve on the deconvolution-based up-sampling module in existing pose estimation algorithms, a multi-view 3D human pose estimation model based on high-resolution restoration is built. This thesis designs a non-fixed view combination strategy that makes full use of the images from multiple viewpoints in the dataset, together with an up-sampling module based on sub-pixel convolution. The model reduces the MPJPE to 18.3 mm on the Human3.6M dataset.
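The abstract does not include implementation details, so the following is only an illustrative sketch: it shows, in PyTorch, the MPJPE metric used to report the 18.3-30.3 mm errors above, and a generic sub-pixel-convolution up-sampling block of the kind named in contribution 3. The function and class names, tensor shapes, and layer layout are assumptions, not the thesis's actual code.

    import torch
    import torch.nn as nn


    def mpjpe(pred, gt):
        """Mean per joint position error (MPJPE) in millimetres.

        pred, gt: tensors of shape (batch, num_joints, 3) holding predicted
        and ground-truth 3D joint coordinates in mm. The error is the
        Euclidean distance per joint, averaged over all joints and samples.
        (Shape convention is an assumption for illustration.)
        """
        return torch.norm(pred - gt, dim=-1).mean()


    class SubpixelUpsample(nn.Module):
        """Hypothetical sub-pixel convolution up-sampling block.

        A 3x3 convolution expands the channel dimension by scale**2, then
        nn.PixelShuffle rearranges those channels into a feature map that is
        `scale` times larger in height and width, as an alternative to a
        deconvolution layer.
        """

        def __init__(self, in_channels, out_channels, scale=2):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, out_channels * scale ** 2,
                                  kernel_size=3, padding=1)
            self.shuffle = nn.PixelShuffle(scale)

        def forward(self, x):
            return self.shuffle(self.conv(x))

For example, upsampling a (batch, 256, 64, 64) feature map with SubpixelUpsample(256, 17) would yield 17 heatmap channels at 128x128 resolution; how the thesis configures the channels and scale is not stated in the abstract.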
Keywords/Search Tags: human pose estimation, deep learning, attention mechanism, multi-view