Research On 3D Human Pose Estimation Based On Spatiotemporal Semantic Graph Attention

Posted on:2023-01-24

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Guo

Full Text:PDF

GTID:2568306848467054

Subject:Engineering

Abstract/Summary:

3D human pose estimation locates the 3D coordinates of human joints from images or videos. To address the inaccurate predictions caused by occlusion and complex poses, this thesis explores how human body topology and temporal information can improve 3D human pose estimation. The main work is summarized as follows.

First, to address inaccurate 3D pose estimation caused by occlusion and ambiguity, this thesis uses the prior information of the human pose topology in a network that combines graph convolution with an attention module. The attention module extracts global pose information, while the graph convolution captures the spatial constraints between adjacent joints and strengthens their mutual influence. Finally, a linear layer regresses the pose representation to the 3D pose space to obtain the 3D human pose.

Second, to address the jitter of single-frame prediction along the temporal dimension, this thesis uses both temporal and spatial information to construct a frame-level progressive aggregation network based on a spatiotemporal Transformer. A spatial encoder models the relationships between the human joints in each video frame, and a temporal encoder produces a pose representation that carries temporal information. Strided convolution aggregates local temporal information and gradually reduces the sequence length, so that the network focuses on predicting the 3D pose of the intermediate frames of the video.

Third, to address occlusion and the information loss that occurs when spatial and temporal information are extracted separately, this thesis adds more spatial constraints and constructs a spatiotemporal graph attention network. For spatial information extraction, attention models global spatial information, and the adjacency matrix of the graph convolution is improved with additional local spatial constraints on kinematic connections and symmetry, which highlights the role of local information in estimating the pose of occluded parts. Temporal convolutional networks model the temporal dimension. To reduce the loss of spatiotemporal information, temporal convolution and graph attention modules are interleaved into a single network, which is then used to predict 3D poses.

To verify the effectiveness of the proposed methods, quantitative and qualitative experiments are carried out on the public datasets Human3.6M and HumanEva. The experimental results show that, compared with other similar methods, the proposed model significantly improves prediction accuracy.
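
The idea of adding kinematic and symmetry constraints to the adjacency matrix of a graph convolution can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the 5-joint toy skeleton, the edge lists, the symmetric normalization, the feature sizes, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton (NOT the Human3.6M joint layout):
# 0 = pelvis, 1 = left hip, 2 = right hip, 3 = left knee, 4 = right knee
NUM_JOINTS = 5
KINEMATIC_EDGES = [(0, 1), (0, 2), (1, 3), (2, 4)]  # bone connections
SYMMETRY_EDGES = [(1, 2), (3, 4)]                   # left/right joint pairs


def build_adjacency(num_joints, kinematic_edges, symmetry_edges):
    """Undirected adjacency with self-loops, kinematic and symmetry edges."""
    adj = np.eye(num_joints)
    for i, j in list(kinematic_edges) + list(symmetry_edges):
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (an assumed choice)."""
    deg = adj.sum(axis=1)  # at least 1 per joint because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(x, adj_norm, weight):
    """One graph convolution: aggregate neighbour features, then project."""
    return adj_norm @ x @ weight


rng = np.random.default_rng(0)
adj = build_adjacency(NUM_JOINTS, KINEMATIC_EDGES, SYMMETRY_EDGES)
adj_norm = normalize_adjacency(adj)

x = rng.normal(size=(NUM_JOINTS, 2))   # stand-in 2D keypoints, one row per joint
weight = rng.normal(size=(2, 16))      # stand-in for learned projection weights
features = graph_conv(x, adj_norm, weight)
print(features.shape)  # (5, 16)
```

In this sketch the symmetry edges let the left knee exchange features directly with the right knee, so an occluded joint can draw on its mirrored counterpart as well as on its kinematic neighbours.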

Keywords/Search Tags:

3D human pose estimation, graph convolutional network, temporal convolutional network, self-attention, Transformer

Related items

1	Research On 3D Human Pose Estimation Based On Attention Mechanism
2	Human Pose Estimation Based On Convolutional Neural Network
3	Research On Human Pose Estimation Algorithm Based On Transformer
4	Design And Implementation Of Human Behavior Recognition System Based On Pose Estimation And Graph Convolutional Neural Network Construction
5	Research On 3D Human Pose Estimation Based On Monocular Video
6	Human Pose Synthesis Based On Two-stream Deep Neural Network
7	Human Pose Estimation By Deep Learning
8	Research On Understanding Interactive Behavior Of Human Pose Information Reconstruction Based On Skeletal And Image Features
9	Research Of Human Pose Estimation Method Based On Convolutional Neural Network
10	Researches On Multi-person Human Pose Estimation In Natural Scene