3D human pose estimation characterizes the human body intuitively and clearly, and is widely used in motion analysis, medical reconstruction, and other fields; it therefore carries significant research value and industrial application potential. Multi-view input effectively supplements information about the human body from multiple viewing angles, mitigating the adverse effects of occlusion, uneven lighting, and other problems caused by a single shooting angle. This thesis focuses on 3D human pose estimation in multi-view scenarios, aiming to further improve the accuracy, robustness, and generalization of multi-view 3D human pose estimation algorithms, and investigates the problem from both the single-person and multi-person perspectives. The main contributions of this thesis are as follows:

For multi-view 3D single-person pose estimation, current methods mine insufficient information from each individual view and fuse the 2D pose estimates of the different views with equal weights, which degrades the final 3D result. To address these problems, this thesis proposes a 3D single-person pose estimation method based on a visual Transformer. A self-attention mechanism with positional embeddings introduces long-range dependence in the feature extraction stage, human structural priors constrain the final result, and in the feature fusion stage the fusion weights are adjusted adaptively according to the quality of the 2D pose features of each view. Results on the public Human3.6M dataset show a 5% improvement on each metric compared with current mainstream methods.

For multi-view 3D multi-person pose estimation, in order to mine multi-view feature information in depth and simplify cross-view matching of joint features, this thesis proposes a 3D multi-person pose estimation method based on cross-view joint encoding. The method attends to features from other views by representing each feature point as a learnable positional embedding that jointly encodes body, joint, and view; the feature-fusion module is implemented with pairwise epipolar geometry and triangulation; and the multi-person pose regression module uses convolutional networks to judge the confidence of joint projections based on the localization and joint-grouping results of the preceding module, while constraining spatial geometric relationships according to the view inputs. Extensive experiments on public datasets show that the proposed method achieves results comparable to those of mainstream methods.
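Both contributions ultimately fuse per-view 2D joint estimates into 3D coordinates, with per-view confidence weighting and triangulation as building blocks. As an illustration only (a minimal sketch of the standard technique the abstract references, not the thesis's actual implementation; the function name and weighting scheme are assumptions), a confidence-weighted direct linear transform (DLT) shows how 2D detections from several calibrated cameras, each with a quality score, can be combined into one 3D joint:

```python
import numpy as np

def weighted_triangulate(points_2d, projections, weights):
    """Triangulate one 3D joint from N calibrated views via weighted DLT.

    points_2d:   (N, 2) pixel coordinates of the joint in each view
    projections: (N, 3, 4) camera projection matrices
    weights:     (N,) per-view confidences (e.g. 2D heatmap scores)
    """
    rows = []
    for (u, v), P, w in zip(points_2d, projections, weights):
        # Each view contributes two rows of the homogeneous system A X = 0,
        # scaled by its confidence so that unreliable views count for less.
        rows.append(w * (u * P[2] - P[0]))
        rows.append(w * (v * P[2] - P[1]))
    A = np.stack(rows)
    # Least-squares solution: the right singular vector associated with
    # the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize to 3D coordinates
```

Setting a view's weight near zero effectively removes it from the system, which is the same intuition as adaptively down-weighting low-quality 2D pose features during fusion.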