| Human pose estimation, an important task in computer vision, has long received widespread attention from researchers. With the development of deep learning, 2D human pose estimation has made significant progress, and attention is gradually turning to 3D human pose estimation. 3D human pose estimation aims to recover the skeleton model of the human body in 3D space from image or video data. It is a fundamental technology for virtual reality and augmented reality, and is widely used in fields such as sports, human-computer interaction, and film and game production. Compared with single-view 3D human pose estimation, multi-view 3D human pose estimation offers higher accuracy, a wider range of applications, and a stronger theoretical foundation, and it has therefore attracted a growing number of research institutions and researchers. However, multi-view 3D human pose estimation still faces significant challenges, including dataset scarcity, occlusion of the human body by the scene, and occlusion caused by interaction between people. These issues seriously degrade the accuracy of 3D human pose estimation, so solving them is the next research direction for multi-view 3D human pose estimation. In response, this paper focuses on multi-view 3D human pose estimation and improves existing solutions to raise the accuracy of both 2D and 3D human pose estimation results. The main work of this paper includes: (1) The paper builds a multi-view camera acquisition system and constructs an annotated human pose dataset with strong interaction and occlusion. To address human occlusion in multi-view human pose estimation, the paper proposes a single-person 3D human pose estimation method based on multi-feature extraction. A joint-point 
confidence network module and a heatmap weight learning network module are designed to extract, respectively, the confidence of the human joint points in each input view and the weight matrix of the 2D human heatmaps. The human heatmaps from each view are weighted and fused using the weight matrix to obtain a weighted heatmap. The weighted heatmap and the joint-point confidences are then fed into an improved triangulation algorithm to recover the final 3D human pose. This effectively alleviates the occlusion problem in single-person 3D human pose estimation. (2) Building on part (1), this paper proposes a multi-person 3D human pose estimation method based on a multi-channel matching algorithm. The method relies on an efficient cross-view 2D human body matching algorithm and can accurately recover the 3D skeletons of multiple people. It first performs 2D human pose estimation on the input multi-view images, generating 2D human poses for each view. Then, using the designed cross-view multiple matching algorithm, the 2D human poses from each view are clustered. Next, the heatmaps generated by the 2D human pose detector are fed into the heatmap weight learning network module from part (1) to obtain weighted heatmaps. Finally, the clustered 2D human poses and the weighted heatmaps are fed into the triangulation algorithm to recover the 3D skeleton model of each person. Because camera calibration errors and noise are present in the actual workflow, this paper uses least squares estimation to solve for the optimal 3D positions of the human keypoints. Experimental results show that this method effectively alleviates the strong interaction and occlusion problems in multi-person 3D human pose estimation, thereby improving its accuracy and robustness. (3) Building on part (1), this paper investigates the problem 
of redundant view information in multi-view datasets and the resulting problem of view removal (de-view) in multi-view 3D human pose estimation. By introducing three commonly used image similarity algorithms, namely mean hash, difference hash, and perceptual hash, into the 3D human pose estimation task, a de-view experiment was conducted on the public Occlusion Person dataset, reducing its number of views from eight to four. In addition, the heatmap weight learning network module designed in part (1) is used to extract the appearance and geometric features of the heatmaps, so that the quality of the heatmap from each view can be evaluated, with the weight matrix reflecting that quality. Views corresponding to poor-quality heatmaps are removed from the subsequent human pose estimation steps, thereby reducing the number of input views in the system. This greatly improves model inference and training speed while maintaining the accuracy of multi-view 3D human pose estimation. The multi-view 3D human pose estimation method proposed in this paper makes full use of the views’ feature information and heatmap information, and achieves good results in single-person 3D pose estimation, multi-person 3D pose estimation, and 3D human pose dataset de-view tasks. |
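
To illustrate the idea of feeding per-view joint confidences into a least squares triangulation, the following is a minimal sketch and not the improved triangulation algorithm of this paper. It assumes known 3x4 projection matrices, one 2D observation of a single joint per view, and a scalar confidence per view that weights that view's squared residuals. The names `triangulate_weighted` and `solve3` are hypothetical helpers introduced only for this example.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]  # 3x4 camera projection matrix


def solve3(m: List[List[float]], v: List[float]) -> List[float]:
    """Solve a 3x3 linear system m x = v by Gaussian elimination with pivoting."""
    a = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate camera configuration")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / a[i][i]
    return x


def triangulate_weighted(
    projections: Sequence[Matrix],
    points_2d: Sequence[Tuple[float, float]],
    confidences: Sequence[float],
) -> List[float]:
    """Confidence-weighted linear least squares triangulation of one joint.

    Each view contributes two linear equations in the unknown 3D point:
        (u * P[2] - P[0]) . [X, 1] = 0  and  (v * P[2] - P[1]) . [X, 1] = 0.
    The weighted normal equations (A^T W A) X = A^T W b are accumulated and solved.
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for P, (u, v), w in zip(projections, points_2d, confidences):
        for coord, row in ((u, P[0]), (v, P[1])):
            a = [coord * P[2][k] - row[k] for k in range(3)]
            b = row[3] - coord * P[2][3]
            for i in range(3):
                atb[i] += w * a[i] * b
                for j in range(3):
                    ata[i][j] += w * a[i] * a[j]
    return solve3(ata, atb)


# Two views with clean observations of the point (0.5, 0.2, 2.0), plus a third
# view whose observation is corrupted (e.g. by occlusion) and given zero confidence.
cams = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, 0], [0, 1, 0, -1], [0, 0, 1, 0]],
]
obs = [(0.25, 0.1), (-0.25, 0.1), (0.9, 0.9)]
joint = triangulate_weighted(cams, obs, [1.0, 0.5, 0.0])
```

In this sketch a view with low confidence contributes little to the normal equations, so an occluded observation has limited influence on the recovered joint position. How the confidences and heatmap weights are actually learned and combined is specific to the network modules described above and is not modelled here.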