
Research On Hand Pose Estimation Based On High-resolution Preserving And Graph U-net

Posted on: 2022-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Xiong
Full Text: PDF
GTID: 2518306731465734
Subject: Computer Science and Technology
Abstract/Summary:
The hand is one of the most frequently used limbs in daily life and work. Compared with the body as a whole, it is flexible, convenient and efficient, and it plays an important role in human-computer interaction (HCI). Research on hand pose falls into two categories: gesture recognition and hand pose estimation. Gesture recognition extracts high-level abstract information about the hand in order to classify gestures. Hand pose estimation computes the positions of the hand joints, as 2D or 3D coordinates. Compared with gesture recognition, hand pose estimation provides richer and more precise information and can give more accurate instructions to smart devices, so it has a wide range of applications in contactless human-computer interaction and is a significant research topic.

The input to hand pose estimation is usually either a depth image or a color image. Methods based on depth images have achieved good accuracy, owing to the 3D information that depth images carry, the ease of acquiring datasets, and continuing innovation in deep learning algorithms. However, depth images require a depth camera, which limits their use in practical scenarios, so hand pose estimation from color images is more practical. Color images bring their own difficulties: hands exhibit self-similarity and self-occlusion, and a color image carries no depth information. 3D hand pose estimation from color images is therefore a major challenge.

To address some of these difficulties, this thesis works on two fronts, enhancing the 2D feature representation and enhancing the 3D pose derivation, and focuses on two deep learning algorithms: the convolutional neural network (CNN) and the graph convolutional neural network (GCN). Two methods are proposed: a hand pose estimation method based on high-resolution preserving and a perspective-invariant method, and a hand pose estimation method based on high-resolution preserving and an adaptive graph U-Net. The proposed approach has the following improvements:

(1) A High-Resolution Network (HRNet) is used for 2D hand pose estimation. Because HRNet maintains high-resolution representations throughout prediction, it retains as much detail as possible, and its multi-scale fusion enriches the feature representation, so the high-resolution output benefits the subsequent 2D heatmap detection of keypoints.

(2) The 2D keypoint coordinates are computed from the 2D heatmaps using the Integral Pose Regression method. This method combines the advantages of heatmap-based and regression-based approaches, reduces the quantization error introduced when extracting 2D joint coordinates from 2D heatmaps, and allows the whole network to be trained end-to-end.

(3) An adaptive graph U-Net is used to derive the 3D hand pose from the 2D hand pose. A graph convolutional neural network can naturally model the kinematic constraints between joints, which makes it well suited to lifting hand joints to 3D. The network learns the connections between joints adaptively, while the U-Net structure integrates global and local features to enhance the feature representation.

(4) Intermediate-layer features from the 2D hand pose estimation stage are fused with the 2D joint coordinates obtained by integral pose regression, and the fused result is used as the input of the graph convolutional neural network that derives the 3D pose. Adding the intermediate-layer features provides more context for each keypoint and reduces the ambiguity of lifting a 2D hand pose to a 3D hand pose.

Experiments are carried out on three public datasets, STB, RHD and Dexter+Object, and the effectiveness of the method for hand pose estimation is verified through both quantitative metrics and visual results.
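As an illustration of item (2), the following minimal sketch shows the general idea behind integral (soft-argmax) regression: the heatmap is normalized into a probability distribution and the joint coordinate is taken as its expected value, which can fall between pixels. This is not the thesis's implementation; the softmax normalization, the function names and the toy heatmap are all assumptions made for the example.

```python
import math

def soft_argmax_2d(heatmap):
    """Return the expected (x, y) location of a 2D heatmap.

    heatmap: list of rows, each a list of raw scores (floats).
    Assumption: scores are normalized with a softmax over all pixels.
    """
    # Subtract the max score before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    weights = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)

    # Expected coordinate = sum over pixels of probability * pixel index.
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y

def hard_argmax_2d(heatmap):
    """Return the integer (x, y) location of the highest score."""
    best = max(
        ((v, col, r) for r, row in enumerate(heatmap) for col, v in enumerate(row)),
        key=lambda t: t[0],
    )
    return best[1], best[2]

# Toy 3x4 heatmap (hypothetical values) whose mass sits between
# columns 1 and 2 on row 1.
heatmap = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 5.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
x_soft, y_soft = soft_argmax_2d(heatmap)  # sub-pixel estimate
x_hard, y_hard = hard_argmax_2d(heatmap)  # snapped to a pixel
```

On this toy input the hard argmax must pick one of the two equally strong pixels, while the expectation lands halfway between them. Because the expectation is a differentiable function of the heatmap, a loss on the coordinates can in principle be backpropagated into the heatmap network, which is what makes end-to-end training possible.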
Keywords/Search Tags: High resolution representation, Convolutional neural network, Graph convolutional neural network, Feature fusion, Hand pose estimation
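The graph convolution named in the abstract and keywords can be sketched in the same spirit. The example below shows one generic graph convolution layer in which each joint's features are averaged with those of its neighbours and then linearly transformed. It is only a sketch under stated assumptions: the three-joint skeleton, the row normalization, the weight values and the ReLU are hypothetical choices, the adjacency is fixed here whereas an adaptive layer would learn it, and the U-Net pooling and unpooling stages are omitted.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def row_normalize(adj):
    """Scale each row to sum to 1 so a joint averages over its neighbours."""
    return [[v / sum(row) for v in row] for row in adj]

def graph_conv(features, adjacency, weight):
    """One layer: ReLU(normalized_adjacency @ features @ weight)."""
    mixed = matmul(row_normalize(adjacency), features)
    out = matmul(mixed, weight)
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical 3-joint chain (wrist, knuckle, fingertip) with self-loops.
# In an adaptive layer these entries would be trainable parameters,
# not fixed constants as they are in this sketch.
adjacency = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
# Per-joint input features: 2D coordinates (x, y).
joints_2d = [
    [0.0, 0.0],
    [1.0, 2.0],
    [2.0, 4.0],
]
# Hypothetical 2 -> 3 weight matrix mapping (x, y) to three output channels.
weight = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
]
lifted = graph_conv(joints_2d, adjacency, weight)  # 3 joints x 3 channels
```

Because every joint's output mixes in its neighbours' features, the adjacency matrix is where the relationships between joints enter the computation; making its entries learnable is one way a network can adapt those relationships to the data.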