
Research On Robot Interactive Control Method Based On Visual Fingertip Detection In Complex Background

Posted on: 2024-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: J X Feng
GTID: 2568307100960549
Subject: Electronic information
Abstract/Summary:
With the development of human-computer interaction technology, intelligent robots have begun to replace humans in complex and dangerous tasks so that staff can handle various problems safely and effectively. In human-computer interaction, fingertip-based gesture interaction is widely used in virtual reality, augmented reality and other fields because of its high degree of freedom and convenience. In many applications, however, fingertip detection algorithms face complex and changeable backgrounds, dramatic changes in illumination, and self-occlusion of gestures, all of which degrade detection performance. Designing a fingertip detection algorithm that is both accurate and robust therefore remains a challenging task. To address fingertip detection in human-computer interaction, this thesis proposes two fingertip detection methods for complex backgrounds, one based on visual gesture features and one based on gesture segmentation. The two-dimensional image coordinates are converted into three-dimensional spatial coordinates using the principle of binocular stereo vision, and the three-dimensional fingertip position is used to realize visual gesture traction control of a manipulator. The specific research contents are as follows:

(1) To improve the accuracy of fingertip detection in complex backgrounds, this thesis proposes a fingertip detection algorithm based on hand contour component voting. Firstly, the holistically-nested edge detection (HED) algorithm is used to detect the edges of hand and non-hand images, and HOG features of the contour edges are extracted to build positive and negative training sets, which reduces the influence of illumination, color and texture on fingertip detection. Secondly, the maximum-discrimination features are selected from the positive sample set by a custom maximum-discrimination feature filter and stored as a feature dictionary. The selected maximum-discrimination features and the negative sample set are then used to train an XGBoost classifier, which serves as the voting classifier. Then, the KNN algorithm matches the classified maximum-discrimination features against the feature dictionary, and the final fingertip position is obtained by the Meanshift algorithm. Finally, the proposed algorithm is tested. The results show that its fingertip detection accuracy can reach 99%, with the prediction RMSE kept within 5 pixels. It is more accurate than fingertip detection algorithms based on skin color segmentation, YOLO object detection and YOLO-YCbCr.

(2) To improve the accuracy of fingertip information acquisition during gesture interaction and to reduce the influence of complex backgrounds, illumination changes and skin-like backgrounds on fingertip detection, this thesis proposes a fingertip detection algorithm based on semantic segmentation with a semi-supervised generative adversarial network. Firstly, a hand image semantic segmentation dataset is built by collecting two-view gesture images with a binocular camera. Secondly, the dataset is used to train the generative adversarial network. Through adversarial learning between the generator and the discriminator, the generative adversarial network can effectively overcome the tendency of traditional semantic segmentation to ignore the correlation between pixels. During training, a semi-supervised method is used to ease the problem of acquiring semantic segmentation labels. Then, the trained semantic segmentation model is used to segment the gesture image, and the fingertip position coordinates are detected in the segmented image by the maximum centroid (center of gravity) distance method. Finally, comparative experiments verify the improvement that the generative adversarial network brings to the semantic segmentation network, and comparative experiments between fully supervised and semi-supervised training show that semi-supervised learning can reduce the cost of producing the training set and improve the accuracy of semantic segmentation.

(3) To obtain the coordinates of fingertips in three-dimensional space and to facilitate human-computer interaction through fingertips, this thesis proposes a three-dimensional fingertip reconstruction method based on binocular vision. The method mainly comprises four steps: binocular calibration, binocular rectification, stereo matching and 3D reconstruction. Firstly, the intrinsic and extrinsic parameters of the binocular camera are obtained by binocular calibration, and these parameters are used to rectify the image pair. As for stereo matching, the algorithm in this thesis already obtains the fingertip coordinates in both views through fingertip detection, so the fingertip points in the two views are matched directly and no separate stereo matching step is needed. Then, the two-dimensional fingertip coordinates are reconstructed using the principle of binocular stereo vision to obtain the spatial coordinates of the fingertip points. Finally, the proposed three-dimensional reconstruction method is verified. The experimental results show that the relative error of the three-dimensional reconstruction is within 1.5%, which lays a foundation for subsequent fingertip interaction applications.

(4) To verify the feasibility of the proposed fingertip detection algorithm for manipulator control, this thesis designs a vision-based gesture traction control method. Firstly, the proposed fingertip detection algorithm is used to detect the fingertip in the two-view images. Then, the three-dimensional fingertip reconstruction algorithm designed in this thesis converts the two-dimensional fingertip coordinates into spatial fingertip coordinates. Finally, a comparison of the non-singular terminal sliding mode control method with the PD-based computed torque control method shows that the non-singular terminal sliding mode control strategy responds faster and rejects disturbances better, and achieves visual gesture traction control well.
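As an illustration of the fingertip localisation step in item (2), the following is a minimal sketch of one plausible reading of the maximum center-of-gravity (centroid) distance method: given a binary hand mask from the segmentation network, take the hand pixel farthest from the mask centroid as the fingertip. The function names and the toy mask are hypothetical and not taken from the thesis, and the sketch assumes the forearm has already been excluded from the mask, since otherwise the wrist could be the farthest point.

```python
from math import hypot
from typing import List, Tuple

Mask = List[List[int]]  # rows of 0/1 values; 1 marks a hand pixel


def centroid(mask: Mask) -> Tuple[float, float]:
    """Return the (x, y) center of gravity of all hand pixels."""
    count = sum_x = sum_y = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        raise ValueError("mask contains no hand pixels")
    return sum_x / count, sum_y / count


def farthest_from_centroid(mask: Mask) -> Tuple[int, int]:
    """Return the (x, y) hand pixel with the largest distance to the centroid.

    The farthest pixel of a region always lies on its boundary, so scanning
    every hand pixel gives the same answer as scanning only the contour.
    """
    cx, cy = centroid(mask)
    best_point = (-1, -1)
    best_distance = -1.0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                distance = hypot(x - cx, y - cy)
                if distance > best_distance:
                    best_distance = distance
                    best_point = (x, y)
    return best_point


# Toy mask (assumed data): a 3x3 "palm" with one extended "finger" pointing up.
toy_mask = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
fingertip = farthest_from_centroid(toy_mask)  # tip of the extended finger
```

In a real pipeline the scan would usually run over the extracted contour of the segmented hand rather than over every pixel, which is cheaper but yields the same point.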
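The reconstruction step in item (3) can be sketched for the simplest case, a calibrated and rectified camera pair, where the fingertip detected in the left and right views already forms a matched point pair. The sketch below uses the standard rectified-stereo relations (depth equals focal length times baseline divided by disparity); the class name, function name and numeric parameters are illustrative assumptions, not values or code from the thesis.

```python
from typing import NamedTuple, Tuple


class RectifiedStereo(NamedTuple):
    """Parameters of an already calibrated and rectified pair (assumed values)."""
    focal_px: float    # focal length in pixels, shared by both rectified views
    baseline_m: float  # distance between the two optical centers, in meters
    cx: float          # principal point x coordinate, in pixels
    cy: float          # principal point y coordinate, in pixels


def triangulate_fingertip(
    rig: RectifiedStereo,
    left_px: Tuple[float, float],
    right_px: Tuple[float, float],
) -> Tuple[float, float, float]:
    """Convert a matched fingertip pair into (X, Y, Z) in the left camera frame.

    Because the fingertip is detected independently in each view, the two
    detections act as the correspondence and no dense stereo matching is run.
    """
    x_left, y_left = left_px
    x_right, y_right = right_px
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    depth = rig.focal_px * rig.baseline_m / disparity
    # After rectification both detections should share a row; average the two
    # y values to damp small detection noise.
    y_mean = 0.5 * (y_left + y_right)
    x_world = (x_left - rig.cx) * depth / rig.focal_px
    y_world = (y_mean - rig.cy) * depth / rig.focal_px
    return x_world, y_world, depth


# Hypothetical rig: 800 px focal length, 6 cm baseline, 640x480 image center.
rig = RectifiedStereo(focal_px=800.0, baseline_m=0.06, cx=320.0, cy=240.0)
point_3d = triangulate_fingertip(rig, left_px=(400.0, 260.0), right_px=(360.0, 260.0))
```

With these assumed parameters a disparity of 40 pixels corresponds to a depth of about 1.2 m. Because depth is inversely proportional to disparity, a one-pixel fingertip detection error matters more the farther the hand is from the cameras.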
Keywords/Search Tags: Fingertip detection, HOG feature, Generative adversarial networks, Binocular three-dimensional reconstruction, Human-computer interaction