Research On Object Grasp Affordance Prediction And 3D Reconstruction Based On Deep Learning

Posted on: 2024-04-12

Degree: Master

Type: Thesis

Country: China

Candidate: Z S Wu

GTID: 2568307151460634

Subject: Computer Science and Technology

Abstract/Summary:

Learning grasp skills in clutter is an active topic in cooperative-robot research, and scene understanding is a prerequisite for a robot to grasp correctly. Robots need to infer 3D scenes from incomplete perception in order to generate more stable grasps. In real environments, the limited workspace means that the complete geometric structure of an object cannot be observed; it must be inferred from limited information and partial viewpoints. 3D reconstruction is one way for a robot to understand its work scene. Reconstruction supervision helps the model capture geometric features from perception and obtain richer feature representations. This thesis therefore uses multi-task learning to jointly train grasp affordance prediction and 3D reconstruction. The specific research contents are as follows.

First, to address the limited geometric information that can be captured from restricted viewpoints and the low generalization of existing models, a grasp affordance prediction and 3D reconstruction (3D-GAPR) algorithm based on U-MF and implicit neural representation (UMF-INR) is proposed. Hard parameter sharing is adopted to train grasping and reconstruction simultaneously, which improves the model's generalization ability. A U-MF module is designed to extract shared features, enable interaction between local and global features, and enhance the expressive power of the shared geometric features. Both tasks are trained differentiably on the basis of implicit neural representation, so that computing resources can be adaptively allocated to the tasks that are more difficult to train.

Second, a 3D-GAPR algorithm based on positional encoding (PE) and a hierarchical U-Swin T module is proposed to address the position sensitivity and instability of grasping and the low sharing efficiency of U-MF. Voxel positions are encoded through a set of learnable parameters, and the position information is embedded into the original data so that the model learns positions that are easy to grasp, which improves grasp robustness. A hierarchical U-Swin T module based on Swin Transformer is designed to integrate global and local information, generate multi-scale feature representations, and improve feature-sharing efficiency.

Finally, the PyBullet simulation environment is used to collect data from Packed and Pile scenarios in a self-supervised manner for model training. Grasps are executed in simulation to verify the effectiveness of introducing UMF-INR, positional encoding, and the hierarchical structure for grasping. For more intuitive observation, the grasp affordance heat maps are visualized, which further verifies the role of positional encoding and reconstruction in grasping. The reconstruction results are also visualized to verify that embedding position information helps depict reconstruction details.
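
The combination of hard parameter sharing, learnable positional encoding, and implicit (point-queried) task heads described in the abstract can be illustrated with a small forward-pass sketch. This is not the thesis implementation: the U-MF and U-Swin T modules are replaced by a single shared linear layer, and every name and size here (`GRID`, `HIDDEN`, `shared_features`, `predict`, `joint_loss`) is a hypothetical choice made for illustration. The abstract does not say how the two losses are balanced, so the sketch assumes a plain unweighted sum.

```python
import math
import random

random.seed(0)

GRID = 4      # hypothetical voxel grid resolution (GRID**3 voxels)
HIDDEN = 8    # hypothetical width of the shared feature vector


def rand_vec(n, scale=0.1):
    return [random.gauss(0.0, scale) for _ in range(n)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one prediction p and one label y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


n_vox = GRID ** 3
params = {
    # One learnable offset per voxel: a stand-in for the positional encoding.
    "pos": rand_vec(n_vox),
    # Shared trunk (hard parameter sharing): both tasks read its output.
    "trunk": [rand_vec(n_vox) for _ in range(HIDDEN)],
    # Task-specific heads: shared feature plus a 3D query point as input.
    "grasp_head": rand_vec(HIDDEN + 3),
    "recon_head": rand_vec(HIDDEN + 3),
}


def shared_features(voxels, params):
    """Add the positional encoding to the voxels, then apply the shared trunk."""
    x = [v + p for v, p in zip(voxels, params["pos"])]
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in params["trunk"]]


def predict(voxels, query, params):
    """Return (grasp affordance, occupancy) at a continuous 3D query point."""
    feat = shared_features(voxels, params) + list(query)
    grasp = sigmoid(sum(w * f for w, f in zip(params["grasp_head"], feat)))
    occ = sigmoid(sum(w * f for w, f in zip(params["recon_head"], feat)))
    return grasp, occ


def joint_loss(voxels, query, grasp_label, occ_label, params):
    """Assumed unweighted sum of the grasp loss and the reconstruction loss."""
    grasp, occ = predict(voxels, query, params)
    return bce(grasp, grasp_label) + bce(occ, occ_label)


# Random values stand in for a voxel grid built from a partial view.
voxels = [random.random() for _ in range(n_vox)]
query = (0.2, -0.1, 0.4)
print(predict(voxels, query, params))
print(joint_loss(voxels, query, 1.0, 0.0, params))
```

Because both heads read the same `shared_features` output, any gradient of `joint_loss` with respect to `params["trunk"]` or `params["pos"]` would combine signals from both tasks, which is the sense in which reconstruction supervision can shape the features used for grasping.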

Keywords/Search Tags:

multi-task learning, hard-parameter sharing, grasp affordance prediction, 3D reconstruction, implicit neural representation

Related items

1	Research On Multi-Task Sentiment Classification Algorithm Based On Hard Sharing Mechanism
2	Research On Generative Grasp Detection Method In Unstructured Environment
3	Deep-learning-Based Robot Grasp Detection Using RGB-D Sensor
4	Research On Implicit Representation And Multi-task Recommendation System Based On Knowledge Graph
5	Research On Sampling Strategies For Implicit Neural Representation In 3D Reconstruction
6	Robust Point Cloud Reconstruction Based On Self-supervised Learning
7	Research On High-precision 3D Reconstruction Method Of Irregular Objects Based On Computer Vision
8	Research On QUIC Traffic Classification Method Based On Multi-Task Deep Learning
9	Research On 3D Reconstruction Method Based On Implicit Representation
10	Human Pose Estimation Method Based On Multi-task Learning