Human-computer interaction (HCI) refers to the ways in which humans and machines communicate through specific channels. Gesture is one of the most natural forms of communication, offering the advantages of being natural, direct, and efficient, and gesture recognition algorithms are the core technology enabling gesture interaction: their goal is to infer the user's intention from captured hand data, which is in great demand in fields such as virtual reality, augmented reality, elderly assistance, and smart homes. With the rapid development of deep learning, the accuracy and stability of gesture recognition algorithms have improved significantly, and researchers now place higher demands on gesture data. However, most traditional gesture datasets consist of images of bare hands, which leads to problems such as poor algorithm robustness, high cost of three-dimensional annotation, and limited model validation in real scenes with dim lighting. There is therefore a growing need for multimodal gesture data generation in complex scenes. This paper studies multimodal gesture data synthesis methods for complex scenes. The main research contents and results are as follows:

(1) A multimodal gesture acquisition platform is built, comprising acquisition hardware and an acquisition program. The platform simultaneously collects RGB images, depth images, and hand IMU data, and uses multi-threading to spread the data-processing load and thereby increase the capture rate. Experimental results show that the collected data achieve high accuracy and good synchronization.

(2) Multi-view synchronous annotation software is developed, which can annotate four images of the same group simultaneously and display the annotation results in real time. Compared with traditional annotation tools, the software better handles the hand self-occlusions that make annotation difficult, and offers intuitive annotation results, convenient adjustment of annotation positions, and high annotation efficiency. Experiments show that the software reduces the errors introduced by the acquisition platform and the labeling process. In addition, it uses an existing 2D key-point pose estimation network to generate 3D key-point data automatically in batches, greatly reducing the labor and time cost of annotation.

(3) Based on an unsupervised generative adversarial network, an image selection module is proposed that adjusts the hand image according to the image background in the generated synthetic dataset, using unpaired data. The module effectively mitigates the brightness mismatch between the hand and the background as well as uneven hand edges, making the synthetic images closer to real images. Experimental results show that this method fuses images better than other mainstream image fusion algorithms.

In summary, for the task of multimodal gesture data generation in complex scenes, this paper builds a multimodal data acquisition platform, develops multi-view synchronous annotation software, and proposes an unsupervised image fusion method; all three are verified by experiments with good results.
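The abstract does not give implementation details for the multi-threaded acquisition program; the pattern it describes (each sensor captured on its own thread, with timestamped frames handed to queues so heavy processing does not block capture) can be sketched as follows, where the sensor loops are simulated stand-ins for the real RGB, depth, and IMU drivers:

```python
import queue
import threading
import time

def sensor_worker(name, period_s, out_q, n_frames):
    """Simulated capture loop for one modality: timestamp each frame
    at capture time and hand it off to a queue, so downstream
    processing runs on other threads and does not slow capture."""
    for i in range(n_frames):
        out_q.put((time.monotonic(), name, i))
        time.sleep(period_s)

def collect(queues, timeout_s=2.0):
    """Take one frame from each sensor queue to form a grouped sample;
    a real system would additionally match frames by nearest
    timestamp to guarantee synchronization."""
    return [q.get(timeout=timeout_s) for q in queues]

# One queue and one capture thread per modality (names are illustrative).
qs = [queue.Queue() for _ in range(3)]
threads = [
    threading.Thread(target=sensor_worker, args=(n, 0.01, q, 5))
    for n, q in zip(["rgb", "depth", "imu"], qs)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

group = collect(qs)  # one timestamped frame from each modality
```

The design choice here is simply to decouple capture from processing: each thread only timestamps and enqueues, which is what lets the platform keep the capture rate high while other threads absorb the processing load.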
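The abstract states that 3D key points are generated automatically from 2D annotations across multiple synchronized views, but does not specify the geometry step. A standard way to obtain a 3D point from its 2D projections in several calibrated cameras is the Direct Linear Transform (DLT); the sketch below assumes known 3x4 projection matrices, which may differ from the thesis's actual pipeline:

```python
import numpy as np

def triangulate_point(proj_mats, points_2d):
    """Triangulate one 3-D point from its 2-D observations in several
    calibrated views via the Direct Linear Transform (DLT).

    proj_mats : list of 3x4 camera projection matrices
    points_2d : list of (u, v) pixel coordinates, one per view
    """
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each view contributes two linear constraints on the
        # homogeneous point X: u*(P[2]@X) = P[0]@X and v*(P[2]@X) = P[1]@X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector associated with the
    # smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

With four views per group, as the software collects, the same routine simply stacks eight constraint rows instead of four, which also makes the estimate more robust to self-occlusion in any single view.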
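The thesis's brightness adjustment is learned with an unsupervised GAN on unpaired data; the sketch below is not that method, but a crude classical baseline for the same problem (hand/background brightness mismatch in a composite), useful to see what the learned module improves upon. The function name and the simple mean-gain heuristic are both illustrative assumptions:

```python
import numpy as np

def composite_with_brightness_match(hand, mask, background):
    """Paste a hand crop onto a background, rescaling the hand's mean
    brightness toward the background's so the composite looks less
    'pasted on'. A hand-crafted stand-in for the learned, GAN-based
    adjustment described in the abstract.

    hand, background : HxWx3 uint8 images of the same size
    mask             : HxW boolean array, True where the hand is
    """
    hand_f = hand.astype(np.float64)
    bg_f = background.astype(np.float64)
    # Global gain matching the hand's mean brightness to the background's.
    gain = bg_f.mean() / max(hand_f[mask].mean(), 1e-6)
    adjusted = np.clip(hand_f * gain, 0, 255)
    out = bg_f.copy()
    out[mask] = adjusted[mask]
    return out.astype(np.uint8)
```

A global gain cannot fix uneven hand edges or spatially varying lighting, which is precisely where an unsupervised image-to-image network trained on unpaired real scenes has room to do better.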