
Research on a Collaborative Robot Behavior Framework Based on the Operation Question-Answering Task

Posted on: 2023-07-30    Degree: Master    Type: Thesis
Country: China    Candidate: Y F Wang    Full Text: PDF
GTID: 2568306758966689    Subject: Electronic information
Abstract/Summary:
With the continuous progress of artificial intelligence technology, intelligent robots have become a necessity of the times, and it is essential for them to perceive, decide, and act like humans in complex, dynamic environments. Following the design requirements of the operation question-answering (MQA) task and relying on a corresponding multimodal task dataset, this thesis uses artificial intelligence algorithms to construct a robot perception and behavior framework that integrates vision and hearing. The practicality of the resulting collaborative robot system is verified in both a simulated environment and the physical world. The innovative work of this thesis is as follows:

(1) Addressing the characteristics of each device in the robot system, this thesis proposes a modular manipulator operation platform. Visual and auditory sensors collect sensory information from the environment, and the framework builds communication nodes on the ROS distributed operating system to connect the devices and to plan the motion control of the manipulator's end effector (a minimal node sketch is given after the abstract). Corresponding operating platforms are built in both the simulation environment and the physical world.

(2) This thesis proposes a transferable reinforcement-learning operation question-answering algorithm. For feature fusion between modalities, a bilinear pooling fusion algorithm captures the fine-grained mapping between text features and image features and completes their fusion (see the fusion sketch below). A domain randomization algorithm varies the environmental characteristics during training (also sketched below), and a recurrent adversarial neural network completes the migration from simulated images to real images. The effectiveness of the whole system is verified in both the simulation environment and the physical environment.

(3) This thesis presents a new audio-visual manipulation task and constructs a multimodal dataset according to the requirements of the operation task. Visual and audio information are used together to interpret the referring expressions in the operating instructions. For instructions with specific targets, the system combines image information with deep learning algorithms to locate the specified target region. For visually indistinguishable objects, the system adds auditory sensing: different shaking motions of the manipulator produce collision sounds from the objects, and a corresponding classification algorithm completes the audio recognition (a classification sketch is given below), verified on the physical robot. The experimental results show that multimodal data significantly improve the efficiency of robot operation.
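The sketches below are illustrative only and are not the thesis's code. First, a minimal ROS communication node of the kind innovation (1) describes: it subscribes to a camera stream and republishes a detection result for the arm-control node. The topic names, message types, and the placeholder detection are assumptions.

#!/usr/bin/env python
# Minimal ROS node: subscribes to a camera topic and republishes a target
# region for the arm controller. Topics and messages are illustrative.
import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import String

class PerceptionNode:
    def __init__(self):
        rospy.init_node("perception_node")
        # Hypothetical topics linking the camera driver to the arm controller.
        self.pub = rospy.Publisher("/target_region", String, queue_size=1)
        rospy.Subscriber("/camera/image_raw", Image, self.on_image)

    def on_image(self, msg):
        # A real system would run a detector here; we forward a placeholder.
        self.pub.publish(String(data="region: x=0 y=0 w=0 h=0"))

if __name__ == "__main__":
    PerceptionNode()
    rospy.spin()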
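For the bilinear pooling fusion in innovation (2), the following is a minimal low-rank sketch in the spirit of MLB/MFB-style pooling; the thesis's exact variant is not specified here, and all dimensions and the rank are assumed values.

# Low-rank bilinear pooling sketch: project both modalities into a shared
# space, then take an element-wise product that approximates the full
# bilinear interaction between text and image feature vectors.
import torch
import torch.nn as nn

class BilinearFusion(nn.Module):
    def __init__(self, text_dim=300, image_dim=2048, rank=1024, out_dim=512):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, rank)
        self.image_proj = nn.Linear(image_dim, rank)
        self.out = nn.Linear(rank, out_dim)

    def forward(self, text_feat, image_feat):
        # Hadamard product in the shared low-rank space captures the
        # fine-grained mapping between the two features.
        joint = self.text_proj(text_feat) * self.image_proj(image_feat)
        return self.out(torch.tanh(joint))

# Usage: fuse a batch of question embeddings with CNN image features.
fusion = BilinearFusion()
z = fusion(torch.randn(8, 300), torch.randn(8, 2048))  # -> shape (8, 512)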
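The domain randomization step of innovation (2) can be pictured as resampling scene parameters before every training episode. The parameter names and ranges below are hypothetical, not taken from the thesis.

# Domain randomization sketch: sample a fresh set of scene parameters per
# episode so the policy trained in simulation does not overfit one rendering.
import random
from dataclasses import dataclass

@dataclass
class SceneParams:
    light_intensity: float
    table_texture: str
    object_rgb: tuple

def sample_scene_params():
    # Uniformly vary lighting, textures, and object color each episode.
    return SceneParams(
        light_intensity=random.uniform(0.3, 1.5),
        table_texture=random.choice(["wood", "metal", "cloth"]),
        object_rgb=tuple(random.random() for _ in range(3)),
    )

print(sample_scene_params())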
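Finally, the audio recognition in innovation (3) can be sketched as a standard feature-plus-classifier pipeline over collision recordings made while the manipulator shakes an object. The file names, label set, and choice of MFCC features with an SVM are assumptions; the thesis's actual classifier is not specified here.

# Audio classification sketch: summarize each collision recording with mean
# MFCC coefficients, then fit a simple classifier over the trials.
import librosa
import numpy as np
from sklearn.svm import SVC

def mfcc_features(path, sr=16000, n_mfcc=20):
    # Load one shaking-trial recording and average its MFCCs over time.
    audio, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Hypothetical training set: one clip and content label per shaking trial.
paths = ["shake_001.wav", "shake_002.wav"]
labels = ["screws", "rice"]
X = np.stack([mfcc_features(p) for p in paths])
clf = SVC().fit(X, labels)
print(clf.predict(X))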
Keywords/Search Tags: Collaborative Robot, Robot Perception, MQA, Multimodal Fusion, Referring Expression