Object Affordance Learning Based On Relation Perception

Posted on:2021-01-16

Degree:Master

Type:Thesis

Country:China

Candidate:X Zhao

Full Text:PDF

GTID:2428330602494381

Subject:Control Science and Engineering

Abstract/Summary:

Affordance refers to the "action possibilities" of a target object, given its capabilities and its external environment. Because these action possibilities are closely tied to the potential interactions between the environment and the agent, research on object affordance is of vital significance in fields such as scene understanding and action recognition. This dissertation addresses two aspects of affordance learning: object affordance segmentation and affordance reasoning. The main contributions are as follows.

First, a deep convolutional neural network for regional relationship perception is designed. By perceiving the relations between different regions within the target object, the network segments the image according to regional function. In addition, we propose combining coordinate convolution with atrous spatial pyramid pooling (ASPP) to refine the extracted features. Unlike existing methods based on object detection, our network directly generates pixel-level affordance segmentation maps for the input in an end-to-end fashion. Test results on the public IIT-AFF and UMD datasets show the superiority of our network.

Second, we propose a visual affordance inference method based on a spatio-temporal two-stream network, which combines the object's own attributes with the agent's operation intention to recognize the affordance of each object in the image. Specifically, the spatial network extracts the features of the target object; the temporal network is fed with frame differences to obtain motion cues that locate the specific operation area; and the two branches jointly determine the affordance category of the object, which addresses the multi-class problem. In addition, we visualize the features of each affordance with Grad-CAM, which assists in evaluating the activated operation area. The model needs only action category labels rather than costly segmentation labels during training, which greatly improves the practicality of the method. Experimental results on the OPRA dataset also demonstrate the effectiveness of the network.

To conclude, this dissertation implements visual affordance learning based on relationship perception. On the one hand, the relationships between regions within the object are used to achieve object affordance segmentation. On the other hand, affordance inference is performed based on the interaction between the operator and the target object. The proposed methods can effectively overcome the false detections, missed detections and incomplete segmentation caused by over-reliance on object detection in existing methods. They have considerable application potential in human-robot interaction and autonomous robots.
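The frame-difference input to the temporal network mentioned above can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested Python lists and an absolute per-pixel difference between consecutive frames; the abstract does not specify the exact differencing scheme, so the function name `frame_differences` and the toy data are hypothetical and not taken from the thesis.

```python
from typing import List

# Assumption: a grayscale frame is a list of rows of pixel intensities (0-255).
Frame = List[List[int]]


def frame_differences(frames: List[Frame]) -> List[Frame]:
    """Return absolute per-pixel differences between consecutive frames.

    For N input frames this yields N - 1 difference maps. Non-zero values
    mark pixels that changed, which serve as a rough motion cue.
    """
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        diff = [
            [abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)
        ]
        diffs.append(diff)
    return diffs


# Toy 2x3 clip (hypothetical data): one pixel brightens between frames 0 and 1,
# then the scene stays still between frames 1 and 2.
clip = [
    [[10, 10, 10], [10, 10, 10]],
    [[10, 90, 10], [10, 10, 10]],
    [[10, 90, 10], [10, 10, 10]],
]
diffs = frame_differences(clip)
```

In this toy clip, the first difference map is non-zero only at the pixel that changed, and the second is all zeros. A temporal branch would take such maps as input in place of raw frames, so static background contributes little and the region being manipulated stands out.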

Keywords/Search Tags:

Affordance Learning, Deep Learning, Regional Relation Perception, Video Frame Difference, Two-stream Network

Related items

1	Research On Video Frame Interpolation Based On Deep Learning
2	Research On Imitation Learning Of Robot Manipulation Tasks Based On Video Semantic Information
3	Video Target Detection Based On Deep Learning And Their System Implementation
4	The Research And Application Of Hierarchical Reinforcement Learning And Affordance Model
5	Research On Video Codec Technology Based On Deep Learning
6	Research Of Video Super-resolution Method Based On Deep Learning
7	Deep Learning Based Video Frame Interpolation Method
8	Regional Management Intelligent Video Analysis System Based On Deep Learning
9	Research On Deep Learning Based Video Frame Interpolation Algorithm
10	Deep Learning Based Key Frame Detection For Sport Video