Object Affordance Learning Based On Relation Perception

Posted on: 2021-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhao
Full Text: PDF
GTID: 2428330602494381
Subject: Control Science and Engineering
Abstract/Summary:
Affordance refers to the "action possibilities" of a target object, given its capabilities and its external environment. Because these action possibilities are closely tied to the potential interactions between the environment and the agent, research on object affordance is of vital significance in fields such as scene understanding and action recognition. This dissertation addresses two aspects of affordance learning: object affordance segmentation and affordance reasoning. The main contributions are as follows.

First, a deep convolutional neural network for regional relation perception is designed. By perceiving the relations between different regions within the target object, the network achieves image segmentation based on regional function. In addition, we propose combining coordinate convolution and atrous spatial pyramid pooling (ASPP) to refine the extracted features. Unlike existing methods based on object detection, our network directly generates pixel-level affordance segmentation maps for the input in an end-to-end fashion. Test results on the public datasets IIT-AFF and UMD show the superiority of our network.

Second, we propose a visual affordance inference method based on a spatio-temporal two-stream network, which combines the object's own attributes with the agent's operation intention to recognize the affordance of each object in the image. Specifically, the spatial network extracts the features of the target object; the temporal network is fed with frame differences to obtain motion cues that locate the specific operation area; and the two branches jointly determine the affordance category of the object, which addresses the multi-class problem. In addition, we visualize the features of each affordance with GradCAM, which helps in evaluating the activated operation area. The model needs only action category labels rather than costly segmentation labels during training, which greatly improves the practicality of the method. Experimental results on the OPRA dataset also demonstrate the effectiveness of the network.

To conclude, this dissertation implements visual affordance learning based on relation perception. On the one hand, the relations between regions within the object are used to achieve object affordance segmentation. On the other hand, affordance inference is performed based on the interaction between the operator and the target object. The proposed methods can effectively overcome the misdetection, missed detection, and incomplete segmentation that existing methods suffer from because of their over-reliance on object detection. They have great application potential in human-robot interaction and autonomous robots.
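Two of the input-side operations named in the abstract, frame differences for the temporal stream and the coordinate channels used by coordinate convolution, can be illustrated with a minimal NumPy sketch. This is not the dissertation's implementation: the function names `frame_differences` and `add_coord_channels`, the use of signed (rather than absolute) differences, the `(C, H, W)` layout, and the normalization of coordinates to [-1, 1] are all assumptions made for illustration.

```python
import numpy as np


def frame_differences(frames: np.ndarray) -> np.ndarray:
    """Signed differences between consecutive video frames.

    frames: array of shape (T, H, W) or (T, H, W, C).
    Returns a float32 array of shape (T - 1, ...).
    Assumption: plain signed differences; the thesis may preprocess differently.
    """
    if frames.shape[0] < 2:
        raise ValueError("need at least two frames")
    # Cast first so uint8 inputs do not wrap around on subtraction.
    frames = frames.astype(np.float32)
    return frames[1:] - frames[:-1]


def add_coord_channels(features: np.ndarray) -> np.ndarray:
    """Append y and x coordinate channels to a (C, H, W) feature map.

    Coordinate convolution feeds these extra channels to an ordinary
    convolution so that filters can see where they are in the image.
    Assumption: coordinates are normalized to [-1, 1].
    """
    _, h, w = features.shape
    ys = np.linspace(-1.0, 1.0, h, dtype=np.float32)
    xs = np.linspace(-1.0, 1.0, w, dtype=np.float32)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")  # each of shape (H, W)
    return np.concatenate(
        [features.astype(np.float32), yy[None], xx[None]], axis=0
    )
```

In a two-stream setup of the kind described, the output of `frame_differences` would feed the temporal branch while the original frames feed the spatial branch; `add_coord_channels` would be applied to a feature map just before a convolution layer.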
Keywords/Search Tags: Affordance Learning, Deep Learning, Regional Relation Perception, Video Frame Difference, Two-stream Network