
A Research On Multiple Context Based Scene Graph Generation

Posted on: 2021-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Chen
Full Text: PDF
GTID: 2428330647451039
Subject: Computer Science and Technology
Abstract/Summary:
In the Scene Graph Generation task, we try to understand the interactions of the different objects in an image as a whole, i.e. by generating a scene graph. A scene graph takes objects as graph vertices and the relations between pairs of objects as edges, giving a structured representation of an image; it is made up of relationship triplets such as ⟨person, ride, horse⟩. A scene graph carries object information together with a detailed description of an image. It therefore contains much more semantic information than the objects produced by the object detection task, but lower-level information than the abstract description of an image produced by image captioning. As a mid-level semantic representation, a scene graph is often applied to other computer vision tasks, such as object detection, image captioning, image retrieval, text-to-image generation, and image paragraph generation. Currently, using deep learning to generate scene graphs is a common approach. Two sub-problems need to be solved: object detection, and relation classification between pairs of objects. Some existing works can only identify a few types of relationships, and others model the context between different relationship triplets while ignoring the association between the predicate features of an object pair.

In the third chapter of this thesis, we propose a two-stage model, the predicate feature association network, which utilizes multiple contexts. In the first stage, a common object detector is adopted to obtain object proposals, and we then extract object-level and scene-level contexts to improve object classification performance. In the second stage, we first use multi-modal feature alignment to obtain the alignment context between an image region and a relation predicate. The alignment context and the object-level context are then combined and fed into a recurrent neural network to obtain predicate feature association information. Finally, an attention mechanism computes a weighted sum of the predicate feature association information for predicate classification. Experiments are conducted on the public Visual Genome dataset, and recall is computed on the top K (K = 20, 50, 100) predicted relationships with the highest scores. The experimental results show that the proposed method improves performance.

On the basis of the predicate feature association network, two further problems in Scene Graph Generation are studied. First, in the step that obtains the object-level context, we study methods for fusing multiple features, including visual, category, and spatial features. Specifically, we use a difference-computation-based linear fusion technique and an improved Dense Multi-modal Fusion (DMF), which considers the fusion of multi-modal features and performs multi-level fusion. The second study addresses the problem that the number of candidate object pairs increases quadratically with the number of objects in an image. Based on the idea of multi-level feature fusion, a relationship pair filtering network is therefore proposed. Because candidate object pairs are selected effectively, our model uses computational resources better in the test phase, and the number of useless object pairs is greatly reduced.
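The abstract describes an attention mechanism that takes a weighted sum of the predicate feature association information (the recurrent network's outputs) before predicate classification. The following is a minimal sketch of that kind of attention pooling, not the thesis's actual code; the module name, dimensions, and the simple linear scoring function are illustrative assumptions.

```python
# Illustrative sketch: attention-weighted pooling over predicate feature
# association states, followed by predicate classification.
# hidden_dim / num_predicates and the scoring layer are assumed, not from the thesis.
import torch
import torch.nn as nn


class PredicateAttentionPooling(nn.Module):
    def __init__(self, hidden_dim: int, num_predicates: int):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)            # scalar attention score per step
        self.classifier = nn.Linear(hidden_dim, num_predicates)

    def forward(self, assoc_states: torch.Tensor) -> torch.Tensor:
        # assoc_states: (num_pairs, seq_len, hidden_dim) RNN outputs over the
        # predicate feature association sequence for each candidate object pair.
        weights = torch.softmax(self.score(assoc_states), dim=1)   # (num_pairs, seq_len, 1)
        pooled = (weights * assoc_states).sum(dim=1)               # weighted sum -> (num_pairs, hidden_dim)
        return self.classifier(pooled)                             # predicate logits


# Example: 8 candidate pairs, sequence length 5, 512-d features, 50 predicate classes
logits = PredicateAttentionPooling(512, 50)(torch.randn(8, 5, 512))
print(logits.shape)  # torch.Size([8, 50])
```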
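The evaluation described above is the standard Recall@K protocol: the fraction of ground-truth relationship triplets recovered among the top-K highest-scoring predictions (K = 20, 50, 100). Below is a minimal sketch of that metric under the usual definition; the data types and function name are illustrative.

```python
# Illustrative sketch of Recall@K for scene graph triplets.
from typing import List, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)


def recall_at_k(predictions: List[Tuple[Triplet, float]],
                ground_truth: Set[Triplet],
                k: int) -> float:
    # Sort predicted triplets by score (descending) and keep the top K.
    top_k = {t for t, _ in sorted(predictions, key=lambda p: p[1], reverse=True)[:k]}
    # Fraction of ground-truth triplets that appear among the top-K predictions.
    return len(top_k & ground_truth) / max(len(ground_truth), 1)


# Example usage with a single ground-truth triplet.
preds = [(("person", "ride", "horse"), 0.9), (("person", "near", "horse"), 0.4)]
print(recall_at_k(preds, {("person", "ride", "horse")}, k=20))  # 1.0
```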
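The second study is motivated by the quadratic growth of candidate object pairs: n detected objects yield n(n-1) ordered (subject, object) pairs. The sketch below only illustrates that growth and the idea of keeping the top-scoring pairs before relation classification; the placeholder scoring function stands in for the thesis's relationship pair filtering network and is purely hypothetical.

```python
# Illustrative sketch: quadratic growth of candidate pairs and top-K filtering.
from itertools import permutations
from typing import Callable, List, Tuple


def candidate_pairs(object_ids: List[int]) -> List[Tuple[int, int]]:
    # Every ordered (subject, object) pair: n objects -> n * (n - 1) candidates.
    return list(permutations(object_ids, 2))


def filter_pairs(pairs: List[Tuple[int, int]],
                 score_fn: Callable[[Tuple[int, int]], float],
                 keep: int) -> List[Tuple[int, int]]:
    # Keep only the top-`keep` pairs by a relatedness score, so the relation
    # classifier sees far fewer useless pairs at test time.
    return sorted(pairs, key=score_fn, reverse=True)[:keep]


objs = list(range(30))                  # 30 detected objects
pairs = candidate_pairs(objs)
print(len(pairs))                       # 870 = 30 * 29 candidate pairs
kept = filter_pairs(pairs, lambda p: -abs(p[0] - p[1]), keep=64)  # dummy score
print(len(kept))                        # 64
```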
Keywords/Search Tags: Scene Graph Generation, Context, Recurrent Neural Network, Feature Alignment, Feature Fusion, Object Pair Proposal