Research On Key Technologies Of Human-Object Interaction Detection

Posted on:2024-01-10

Degree:Master

Type:Thesis

Country:China

Candidate:W H Yang

Full Text:PDF

GTID:2568306914465554

Subject:Information and Communication Engineering

Abstract/Summary:

In recent years, Human-Object Interaction (HOI) detection has attracted rising attention. Given an image or a video, HOI detection aims to localize human-object pairs and recognize the interactions between them. The task therefore plays an important role in scene understanding and in real-world applications such as anomaly detection in surveillance videos. This thesis focuses on the key technologies of HOI detection, and the research covers the following three aspects:

1. For interaction-related feature extraction, an interaction-centric graph parsing network is proposed for HOI detection. Given an image, a multi-relation graph convolutional network models one human node as the central node and the other nodes as semantic nodes, which are generated by the proposed interaction-related feature construction module. Furthermore, a multi-IoU (Intersection over Union) random shift scheme is proposed to augment the training data and improve the generalization ability of the network.

2. For the design of scene features, a multi-modal feature enhancement network with Transformer is proposed. Specifically, a feature fusion module is constructed to generate different interaction features, and multi-modal scene descriptors are fused to strengthen the contextual expression of the interaction features.

3. To address the long-tailed distribution and noisy labels in HOI datasets, a model jointly supervised by cluster labels and real labels is proposed for long-tailed learning. In addition, to ensure that the model is not penalized too heavily for predicting labels that are correct but missing from the annotations, a loss function based on the uncertainty of model predictions is constructed.

Together, these contributions improve HOI detection performance by optimizing the feature representation, the model structure, and the quality of the datasets. Extensive experimental results on the HICO-DET and V-COCO datasets indicate the effectiveness and generalization ability of the proposed algorithms.
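
The abstract names a multi-IoU random shift scheme but does not spell out its details. As a rough illustration only, the sketch below shows one plausible reading: a ground-truth box is randomly jittered, and a jittered copy is kept for each of several IoU thresholds if its overlap with the original box meets that threshold. The function names (`iou`, `random_shift`, `multi_iou_augment`), the threshold values, and the shift range are all assumptions made for this example and are not taken from the thesis.

```python
import random


def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def random_shift(box, min_iou, rng, max_frac=0.3, max_tries=50):
    """Jitter each box edge independently (hypothetical scheme).

    Returns the first candidate whose IoU with the original box is at
    least min_iou, or the original box if no candidate qualifies.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    for _ in range(max_tries):
        cand = (
            x1 + rng.uniform(-max_frac, max_frac) * w,
            y1 + rng.uniform(-max_frac, max_frac) * h,
            x2 + rng.uniform(-max_frac, max_frac) * w,
            y2 + rng.uniform(-max_frac, max_frac) * h,
        )
        if iou(box, cand) >= min_iou:
            return cand
    return box


def multi_iou_augment(box, thresholds=(0.5, 0.7, 0.9), seed=0):
    """Produce one shifted box per IoU threshold (thresholds are assumed)."""
    rng = random.Random(seed)
    return [random_shift(box, t, rng) for t in thresholds]
```

Calling `multi_iou_augment((10.0, 20.0, 110.0, 220.0))` yields three boxes whose overlap with the original is at least 0.5, 0.7, and 0.9 respectively, which gives the training set box variants at several degrees of localization noise.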

Keywords/Search Tags:

Human-Object Interaction Detection, Graph Neural Network, Scene Understanding, Long-Tailed Learning, Noisy Label

Related items

1	Research On Scene Understanding Algorithms Based On Graph Neural Networks
2	Research On Key Techniques Of Visual Scene Understanding And Interaction
3	Researches Of The Activity Understanding Based On Dynamic Representation Learning
4	Research On Feature Extraction Of Multi-label Text Classification
5	Human-object Interaction And Video Understanding Under Complex Scenarios
6	Indoor Scene Understanding Based On Convolutional Neural Network And 3D Geometric Context Information
7	Deep Learning 3D Object Detection
8	The Research On Noisy Label Problems Based On Label Distribution
9	The Research Of Scene Understanding Neural Network Model
10	Research On Partial Multi-label Learning Algorithm With Application To Image Semantic Understanding