
Video Classification Based On Graph Convolutional Neural Network

Posted on: 2022-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: S S Du
GTID: 2568306488980119
Subject: Information and Communication Engineering
Abstract/Summary:
Video classification, especially group activity video classification, has attracted a great deal of attention from researchers because of its wide range of practical applications, including security surveillance, intelligent video understanding, and sports video analysis. Group activity videos involve the behaviors of multiple individuals; the goal is to infer the category of the group activity by capturing the spatio-temporal evolution of, and the interactions among, the individuals in the video. This thesis focuses on two aspects: constructing the unbalanced interactions in group activities and learning a feature representation of group activities. The diverse spatio-temporal interactions of individuals are first constructed and inferred, and the multi-level interactions of group activities are then represented and learned. The main work of this thesis includes:

(1) A Spatio-Temporal Interactive Graph (STIG) model based on graph convolutional networks is proposed, which adaptively explores the various interaction relationships among individuals. First, multi-view relational graphs are constructed from the semantic features and spatial location information of individuals. Then, the Relationship-Fusion Block (RFB) aggregates the location interactive graph and the semantic interactive graph, and extends the fused graph to the time domain to construct spatio-temporal interactive graphs of group activities. Finally, the diverse interactions of group activities are inferred by the graph convolution layer, and both individual behaviors and group activities are classified. Quantitative and qualitative experimental results on two challenging public datasets indicate that the proposed method obtains higher classification accuracy.

(2) A Hierarchical Interactive Graph (HIG) model based on graph convolutional neural networks and graph pooling is proposed, which divides group activity into multiple levels, adaptively constructs multi-grained relational interactive graphs, and learns hierarchical features of group activity in an end-to-end manner with fewer parameters. First, the spatio-temporal interactive graph of individual relationships is established from the location relationships and individual attributes. Then, the attention scores of individuals are calculated by a graph convolutional network that takes both the node features and the graph topology into account. Individuals with high attention scores are regarded as key individuals in the group, and a higher-level interactive graph is established by retaining these key nodes. Finally, the outputs of the readout layers are linearly fused to obtain a multi-level feature representation of the group activity video. Experimental results on two challenging group activity datasets demonstrate the effectiveness of this method.
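The two pipelines summarized above can be illustrated with a small, single-frame sketch. The abstract does not give the thesis's actual formulas, so everything specific here is an assumption made for illustration: the dot-product semantic similarity, the distance-based location weights, the fixed `alpha` used for fusion, the tanh-gated top-k pooling (similar in spirit to self-attention graph pooling), the mean/max readout, and the plain sum used as the "linear fusion". All function names are hypothetical, and the temporal extension of the graph is omitted.

```python
import math

def softmax_rows(m):
    # Row-wise softmax so every node's outgoing weights sum to 1.
    out = []
    for row in m:
        mx = max(row)
        e = [math.exp(v - mx) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def semantic_graph(feats):
    # Assumed: edge weight from dot-product similarity of individual features.
    sim = [[sum(x * y for x, y in zip(fi, fj)) for fj in feats] for fi in feats]
    return softmax_rows(sim)

def location_graph(centers):
    # Assumed: closer individuals interact more (softmax of negative distance).
    neg = [[-math.dist(ci, cj) for cj in centers] for ci in centers]
    return softmax_rows(neg)

def fuse(a, b, alpha=0.5):
    # Stand-in for the Relationship-Fusion Block: a fixed convex combination.
    return [[alpha * x + (1 - alpha) * y for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]

def graph_conv(adj, feats, weight):
    # One graph convolution layer: ReLU(A X W).
    h = matmul(matmul(adj, feats), weight)
    return [[max(0.0, v) for v in row] for row in h]

def topk_pool(adj, feats, score_w, k):
    # Attention score per node from a graph convolution, so it depends on
    # both node features and graph topology; keep the k best-scoring nodes.
    scores = [math.tanh(r[0]) for r in matmul(matmul(adj, feats), score_w)]
    keep = sorted(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])
    pooled_feats = [[scores[i] * v for v in feats[i]] for i in keep]
    pooled_adj = [[adj[i][j] for j in keep] for i in keep]
    return pooled_adj, pooled_feats, keep

def readout(feats):
    # Graph-level summary: concatenation of column-wise mean and max.
    cols = list(zip(*feats))
    return [sum(c) / len(c) for c in cols] + [max(c) for c in cols]

# Toy frame with four individuals forming two spatial/semantic clusters.
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centers = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]

adj = fuse(semantic_graph(feats), location_graph(centers))
h = graph_conv(adj, feats, [[1.0, 0.0], [0.0, 1.0]])          # level-1 features
adj2, h2, keep = topk_pool(adj, h, [[1.0], [-1.0]], k=2)      # key individuals
video_repr = [a + b for a, b in zip(readout(h), readout(h2))]  # fused readouts
print(keep, video_repr)
```

In a trained model the weight matrices would be learned end to end rather than fixed by hand, and the same graph construction would be repeated across frames to obtain the spatio-temporal graph described in the abstract.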
Keywords/Search Tags:Group activity video classification, Graph convolutional neural network, Interactive inference, Graph pooling