With the rapid growth of video content and the success of deep learning, understanding human actions in video has become a highly active and challenging area in computer vision, especially for action recognition and temporal action detection. Due to the complexity of human actions, these two tasks are generally regarded as high-level problems in video understanding. Most existing high-level action recognition methods fail to exploit detailed, fine-grained, mid-level semantic information, and the uncertainty of boundary proposals makes accurate localization difficult for existing temporal action detection methods. Video action recognition and temporal action detection therefore require more fine-grained and accurately annotated datasets. Existing video datasets ignore the mid-level understanding of body parts, and their coarse instances with uncertain boundaries interfere with proposal generation and action prediction. This paper therefore constructs two datasets that are more fine-grained and accurate in the spatial and temporal dimensions.

To further deepen the understanding of actions, this paper investigates interpretable action recognition in video by explicitly disentangling human actions into the spatio-temporal composition of body parts and interacting objects. Specifically, a large-scale ExplainAction benchmark is built for this study, providing 9.5 million frame-level annotations of 10 body parts, 8.7 million gestures, and 230 interacting objects; it offers new opportunities to understand human actions by learning the body-part components of videos. With ExplainAction, a compositional and interpretable approach can be further exploited to improve action recognition performance.

On the other hand, this paper develops RefineAction, a new large-scale refined video dataset collected from existing video datasets and web videos. Specifically, RefineAction contains 139K refined action instances, densely annotated in nearly 17,000 untrimmed videos across 106 action categories. Compared with existing action localization datasets, RefineAction has finer action category definitions and high-quality annotations that reduce boundary uncertainty. Experimental results show that the overlapping instances and diverse durations of RefineAction pose new challenges for temporal action detection.