Research On Action Segmentation Based On Human Skeleton

Posted on:2024-05-04

Degree:Master

Type:Thesis

Country:China

Candidate:Q H Tan

Full Text:PDF

GTID:2568307151460114

Subject:Electronic Science and Technology

Abstract/Summary:

Action segmentation is a challenging task in computer vision. It aims to divide the continuous actions in a long, untrimmed video into sub-action segments, each with its start and end time. Current mainstream methods fall into two types according to the input data format: feature-based action segmentation, which uses I3D two-stream features, and skeleton-based action segmentation, which extracts action features directly from skeleton keypoint information. Because of limitations of I3D feature extraction observed in experiments, this thesis focuses on the skeleton-based action segmentation task and studies three methods designed specifically for skeleton data.

The first method is based on spatio-temporal graph convolution and dynamic time warping and is designed mainly for Tai Chi actions. It consists of two steps. In the first step, a spatio-temporal graph convolutional network is trained on a self-collected Tai Chi skeleton dataset and then applied to sliding-window inputs for action classification, which yields an initial segmentation. In the second step, hand-crafted features based on the positions of the skeleton keypoints are designed to represent Tai Chi actions, and a dynamic time warping algorithm compares the reference action feature curve with the sample action curve, starting from the initial segmentation. The action boundaries are then redefined according to the comparison to produce the final segmentation. Experiments and analysis on the self-collected Tai Chi dataset demonstrate the feasibility of this method for Tai Chi action segmentation.

The second method is a frame-level action segmentation method based on spatial graph convolution and cascaded networks. It combines spatial graph convolution with multi-stage cascaded temporal convolution, which enables the model to capture both the spatial motion information and the long-term temporal dependencies of continuous actions in skeleton data. The model is evaluated on the PKU-MMDv2 and LARa datasets, and the experimental results validate the effectiveness of the method.

The third method is a frame-level action segmentation method based on dual dilated temporal convolution. It improves the temporal convolution structure of the second method: dual dilated residual layers are introduced in the initial stage of the cascaded network, which enables the model to achieve higher frame accuracy. The method is tested and analyzed on PKU-MMDv2, LARa, and the self-collected Tai Chi dataset and compared with advanced methods. The experimental results show that the proposed method achieves outstanding performance.
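The boundary-refinement step of the first method can be illustrated with a minimal sketch. It assumes each action is summarized by a one-dimensional feature curve (the abstract does not specify the hand-crafted features) and uses plain dynamic time warping with an absolute-difference cost. The names `dtw_path` and `transfer_boundaries`, and the toy curves, are hypothetical and not taken from the thesis.

```python
from math import inf


def dtw_path(ref, sample):
    """Align two 1-D feature curves; return (total_cost, warping_path)."""
    n, m = len(ref), len(sample)
    # cost[i][j] = minimal accumulated cost of aligning ref[:i] with sample[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - sample[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # ref advances
                                 cost[i][j - 1],      # sample advances
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


def transfer_boundaries(ref_boundaries, path):
    """Map boundary frames of the reference onto the sample via the path."""
    first_match = {}
    for ref_idx, sample_idx in path:
        # Keep the earliest sample frame aligned to each reference frame.
        first_match.setdefault(ref_idx, sample_idx)
    return [first_match[b] for b in ref_boundaries]


# Toy curves (assumed): the sample performs the same motion more slowly.
reference = [0, 0, 1, 1, 0, 0]
sample = [0, 0, 0, 1, 1, 1, 0, 0]
total_cost, path = dtw_path(reference, sample)
refined = transfer_boundaries([2, 4], path)  # reference boundaries at frames 2 and 4
print(total_cost, refined)
```

In this sketch the reference boundaries are carried over to the sample through the warping path. In the thesis the comparison starts from the initial sliding-window segmentation, so a practical variant would presumably restrict the alignment to a window around each initial boundary.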

Keywords/Search Tags:

action segmentation, skeletal data, spatial graph convolution, temporal convolution, cascaded network

Related items

1	Action Recognition Based On Human Skeleton Graph Convolution And Image Convolution Fusion
2	Research On Action Recognition Methods Based On Lightweight Motion Aggregation And Dynamic Spatial Temporal Graph Convolution
3	Research On Human Action Recognition Based On Spatial-Temporal Graph Convolution
4	Research On Human Action Recognition Algorithm Based On Spatio-temporal Graph Convolutional Network
5	Human Action Recognition Based On Spatio-temporal Graph Convolution Network
6	Research On Human Action Recognition Method Based On Adaptive Spatial-temporal Fusion Graph Convolutional Network
7	Research For Action Recognition Based On Spatial-Temporal Stream Convolution Neural Networks
8	Research On Human Skeleton Action Recognition Based On Graph Convolutional Network
9	Research And Implementation Of Action Representation Learning Based On Human Skeletal Data
10	Human Skeleton Action Recognition Based On Spatiotemporal Graph Attention Convolution Network