Study On Few-shot Action Recognition Method Based On Metric Learning

Posted on: 2022-08-26 | Degree: Master | Type: Thesis
Country: China | Candidate: H Yin | Full Text: PDF
GTID: 2568306323972099 | Subject: Computer technology
Abstract/Summary:
Action recognition is an important research direction in computer vision, with significant research value and broad application prospects in automatic driving, human-computer interaction, virtual reality, and other scenarios. At present, mainstream action recognition algorithms all need large-scale datasets to train their models, and such models can only identify the action categories that already exist in the training set. When new actions need to be identified, it is usually necessary to retrain the model with annotated samples of the new category. In real-world settings, however, acquiring labeled samples of a new category and retraining the model consume a great deal of time and cost. To solve this problem, drawing on the idea of few-shot learning, this thesis proposes an action recognition method that acquires the ability to recognize new categories through fast learning, given only a very small number of labeled samples and without retraining the model. The main work is as follows:

(1) This thesis presents an action recognition model based on metric learning. By learning a similarity measure between the features of unlabeled samples and the features of the different classes, the model can judge which category a sample belongs to. The feature representation of every sample is obtained through a feature embedding network, while the feature of a class is represented by integrating the features of all labeled samples of that class; that is, the input of the network includes both labeled and unlabeled samples. This method transforms the classification problem in the action recognition task into a regression problem, and during training the model learns how to measure the differences among features. In addition, a data augmentation strategy based on time slicing and multi-space center sampling is used to obtain a more complete feature representation of video samples, and a channel attention mechanism is used to optimize the similarity learning process, so as to improve the model's action recognition accuracy. Experiments show that the proposed method has certain advantages over other methods in various task scenarios. On HMDB51, the data augmentation strategy and the channel attention mechanism improve accuracy by more than 1% over the baseline method of this thesis. Similarly, on UCF101 they improve accuracy by 0.4% to 1.2%.

(2) This thesis presents a multi-branch few-shot action recognition model based on multi-scale features. Previous methods failed to make effective use of the potential connection between spatial and temporal features during similarity learning. To address this, and drawing on the idea of the two-stream convolutional network, this thesis uses a two-branch structure to learn spatial similarity and temporal similarity separately, and obtains a better representation of the result through mutually supervised training. At the same time, because middle- and shallow-layer features can help the model learn more accurate similarity measures, this thesis draws on the idea of the feature pyramid and uses a multi-branch structure over multi-scale features to further improve the model's performance on few-shot action recognition. Experimental results on HMDB51 show that the proposed method improves accuracy by more than 1.1% over the baseline method in task scenarios such as 5-way 1-shot, 5-way 5-shot, and 10-way 5-shot. Similarly, accuracy on UCF101 improves by 0.4%.
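The metric-learning idea in item (1) can be illustrated with a minimal sketch. It is not the thesis's model: the abstract describes a learned similarity measure and a feature embedding network, whereas this sketch assumes the embeddings are already computed, integrates each class's labeled samples by a plain mean, and swaps in fixed cosine similarity as a stand-in for the learned measure. The function names (`mean_vector`, `cosine_similarity`, `classify`) and the toy feature values are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors (assumed class representation)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors; a fixed stand-in for a learned measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(query, support):
    """Score one unlabeled (query) embedding against each class.

    support maps a class label to the list of embeddings of its labeled samples.
    Returns the best-scoring label and the per-class similarity scores.
    """
    prototypes = {label: mean_vector(feats) for label, feats in support.items()}
    scores = {label: cosine_similarity(query, proto)
              for label, proto in prototypes.items()}
    return max(scores, key=scores.get), scores


# Toy 2-way 2-shot episode with made-up 3-dimensional embeddings.
support = {
    "run": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "jump": [[0.0, 1.0, 0.2], [0.1, 0.8, 0.3]],
}
query = [0.8, 0.3, 0.0]

predicted, scores = classify(query, support)
print(predicted, scores)
```

Because the class representation is built from whatever labeled samples are supplied at inference time, a new category can be added by adding an entry to `support`, with no retraining step in this sketch.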
Keywords/Search Tags: Action Recognition, Few-shot Learning, Metric Learning, Attention Mechanism, Multi-scale Feature