Research On Infrared Action Detection Methods Based On Deep Learning

Posted on: 2021-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: K Hu
Full Text: PDF
GTID: 2428330614958166
Subject: Information and Communication Engineering
Abstract/Summary:
Temporal action localization aims to automatically locate actions of interest in untrimmed video and to classify them. It is an important but difficult research task in computer vision, and its results can be widely applied in intelligent surveillance systems, large-scale content retrieval, robot vision, illegal video retrieval, and other fields. This thesis addresses two problems in existing deep learning-based temporal action localization methods: the insufficient expressive capability of feature extraction networks and the insufficient modeling of action components. The specific research work is as follows.

Firstly, this thesis proposes an infrared temporal action localization dataset, since research on infrared temporal action localization has been held back by the lack of such datasets. The proposed dataset covers multiple scenes with different viewing angles and illumination conditions, and can simulate real environments well. In addition, two deep learning temporal action localization frameworks commonly used for visible-light video data are introduced as the basis for the research in this thesis.

Secondly, action recognition is the core module of temporal action localization. However, in existing action recognition methods the feature extraction network generally has insufficient learning ability. Therefore, this thesis proposes an infrared action recognition method based on a multi-level balanced feature pyramid and applies it to the subsequent temporal action localization method. Traditional convolutional neural networks tend to deepen the network to obtain better expressive ability, but neglect shallow features. In this method, parallel convolution blocks are used to construct a feature pyramid that maintains features at different resolutions. At the same time, the feature pyramid is generated from semantic features of the same depth, so as to balance it. A non-local attention mechanism is then used to enhance the features at different resolutions. Experimental results show that, compared with traditional deep learning network frameworks, this method can use features of different resolutions simultaneously for classification and effectively improves action recognition performance.

Finally, this thesis proposes an infrared temporal action localization method based on a temporal proposal generation strategy using Gaussian kernel functions. The method is studied on the infrared temporal action localization dataset constructed in this thesis. It builds on the advantages of existing actionness score grouping algorithms and makes full use of the temporal modeling capability of Gaussian kernel functions. Specifically, a Gaussian kernel function is used to learn the representation of the temporal proposal of each unit on the 1D feature map. A Gaussian kernel grouping algorithm then combines related Gaussian kernels to express new temporal proposals. Experimental results show that this temporal proposal generation method locates action time boundaries better than other methods and improves the performance of the temporal action localization algorithm.
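To make the Gaussian kernel proposal idea more concrete, the following Python sketch shows one way a per-unit Gaussian kernel could be turned into a temporal interval and how overlapping kernels could be grouped into a longer proposal. The abstract does not give the thesis's formulas, so everything here is an assumption made for illustration: the `GaussianKernel` class, the "center plus or minus one sigma" interval rule, the temporal IoU merging criterion, and the `iou_threshold` value are hypothetical and are not taken from the thesis.

```python
import math
from dataclasses import dataclass


@dataclass
class GaussianKernel:
    """Hypothetical per-unit kernel on a 1D temporal feature map."""
    center: float  # temporal position, in feature-map units
    sigma: float   # standard deviation; a larger value means a longer proposal


def kernel_weight(kernel: GaussianKernel, t: float) -> float:
    """Unnormalized Gaussian weight of position t under the kernel."""
    return math.exp(-((t - kernel.center) ** 2) / (2.0 * kernel.sigma ** 2))


def to_interval(kernel: GaussianKernel, length: float, width: float = 1.0):
    """Assumed rule: the proposal spans center +/- width * sigma, clipped to the map."""
    start = max(0.0, kernel.center - width * kernel.sigma)
    end = min(float(length), kernel.center + width * kernel.sigma)
    return (start, end)


def temporal_iou(a, b) -> float:
    """Intersection over union of two (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def group_kernels(kernels, length: float, iou_threshold: float = 0.5):
    """Assumed grouping: merge time-sorted intervals whose tIoU passes the threshold."""
    intervals = sorted(to_interval(k, length) for k in kernels)
    merged = []
    for interval in intervals:
        if merged and temporal_iou(merged[-1], interval) >= iou_threshold:
            # Extend the previous proposal so that it covers both intervals.
            previous = merged[-1]
            merged[-1] = (min(previous[0], interval[0]), max(previous[1], interval[1]))
        else:
            merged.append(interval)
    return merged


kernels = [GaussianKernel(10.0, 2.0), GaussianKernel(11.0, 2.0), GaussianKernel(30.0, 3.0)]
proposals = group_kernels(kernels, length=40.0)
print(proposals)
```

In this toy input the first two kernels overlap strongly (temporal IoU of 0.6), so they are merged into a single proposal from 8 to 13, while the third kernel stays a separate proposal from 27 to 33. In a learned model the centers and sigmas would be predicted by the network rather than fixed by hand.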
Keywords/Search Tags: infrared action detection, infrared action recognition, feature pyramid, Gaussian kernel function