
Research On Action Detection Method Based On Untrimmed Video

Posted on: 2020-04-09    Degree: Master    Type: Thesis
Country: China    Candidate: R Yin    Full Text: PDF
GTID: 2558307109973219    Subject: Signal and Information Processing
Abstract/Summary:
Action recognition determines the category of a single action in a pre-segmented video sequence, and current video action recognition technology has reached relatively high recognition rates. However, on the one hand, action recognition methods rely heavily on pre-trimmed video, which is time-consuming and labor-intensive to prepare; on the other hand, real-world video is generally untrimmed, varies widely in duration, and usually contains a large amount of irrelevant content. These factors greatly limit action recognition in practical applications. Solving them requires performing action detection directly on untrimmed video, so research on action detection methods for untrimmed video has great practical significance and application value. Action detection on untrimmed video refers to localizing specific action segments within a complete, long video that contains multiple action segments and background, and identifying the category of each localized segment. This thesis studies action detection methods for untrimmed video. The main work is as follows:

1) Decompose the untrimmed video into fixed-length video units.

2) For each video unit, extract a forward and a backward motion history image, a dense optical flow image, and the RGB image of the center frame; apply rainbow-coding pseudo-color processing to the forward and backward motion history images; then extract CNN features from each of the three image types with a CNN network (see the first sketch below).

3) Construct multi-scale, context-aware CNN features: taking each video unit as an anchor unit, build feature pyramids over the three kinds of CNN features at scales of {1, 2, 4, 8, 16, 32} video units, then pool several unit features before and after each scale and concatenate them to obtain left-center-right segment-level features that carry context information (see the second sketch below).

4) Temporal action proposal: feed the fusion result of two of the multi-scale CNN features into the TURN unit regression network for temporal action detection, producing a set of temporal action proposals that carry temporal boundary information but no category labels.

5) Action recognition: feed the temporal action proposals produced by TURN into the CBR network, which classifies each proposal and refines its temporal boundaries, yielding accurate temporal boundaries and a specific category label for each action segment (see the third sketch below).

The proposed method was evaluated on the THUMOS 2014 dataset: the average recall (AR@200) of the TURN network proposed by Jiyang Gao in 2017 is 42.72%, while the AR@200 of this thesis is 43.92%; the mean average precision at an IoU threshold of 0.5 (mAP@0.5) of the CBR network proposed by Jiyang Gao in 2017 is 21.9%, while the mAP@0.5 of this thesis is 23.2%. The experimental results show that both the average recall and the mean average precision of the proposed method on THUMOS 2014 are higher than those of the two baseline networks.
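First sketch: a minimal Python illustration of how a forward motion history image could be accumulated for one video unit and rainbow-coded into a pseudo-color image (step 2). This is not the thesis implementation; the frame-difference threshold and the helper names (motion_history_image, rainbow_encode) are illustrative assumptions.

```python
# Minimal sketch (not the thesis code) of forward MHI computation and rainbow
# pseudo-coloring for one video unit, assuming a list of grayscale uint8 frames.
import cv2
import numpy as np

def motion_history_image(frames, diff_thresh=32):
    """Accumulate a forward MHI over a unit's frames: newer motion is brighter."""
    h, w = frames[0].shape
    mhi = np.zeros((h, w), dtype=np.float32)
    n = len(frames) - 1
    for t in range(1, len(frames)):
        motion = cv2.absdiff(frames[t], frames[t - 1]) > diff_thresh
        mhi[motion] = t / n          # timestamp of most recent motion, in [0, 1]
    return mhi

def rainbow_encode(mhi):
    """Map the single-channel MHI to a 3-channel pseudo-color image."""
    mhi_u8 = np.uint8(255 * mhi)
    return cv2.applyColorMap(mhi_u8, cv2.COLORMAP_RAINBOW)

# Backward MHI: run the same routine on the reversed frame list.
# frames = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in unit_frames]
# mhi_rgb = rainbow_encode(motion_history_image(frames))
```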
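Second sketch: the left-center-right context construction of step 3, under the assumption that per-unit CNN features are already available as a (num_units, D) array and are mean-pooled. The pooling choice, the number of context units (CTX), and the function names are assumptions for illustration, not necessarily the thesis's exact settings.

```python
# Minimal sketch of multi-scale left-center-right context features around an anchor unit.
import numpy as np

SCALES = (1, 2, 4, 8, 16, 32)   # pyramid scales, in video units
CTX = 4                          # context units pooled on each side (assumed value)

def pool(unit_feats, start, end):
    """Mean-pool unit features over [start, end), clipped to the video extent."""
    start, end = max(start, 0), min(end, len(unit_feats))
    if start >= end:
        return np.zeros(unit_feats.shape[1], dtype=unit_feats.dtype)
    return unit_feats[start:end].mean(axis=0)

def context_features(unit_feats, anchor):
    """Left-center-right features for one anchor unit at every pyramid scale."""
    feats = []
    for s in SCALES:
        start, end = anchor - s // 2, anchor - s // 2 + s   # clip of s units around the anchor
        center = pool(unit_feats, start, end)
        left   = pool(unit_feats, start - CTX, start)        # context before the clip
        right  = pool(unit_feats, end, end + CTX)            # context after the clip
        feats.append(np.concatenate([left, center, right]))
    return np.concatenate(feats)   # length = len(SCALES) * 3 * D
```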
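Third sketch: a rough illustration of the boundary-refinement idea behind steps 4 and 5, in the spirit of unit-level boundary regression (TURN/CBR). Here predict_offsets is a hypothetical stand-in for a trained regression network, and the cascade depth and unit duration are placeholders, not values from the thesis.

```python
# Minimal sketch of refining a proposal's boundaries with regressed unit offsets.
def refine_proposal(start_unit, end_unit, predict_offsets, rounds=2, unit_sec=1.0):
    """Iteratively shift the start/end by predicted offsets, then convert to seconds."""
    for _ in range(rounds):                      # CBR-style cascade: re-apply the regressor
        d_start, d_end = predict_offsets(start_unit, end_unit)
        start_unit += d_start                    # offsets are expressed in video units
        end_unit += d_end
    return start_unit * unit_sec, end_unit * unit_sec
```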
Keywords/Search Tags: Action detection, Motion History Image, Rainbow Coding, CNN, Temporal action detection, Action classification