Research On Video Object Tracking Algorithm Based On MDNet

Posted on: 2020-10-27
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Wang
GTID: 2428330602450662
Subject: Detection Technology and Automation
Abstract/Summary:
As one of the research topics in computer vision, object tracking has been widely used in many fields of modern society. Many trackers based on Convolutional Neural Networks (CNNs) have been proposed, and Recurrent Neural Networks (RNNs), which can capture long-term dependencies in sequential data, have also been introduced into computer vision. Multi-Domain Convolutional Neural Networks (MDNet) is an online tracking method based on a multi-domain CNN architecture. It pre-trains a CNN on large-scale datasets, fine-tunes the network on the first frame of the video, and samples candidate regions during tracking. However, because each candidate in MDNet is processed independently, the method has high time and space complexity, which makes tracking slow. In addition, MDNet is based on a CNN and treats tracking as a classification problem, with the focus mainly on inter-class classification. In the presence of interference, MDNet is likely to confuse the object with the background. This thesis addresses these two weaknesses of MDNet and proposes a new MDNet-based tracking algorithm. The main work is as follows:

(1) Improvement of the network structure based on RoI Align: When extracting the features of the tracked object, the original MDNet first generates candidate regions and then uses each candidate region to extract features from the original image, which has high computational complexity. To address this problem, this thesis proposes a new algorithm, MD-RA, which adopts the RoI Align feature extraction method. Because RoI Align extracts features coarsely, some useful information may be lost. MD-RA therefore re-adjusts the unit size used in the RoI Align computation according to the RoI width before and after RoI Align. In addition, the max pooling layer is removed and dilated convolution is used instead to enlarge the receptive field of each point on the feature map, which strengthens the representational power of the feature map. Compared with MDNet, this improvement reduces the OPE (one-pass evaluation) accuracy and success rate of MD-RA by 3.3% and 1.6%, respectively, but yields an increase of approximately 9.2 times in tracking speed.

(2) Integration of RNN features on the basis of MDNet: MDNet is based on a CNN, so it is sensitive to similar distractors, whereas an RNN can capture long-term dependencies between earlier and later frames of the object in sequential data. In this thesis, an RNN is used to model the structural information of the object, and the RNN feature and the CNN feature of the tracked object are then merged to strengthen the tracking network's ability to discriminate between the tracked object and similar distractors.

(3) Improvement of the loss function: The original MDNet has only one loss term, and its resistance to interference is weak. In response, this thesis introduces a new loss term that pushes objects from different domains far apart in the shared feature space and learns discriminative representations for objects in new test sequences that are unseen in the current domain, improving MDNet's ability to identify similar distractors.

Based on the above improvements, a new algorithm, IMP-MD, is proposed. Experiments show that IMP-MD improves the OPE accuracy and success rate over MD-RA by 3.7% and 2.0%, respectively, and that its speed is 7.8 times higher than that of MDNet, giving it high application value.
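The RoI Align idea described in point (1) can be illustrated with a minimal pure-Python sketch: the feature map is computed once, and each candidate region is pooled from it by bilinear sampling instead of being passed through the network separately. This is not the thesis's MD-RA implementation. The function names `bilinear` and `roi_align`, the 3x3 output size, the 2x2 samples per bin, and the coordinate convention (RoIs given directly in feature-map coordinates) are all assumptions made for illustration, and the adaptive unit-size adjustment of MD-RA is not modelled.

```python
import math


def bilinear(fmap, y, x):
    """Bilinearly interpolate a 2-D feature map (list of rows) at a real-valued (y, x)."""
    h, w = len(fmap), len(fmap[0])
    # Clamp the sample point so that samples on the border stay defined.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(fmap, roi, out_size=3, samples=2):
    """Pool one RoI (x1, y1, x2, y2), in feature-map coordinates (assumed convention),
    to an out_size x out_size grid. Each output bin is the average of
    samples x samples bilinear samples; no coordinate is rounded."""
    x1, y1, x2, y2 = roi
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Sample points are spread evenly inside the bin.
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear(fmap, y, x)
            row.append(total / (samples * samples))
        out.append(row)
    return out


# Hypothetical usage: one shared feature map, several candidate regions.
shared_fmap = [[float(x) for x in range(6)] for _ in range(6)]
candidates = [(1.0, 1.0, 4.0, 4.0), (0.5, 0.5, 3.5, 3.5)]
pooled = [roi_align(shared_fmap, roi) for roi in candidates]
```

In this sketch the cost of the convolutional layers is paid once for `shared_fmap`, and each additional candidate only adds the cost of the sampling loop, which is the source of the speed-up that the abstract attributes to RoI Align-based feature extraction.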
Keywords: Object Tracking, MDNet, CNN, RNN, RoI Align