Research on Multi-modal Data Based Object Tracking

Posted on: 2018-03-23  Degree: Master  Type: Thesis
Country: China  Candidate: X Z Zhang  Full Text: PDF
GTID: 2428330566452228  Subject: Electronic and communication engineering
Abstract/Summary:
With the widespread use of mobile phones, surveillance cameras, and other video capture devices, a large amount of video data is being produced. Intelligent analysis and mining of video data is a fundamental way to cope with analyzing video at this scale. As an important research direction in computer vision, visual object tracking plays a key role in intelligent video analysis and mining, and it has a wide range of real-world applications, such as security and driverless cars, so it has great application value. Visual object tracking is also susceptible to interference such as brightness change, deformation, motion blur, and size change, and many problems remain unsolved, so it has important research value as well.

Two object tracking algorithms are proposed in this thesis. Both make full use of the powerful feature representation of deep neural networks and exploit the multi-source, complementary nature of multi-modal data. The first is object tracking based on multi-modal data and a convolutional neural network; the second is object tracking based on multi-modal data and a fully-convolutional Siamese network. To verify the effectiveness of these algorithms, we collected a multi-modal database, OptTrack, which contains both visible and infrared images. There are six video sequences in OptTrack.

The object tracking algorithm based on multi-modal data and a convolutional neural network adopts a dual fusion strategy: it combines the spatial information of shallow feature maps with the semantic information of deep feature maps, and it fuses the visible image and the infrared image at the algorithm level. The algorithm has two steps. First, the target position is predicted by applying a translation correlation filter to the multi-layer convolutional feature maps of the visible image. Second, the target size is estimated on a scale pyramid of the infrared image. We compare it with 10 object tracking algorithms on the six video sequences in OptTrack. The experimental results show that the algorithm is robust and outperforms all the other algorithms.

To address the slow speed of traditional tracking algorithms based on deep neural networks, this thesis also proposes an object tracking algorithm based on multi-modal data and a fully-convolutional Siamese network. First, the fully-convolutional Siamese network is applied to the visible image, so that the target position can be predicted with a single forward propagation. Second, the target size is estimated on a scale pyramid of the infrared image. Comparison with 10 object tracking algorithms shows that the algorithm performs well. In addition, it is fast: the average tracking speed is about 19 fps.
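
For illustration only, the position step of a Siamese-style tracker can be pictured as sliding a template feature map over a larger search feature map and taking the peak of the resulting response map, with a small set of scale factors forming the pyramid used for size estimation. The sketch below is not the thesis's implementation: the function names, the plain NumPy cross-correlation in place of a trained network, and the pyramid parameters `num_scales` and `step` are all assumptions chosen for the example.

```python
import numpy as np


def cross_correlate(search, template):
    """Dense 'valid' cross-correlation of a template over a search map.

    Both inputs are (H, W, C) feature arrays. In a real tracker they would
    come from a shared (Siamese) network; here they are plain arrays.
    """
    sh, sw, _ = search.shape
    th, tw, _ = template.shape
    response = np.empty((sh - th + 1, sw - tw + 1))
    for y in range(response.shape[0]):
        for x in range(response.shape[1]):
            window = search[y:y + th, x:x + tw]
            response[y, x] = np.sum(window * template)
    return response


def locate(search, template):
    """Return the (row, col) of the response peak, i.e. the predicted offset."""
    response = cross_correlate(search, template)
    y, x = np.unravel_index(np.argmax(response), response.shape)
    return int(y), int(x)


def scale_factors(num_scales=3, step=1.05):
    """Hypothetical scale pyramid: geometric factors centred on 1.0."""
    exponents = np.arange(num_scales) - (num_scales - 1) / 2
    return step ** exponents


if __name__ == "__main__":
    # Toy data: an all-zero search map with the template pasted at (3, 4).
    template = np.arange(1, 19, dtype=float).reshape(3, 3, 2)
    search = np.zeros((8, 8, 2))
    search[3:6, 4:7] = template
    print("peak at:", locate(search, template))
    print("scales:", scale_factors())
```

In this toy setting the peak falls where the template was pasted. In the pipeline the abstract describes, the position would come from the visible image and each scale factor would be evaluated on the infrared image to choose the target size.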
Keywords/Search Tags: Computer vision, visual object tracking, multi-modal data, convolutional neural network, Siamese network