| Visual object tracking aims to compute the position of an object in each frame of a continuous or online video, and it is a fundamental and important research topic in computer vision. It has a wide range of applications in target guidance, autonomous driving, activity recognition, and other fields. Single-modal object tracking based on visible light, one of the most important topics in visual object tracking, has produced abundant research results in recent years. Researchers have proposed a variety of tracking algorithms based on different theoretical frameworks, and these algorithms have improved tracking performance in terms of both speed and accuracy. Moreover, a standard visible-light object tracking data set that covers a variety of complex conditions has been established to evaluate the performance of these algorithms. These works not only lay the theoretical foundation of single-modal visible-light object tracking but also have a wide range of applications in practical projects. Although single-modal visible-light tracking algorithms now perform very well in many complex tracking scenarios, they may fail under some extreme conditions, such as low or zero illumination. To solve this problem, researchers introduce thermal infrared or RGB-D image information to compensate for the shortcomings of single-modal visible-light video data. In recent years, multi-modal object tracking algorithms based on thermal infrared and visible-light videos have received much attention because of the favorable complementarity between the two modalities. In this thesis, we study multi-modal object tracking algorithms based on thermal infrared and visible-light videos. The main contributions are as follows: (1) We propose a multi-modal object tracking algorithm based on 
modality reliability correlation. Because thermal infrared and visible-light videos rely on different imaging mechanisms, the tracked object carries a different weight in each modality. Once the weights of the modalities are computed, a conventional single-modal visible-light tracking algorithm can track the object using the more reliable modality. We therefore present a real-time multi-modal object tracking algorithm based on a modality reliability criterion of our own design. This algorithm adaptively takes full advantage of the thermal infrared and visible-light video information to achieve robust tracking results. During tracking, a well-designed update scheme lets the tracking model adapt to changes in target appearance, which reduces the influence of noise. (2) We propose a multi-modal collaborative object tracking algorithm based on local and global information fusion. In multi-modal tracking, different modalities have different weights, and different local image patches of the tracking samples also contribute differently to the tracking results. Taking both the modality weights and the weights of the local patches into account, we propose a collaborative object tracking algorithm based on multi-modal data fusion. This model exploits the internal relationships between the object samples and their local image patches through joint sparse representation learning. In addition, the proposed model preserves the spatial layout of the local patches inside each target candidate. Moreover, the local patches of each target are weighted according to their contributions to tracking. Finally, the modality weights are learned jointly with the sparse representation appearance model of the whole object. (3) We establish a multi-modal object tracking data set that contains many complex tracking conditions. Current public multi-modal video data sets, such as OSU and ACI, make it hard to evaluate 
multi-modal object tracking algorithms because their scenes are simple and they contain few video sequences. In this thesis, we establish a multi-modal object tracking data set that contains many complex tracking conditions, such as low illumination and background clutter, to evaluate our multi-modal object tracking algorithms. The data set includes challenging scenarios such as a single person walking under low illumination, two people crossing and occluding each other, and a single moving bicycle. The scenes of the original multi-modal videos are aligned and the tracked objects are manually labeled, which together form a multi-modal object tracking evaluation data set. |
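
As an illustration of the idea behind contribution (1), the following minimal Python sketch shows one way a per-modality reliability weight could drive adaptive fusion of visible-light and thermal infrared tracking responses. It is not the criterion proposed in this thesis, which the abstract does not specify. The peak-to-mean reliability score, the response-map inputs, and all function names are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical input: a 2-D response map produced by a single-modal tracker,
# where larger values mean the target is more likely at that location.
Response = List[List[float]]


def reliability(response: Response) -> float:
    """Stand-in reliability score (an assumption): peak value over mean value.

    A sharp, isolated peak gives a high score; a flat map gives a score near 1.
    """
    values = [v for row in response for v in row]
    mean = sum(values) / len(values)
    return max(values) / mean if mean > 0 else 0.0


def modality_weights(visible: Response, thermal: Response) -> Tuple[float, float]:
    """Normalize the two reliability scores so that the weights sum to 1."""
    r_vis, r_tir = reliability(visible), reliability(thermal)
    total = r_vis + r_tir
    if total == 0:
        return 0.5, 0.5  # no evidence either way: weight both modalities equally
    return r_vis / total, r_tir / total


def fuse_and_locate(visible: Response, thermal: Response) -> Tuple[int, int]:
    """Fuse both maps with the modality weights and return the peak (row, col)."""
    w_vis, w_tir = modality_weights(visible, thermal)
    best_score, best_pos = float("-inf"), (0, 0)
    for i, (row_v, row_t) in enumerate(zip(visible, thermal)):
        for j, (v, t) in enumerate(zip(row_v, row_t)):
            score = w_vis * v + w_tir * t
            if score > best_score:
                best_score, best_pos = score, (i, j)
    return best_pos


# Toy low-illumination case: the visible map is nearly flat, the thermal map is sharp.
visible_map = [[0.2, 0.2, 0.2], [0.2, 0.25, 0.2], [0.2, 0.2, 0.2]]
thermal_map = [[0.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.9, 0.0]]
print(modality_weights(visible_map, thermal_map))
print(fuse_and_locate(visible_map, thermal_map))
```

In this toy case the thermal map receives the larger weight, so the fused estimate follows the thermal peak rather than the weak visible-light peak. A soft weighted sum is only one design option; selecting the single most reliable modality is another reading of the same idea.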