Research On Spatial-Temporal Information Mining Based Video Object Detection And Tracking

Posted on: 2022-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: L J Lin
Full Text: PDF
GTID: 2568306323477444
Subject: Computer technology
Abstract/Summary:
Video object detection and tracking are two fundamental problems in computer vision, with applications such as video surveillance, autonomous driving, and augmented reality. However, factors that arise in real scenes (such as occlusion, motion blur, and video defocus) still limit the performance of both tasks. Unlike still images, video sequences contain rich spatial-temporal information, and mining this information can effectively improve video object detection and tracking. This thesis therefore proposes a video object detection method and an object tracking method, both based on spatial-temporal information mining. The main contributions of this thesis are as follows:

(1) A novel dual semantic fusion network for video object detection is proposed. The method aims to improve video object detection performance by mining and fusing the rich spatial-temporal information in videos. Specifically, this thesis proposes a dual semantic fusion network (DSFNet), which performs multi-granularity semantic fusion at both the frame level and the instance level and then generates enhanced features for video object detection. Moreover, this thesis proposes a new geometric similarity measure to alleviate the information distortion caused by noise during the fusion process. Experimental results demonstrate that the proposed DSFNet achieves state-of-the-art video object detection performance.

(2) A robust tracking method via statistical positive sample generation and gradient aware learning is proposed. Because limited online training samples lead to poor generalization in tracking methods, this thesis proposes a positive sample generation algorithm. The algorithm mines the rich spatial-temporal information of the target in a video sequence by estimating the distribution of positive samples, and then generates diverse positive samples for online training based on that information. Moreover, this thesis proposes a gradient sensitive loss to alleviate the imbalance between easy and hard samples among the generated positive samples and the samples collected during online tracking. Based on the proposed positive sample generation algorithm and gradient sensitive loss, this thesis proposes a robust tracking method via statistical positive sample generation and gradient aware learning. The proposed method achieves promising performance on several benchmark datasets.
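
The abstract does not say how the distribution of positive samples is estimated, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes that positive samples are feature vectors and that a Gaussian with independent dimensions describes them adequately. The function names, the `observed` data, and the Gaussian model itself are all hypothetical choices made for illustration.

```python
import random
import statistics


def fit_diagonal_gaussian(samples):
    """Estimate the per-dimension mean and standard deviation of feature vectors.

    Assumption: dimensions are treated as independent (diagonal covariance).
    """
    dims = list(zip(*samples))  # transpose: one tuple per feature dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return means, stds


def generate_positive_samples(samples, count, seed=None):
    """Draw `count` synthetic feature vectors from the fitted Gaussian."""
    rng = random.Random(seed)
    means, stds = fit_diagonal_gaussian(samples)
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(count)]


# Hypothetical 2-D features of the target collected from earlier frames.
observed = [[1.0, 4.0], [1.2, 3.8], [0.8, 4.2], [1.0, 4.0]]
generated = generate_positive_samples(observed, count=5, seed=0)
```

Under these assumptions, the generated vectors scatter around the observed mean with the observed spread, which is one simple way to obtain more varied positive samples than the few collected online. A real tracker would more likely work in a learned feature space and might need a richer distribution model.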
Keywords/Search Tags: Spatial-Temporal Information, Object Detection, Object Tracking, Semantic Fusion, Sample Imbalance