Research On Object Tracking Methods In Complex Sports Scene

Posted on: 2021-02-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W N Wu
Full Text: PDF
GTID: 1487306458477384
Subject: Control Science and Engineering
Abstract/Summary:
Object tracking is a popular topic in computer vision research with a wide range of practical applications, such as intelligent surveillance, autonomous driving, and robot visual perception. Recently, with the rapid development of the sports industry, visual tracking of targets (the ball and the players) in complex sports scenes such as basketball and soccer has gradually attracted attention. However, compared with the pedestrian surveillance scene, which has been the most widely studied, object tracking in sports scenes is more challenging: occlusion is more severe, appearances are more similar, poses vary more, and movements are more complex. To address these problems, this dissertation studies ball (single-object) tracking and player (multi-object) tracking in complex sports scenes such as basketball and soccer. Based on an analysis of why common tracking methods are not suitable for sports scenes, and drawing on the latest visual detection and tracking techniques, this dissertation proposes several ball and player tracking methods for complex sports scenes. This research is expected to lay the foundation for higher-level semantic tasks in sports video analysis, including action recognition, event detection, and content understanding. The main work and key contributions of this dissertation are summarized as follows:

(1) To handle missed or false ball detections and tracking drift caused by the small size of the ball, severe occlusion, background interference, and illumination variation in complex sports scenes, a three-dimensional ball tracking method based on small object detection and multi-view fusion is proposed. The proposed method consists of four parts: 2D ball detection, 2D ball tracking, 3D position fusion, and 3D trajectory smoothing. In the 2D ball detection stage, a ball detection network based on multi-scale deep features is proposed, which reduces ball detection failures. In the 2D ball tracking stage, tracking drift due to occlusion is largely alleviated by introducing cross-view information based on epipolar constraints together with a proposed detection-based model update strategy. Afterwards, a triangulation algorithm is adopted to fuse the 2D coordinates of the ball from multiple camera views into 3D space. Finally, the nonlinear motion of the ball is simplified so that a Kalman filter can be applied to ensure a smooth ball trajectory. On a public basketball dataset, the 2D and 3D tracking accuracy of the proposed ball tracking method are improved to 0.81 and 0.92, respectively.

(2) To solve the problem of identity switches, which often occur when two players come close to each other, this dissertation proposes a 3D player tracking method based on an improved k-shortest paths algorithm and a player similarity metric. Following a batch tracking pattern, the proposed method takes as observations the probability occupancy map obtained by integrating the player locations from multiple camera views, and uses them to construct a spatio-temporal network flow graph. The player tracking problem is then converted into the problem of finding all the shortest paths in the network flow graph. In this work, player identity cues (jersey color and jersey number) are introduced to adjust the edge weights of the flow graph, so that identity switches caused by players approaching each other are well handled. Experiments indicate that, given about 70% of the jersey color information and around 50% of the jersey number information, the 3D player tracking results of the proposed method are significantly improved compared with the original method, which neglects identity cues.

(3) To deal with player tracking failures caused by large posture changes and similar appearance interference in complex sports scenes, this dissertation proposes a 2D player tracking method based on pose-aligned deep features and a graph convolutional neural network. On the one hand, large posture variation usually means that the individual feature of a player contains considerable background noise, so a pose-aligned individual player feature is introduced to obtain a more accurate feature representation. On the other hand, to handle similar appearance interference between players, a novel contextual relationship graph is constructed and a graph convolutional neural network is adopted to integrate the information of adjacent players. The proposed graph model can learn a more robust similarity metric than one calculated directly from individual features. Experiments show that, given the same player detections, the proposed method clearly outperforms other related methods in 2D player tracking.

(4) On the basis of the above 2D player tracking, a 3D player tracking method based on cross-view association matching, which can also achieve 3D pose estimation, is further proposed. To overcome the influence of occlusion and camera calibration error when cross-view matching is done within a single frame, a more robust multi-frame cross-view geometric similarity measurement method is proposed. Meanwhile, to handle the interference caused by similar appearance between players, a novel appearance similarity metric based on a graph model is also proposed. The graph model is built with the players in each camera as nodes, deep appearance features as node attributes, and the connections across views as edges. The cross-view player similarity metric learned by the graph convolutional network is more discriminative than the original appearance similarity calculated with a simple cosine distance. Experiments indicate that combining the two proposed similarity metrics significantly improves the results of 3D player tracking and 3D pose estimation.
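The 3D position fusion step in contribution (1) relies on triangulating the 2D ball coordinates seen by several cameras. The abstract does not say which triangulation variant the dissertation uses, so the following is only a minimal sketch of one common choice: linear least-squares triangulation from calibrated cameras. It assumes each camera is described by a known 3x4 projection matrix; the function names `project`, `solve3` and `triangulate` and the toy cameras are hypothetical and chosen for illustration only.

```python
def project(P, X):
    """Project a 3D point X = (x, y, z) to a pixel (u, v) with a 3x4 camera matrix P."""
    xh = (X[0], X[1], X[2], 1.0)
    u, v, w = (sum(p * c for p, c in zip(row, xh)) for row in P)
    return u / w, v / w


def solve3(M, b):
    """Solve the 3x3 system M x = b by Gaussian elimination with partial pivoting."""
    A = [list(M[i]) + [b[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("degenerate camera configuration")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 4):
                A[r][c] -= f * A[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        s = A[i][3] - sum(A[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / A[i][i]
    return x


def triangulate(cameras, pixels):
    """Least-squares 3D point from two or more views.

    Each view with matrix P and pixel (u, v) contributes two linear
    equations in (x, y, z): (u * P[2] - P[0]) . X = 0 and
    (v * P[2] - P[1]) . X = 0, where X = (x, y, z, 1).
    """
    rows, rhs = [], []
    for P, (u, v) in zip(cameras, pixels):
        for coord, k in ((u, 0), (v, 1)):
            r = [coord * P[2][j] - P[k][j] for j in range(4)]
            rows.append(r[:3])
            rhs.append(-r[3])
    # Normal equations (A^T A) x = A^T b of the stacked system.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(3)]
    return solve3(AtA, Atb)


if __name__ == "__main__":
    # Two toy cameras (assumed, not from the dissertation): unit focal
    # length, same orientation, the second one shifted 1 unit along x.
    cam_a = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    cam_b = [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]]
    ball = (0.5, 0.2, 4.0)
    observations = [project(cam_a, ball), project(cam_b, ball)]
    print(triangulate([cam_a, cam_b], observations))
```

With noise-free observations the recovered point matches the original ball position up to floating-point error. With noisy detections the same system gives a least-squares estimate, which is why a smoothing stage such as the Kalman filter mentioned in contribution (1) is still useful afterwards.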
Keywords/Search Tags: Sports video analysis, Visual object detection, Visual object tracking, Single object tracking, Multiple object tracking, Multi-camera multi-object tracking, Pose estimation, Graph convolutional network