| Crowd video analysis is an important research topic in intelligent video surveillance. It is involved in many applications, such as public security, city and traffic planning, and public space management. With rapid urbanization, the population density of cities is increasing, and collective behaviors are therefore frequently seen in daily life. The rapid development of society demands effective crowd event detection and high-level city management. To address these needs, computer vision technologies can be used to analyze crowd videos, understand crowd behaviors, and facilitate the emergency response of public security. In this thesis, we propose different crowd motion descriptors based on low-level motion characteristics for differently structured crowd scenes by leveraging techniques from computer vision, motion dynamics, and vector fields. Based on the proposed crowd motion descriptors, we further use statistical analysis, machine learning, and learning to rank to identify and understand crowd behaviors. Our work contributes in both theory and application to public security management and intelligent video surveillance. Firstly, for structured crowd scenes, we propose a trajectory-based motion structure coding algorithm to extract crowd motion descriptors. Unlike existing crowd motion features, our method uses local motion dynamics (curl and divergence) and global motion structures (tangential and radial trajectories) simultaneously. By using motion integration, we obtain CDT (Curl and Divergence of motion Trajectories) motion descriptors. Moreover, we show that the proposed motion features are scale- and rotation-invariant, a favorable property that existing crowd motion features lack. To the best of our knowledge, this is the first attempt to exploit curl and divergence to represent low-level motion characteristics and to use both tangential and radial trajectories to represent motion structures. Motion 
integration is the essential step in this algorithm: based on it, we obtain global motion features that are independent of scale and orientation. Secondly, we propose a system framework for crowd motion segmentation and behavior recognition. To segment crowd motion patterns and handle regions where multiple motion patterns overlap, we propose a method that combines motion clustering and motion decomposition. We then use the trajectory-based motion structure coding algorithm to extract CDT motion descriptors. Finally, SVM classifiers are trained and used to recognize crowd behaviors. To the best of our knowledge, this is the first attempt to raise the problem of motion decomposition and propose an approach to solve it. In sum, based on the proposed motion segmentation and effective motion representation, we achieve better performance on crowd behavior classification than related methods. Thirdly, we propose a crowd video retrieval algorithm based on hand-drawn motion sketches. To help users retrieve crowd videos with desired motion patterns, we use hand-drawn motion sketches as queries, which let people freely express what they want. This kind of retrieval, from motion sketches to crowd videos, has not been seen in existing studies. The key issue of hand-drawn motion sketch based crowd video retrieval is a matching problem between heterogeneous data. To address this problem, we transform motion sketches into vector fields, and then use the trajectory-based motion structure coding algorithm to extract CDT features. In this fashion, we map motion sketches and crowd videos into a common feature space. For the matching itself, we propose a multi-distance fusion strategy combined with Ranking SVM to learn a ranking model, which is used to rank motion patterns in crowd videos. Our retrieval algorithm performs favorably against related methods. Finally, we propose a bilinear CD (Curl and Divergence) feature to represent crowd 
motions in unstructured scenes, and apply it to crowd video classification and retrieval. Specifically, we first compute the curl and divergence maps from a normalized average motion vector field. Then, we use a parameterized sigmoid function to obtain the curl activation map and the divergence activation map. The bilinear CD vector is computed as the product of the patches cropped from the two activation maps. By using a sliding window, we obtain a large number of local bilinear CD vectors. To represent different crowd videos in a unified fashion, Fisher vector pooling and PCA are employed for feature coding and dimension reduction, respectively. Extensive experimental results show that the proposed bilinear CD feature improves the performance of crowd video classification and retrieval by a noticeable margin. In sum, this thesis focuses on crowd video analysis and presents novel algorithms based on low-level crowd motion characteristics (curl and divergence). For structured crowd scenes, we explore the spatial structure information of motion patterns and propose CDT descriptors, which are applied to crowd behavior recognition and hand-drawn motion sketch based crowd video retrieval. For unstructured crowd scenes, we propose bilinear CD features and construct video descriptors, which are used for crowd video classification and retrieval. From simple, structured scenes to complex, unstructured scenes, and from crowd motion representation to crowd behavior analysis and applications, we conduct systematic research and make contributions in both theory and application. |
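
The bilinear CD steps described above (curl and divergence maps, sigmoid activation, sliding-window patch products) can be illustrated with a minimal sketch. It is not the thesis implementation. The function names, the sigmoid parameters `alpha` and `beta`, the `patch` and `stride` arguments, and the finite-difference scheme are all assumptions made for illustration. In particular, the "product of the patches" is interpreted here as the outer product of the two flattened patches, which is only one possible reading. The input field is assumed to be already averaged and normalized, and the Fisher vector pooling and PCA stages are omitted.

```python
import math


def curl_and_divergence(u, v):
    """Curl and divergence of a 2-D field on a unit grid (needs at least 2x2 cells).

    u[y][x] and v[y][x] hold the x- and y-components of the motion vector.
    Interior points use central differences, borders use one-sided differences.
    """
    h, w = len(u), len(u[0])

    def ddx(f, y, x):
        x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
        return (f[y][x1] - f[y][x0]) / (x1 - x0)

    def ddy(f, y, x):
        y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
        return (f[y1][x] - f[y0][x]) / (y1 - y0)

    # curl = dv/dx - du/dy, divergence = du/dx + dv/dy
    curl = [[ddx(v, y, x) - ddy(u, y, x) for x in range(w)] for y in range(h)]
    div = [[ddx(u, y, x) + ddy(v, y, x) for x in range(w)] for y in range(h)]
    return curl, div


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_map(m, alpha=1.0, beta=0.0):
    """Activation map; alpha (slope) and beta (shift) are hypothetical parameters."""
    return [[_sigmoid(alpha * (val - beta)) for val in row] for row in m]


def bilinear_cd_vectors(curl_act, div_act, patch=2, stride=1):
    """Slide a window over both activation maps and emit one vector per position.

    Assumption: each vector is the flattened outer product of the curl patch
    and the divergence patch, so its length is patch**4.
    """
    h, w = len(curl_act), len(curl_act[0])
    vectors = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            c = [curl_act[y + i][x + j] for i in range(patch) for j in range(patch)]
            d = [div_act[y + i][x + j] for i in range(patch) for j in range(patch)]
            vectors.append([ci * dj for ci in c for dj in d])
    return vectors


if __name__ == "__main__":
    n = 4
    # Pure rotation about the origin: u = -y, v = x.
    u = [[-float(y) for x in range(n)] for y in range(n)]
    v = [[float(x) for x in range(n)] for y in range(n)]
    curl, div = curl_and_divergence(u, v)
    vecs = bilinear_cd_vectors(sigmoid_map(curl), sigmoid_map(div))
    print(len(vecs), len(vecs[0]))
```

For the pure rotation field used in the usage example, the curl map is constant and the divergence map is zero, so every local vector has identical entries. A real crowd scene would mix both quantities, and the resulting local vectors would then be pooled into a single video descriptor.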