Anomaly Detection Method For Taxi Trajectory Data

Posted on: 2021-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: J Jiang
Full Text: PDF
GTID: 2518306479464994
Subject: Master of Engineering
Abstract/Summary:
Taxis can be regarded as sensors in the urban traffic network. Large-scale GPS data contains hidden information and offers an opportunity to discover knowledge. As an important branch of trajectory data mining, taxi trajectory anomaly detection falls into two categories: detection of GPS trajectory anomalies and detection of urban traffic anomalies. The former aims to find the small number of trajectories that differ significantly from most trajectories; the latter aims to detect anomalous traffic events represented by a set of trajectories, such as vehicle accidents, traffic jams, and large gatherings.

In this thesis, a novel approach to trajectory anomaly detection is proposed. The approach transforms trajectory sequences into pixel images and performs anomaly detection through destination prediction. First, Parameterized Minimum Description Length (PMDL) is introduced to obtain optimal trajectory partitions. Second, the Pixel Representation of Trajectories (PRT) algorithm is proposed, which converts trajectory sequences into pixel images to capture more spatial detail. Then, the important parts extracted from the trajectory images are fed into a convolutional neural network (CNN) for feature extraction and destination prediction. Finally, a trajectory is judged abnormal or normal according to the distance between its predicted destination and its real destination. Extensive experiments on two real trajectory datasets show that the approach detects anomalies effectively.

Because traffic event anomalies take complex forms and anomalies in traffic flow are inconspicuous in both time and space, a spatial-temporal rarity-based method for urban anomaly detection is proposed. First, the city area is divided into grids, and the inflow and outflow of each grid in different time periods are calculated. Then, the concept of regional flow rarity is introduced to model temporal and spatial rarity. Finally, the two kinds of rarity are fused, and an isolation forest model is constructed to detect anomalies. The approach achieves a hit rate of 85% on the real dataset, a significant improvement over existing algorithms.

A large difference between the numbers of normal and abnormal samples causes a data imbalance problem. Imbalanced data leads the model to learn more features of normal samples while ignoring abnormal samples, which are fewer but more important, and this reduces detection accuracy. A framework for solving the imbalance problem is designed that combines three aspects: resampling, data features, and the algorithm. Anomaly data is resampled first, and then local and global features are used to judge the category of each sample. Finally, 2PT is applied to optimize the model. The framework is applied to both trajectory anomaly detection and traffic anomaly detection. Experiments show that the framework effectively addresses data imbalance in anomaly detection and thereby improves detection accuracy. In addition, the proposed framework can be extended to imbalance problems in other related fields.
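
The final decision step of the trajectory approach, judging a trajectory by the distance between its predicted and real destinations, can be illustrated with a minimal sketch. The abstract does not state the distance measure or the cut-off, so the great-circle (haversine) distance, the function names `haversine_km` and `is_anomalous`, and the 2 km default threshold are all assumptions made for illustration, not the thesis's actual settings.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_anomalous(predicted, actual, threshold_km=2.0):
    """Flag a trajectory whose real destination lies farther than
    threshold_km from the predicted one.

    predicted, actual: (lat, lon) tuples in degrees.
    threshold_km: hypothetical cut-off, not a value from the thesis.
    """
    return haversine_km(*predicted, *actual) > threshold_km


# Hypothetical coordinates: the real destination is about 5.6 km
# north of the predicted one, so the trajectory is flagged.
flagged = is_anomalous((39.90, 116.40), (39.95, 116.40))
```

In practice the threshold would presumably be tuned on labelled or held-out trajectories, and the predicted destination would come from the CNN rather than being supplied by hand.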
Keywords/Search Tags:Taxi trajectory data, Anomaly detection, Destination prediction, Spatial-temporal rarity, Data imbalance