Research On Detection Method Of Travel Time Outliers Based On License Plate Recognition Data

Posted on: 2021-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Wang
Full Text: PDF
GTID: 2492306470486944
Subject: Computer Science and Technology
Abstract/Summary:
Historical vehicle travel time data captures traffic characteristics that recur at specific times and places, which makes it valuable for research on and application of travel time prediction. Such data is generally obtained through a collection system, and the resulting samples are mixed with outliers that do not reflect the real traffic state, which reduces the accuracy and effectiveness of travel time prediction research. Based on license plate recognition data, this thesis studies the inherent spatial-temporal distribution characteristics of travel time outliers and implements the detection of vehicle travel time outliers. The main work is as follows:

(1) Data preprocessing algorithms are studied for two kinds of dirty data: missing data and duplicate data.

(2) An outlier detection algorithm based on a log-normal distribution mixture model is proposed. Travel time samples accumulated over a certain period are grouped by discrete time period of the day. For each group, a log-normal distribution mixture model is used for cluster analysis from the perspective of density distribution. The long right tail of the outlier distribution and the R-squared index are used to determine the optimal number of clusters dynamically, which addresses the difficulty of choosing the number of clusters caused by the heterogeneity of travel time. Experiments show that, compared with detection based on distance clustering, the algorithm detects outliers more effectively and also yields, for each time period, the threshold that separates valid data from outliers.

(3) A method for calculating the travel time anomaly threshold based on L2-constrained least squares is proposed. The method builds on two observations: the anomaly threshold obtained by the mixture model fluctuates with time, and the anomaly threshold trends of different data samples on the same road are similar in shape. Random sampling is applied to the travel time samples to construct the fluctuation interval of the anomaly threshold in each time period, and regression analysis is then used to obtain a fitted curve that reflects the trend of the anomaly threshold. On this basis, the Pearson correlation coefficient is used to measure the similarity of the threshold curves fitted to different data samples, which verifies that the curves are highly similar. Experiments show that the fitted anomaly threshold curve gives the threshold separating valid data from outliers at any time, and so can be used to detect outliers. The fitted curve can also identify outliers in other data samples, which improves the efficiency of outlier detection and is especially useful for samples whose quality is too poor for the log-normal distribution mixture model to detect outliers.
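The mixture-model idea in item (2) can be illustrated with a minimal sketch. It is not the thesis's algorithm: it fixes the number of components at two (one for normal traffic, one for the outlier tail) instead of choosing it with the R-squared index, it starts the outlier component on the slowest 10% of trips, and it takes the threshold to be the travel time at which the outlier component becomes the more probable one. The function names and the synthetic data are hypothetical.

```python
import math
import random

def normal_pdf(v, mu, var):
    return math.exp(-(v - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_lognormals(times, iters=100, tail=0.10):
    """EM for a two-component log-normal mixture, i.e. a Gaussian mixture on log(time)."""
    x = sorted(math.log(t) for t in times)
    n = len(x)
    cut = int(n * (1 - tail))
    # Assumed initialisation: component 1 starts on the slowest `tail` fraction of trips.
    groups = [x[:cut], x[cut:]]
    w = [len(g) / n for g in groups]
    mu = [sum(g) / len(g) for g in groups]
    var = [max(sum((v - m) ** 2 for v in g) / len(g), 1e-6) for g, m in zip(groups, mu)]
    for _ in range(iters):
        # E-step: posterior probability of each component for every observation.
        resp = []
        for v in x:
            p = [w[j] * normal_pdf(v, mu[j], var[j]) for j in range(2)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances from the posteriors.
        for j in range(2):
            nj = sum(r[j] for r in resp) + 1e-12
            w[j] = nj / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var[j] = max(sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, x)) / nj, 1e-6)
    return w, mu, var

def outlier_threshold(w, mu, var):
    """Travel time at which the outlier component's posterior reaches 0.5 (bisection)."""
    def posterior_outlier(v):
        a = w[0] * normal_pdf(v, mu[0], var[0])
        b = w[1] * normal_pdf(v, mu[1], var[1])
        return b / (a + b)
    lo, hi = mu[0], mu[1]
    for _ in range(60):
        mid = (lo + hi) / 2
        if posterior_outlier(mid) >= 0.5:
            hi = mid
        else:
            lo = mid
    return math.exp(hi)

# Synthetic travel times in seconds for one time period (illustrative only).
rng = random.Random(0)
times = [rng.lognormvariate(math.log(120), 0.15) for _ in range(950)]
times += [rng.lognormvariate(math.log(600), 0.30) for _ in range(50)]
threshold = outlier_threshold(*fit_two_lognormals(times))
outliers = [t for t in times if t > threshold]
```

Running the fit once per time-of-day group gives one threshold per period, which is the input that the threshold curve in item (3) is fitted to.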
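The threshold fitting in item (3) can be sketched in the same spirit. The abstract does not state the basis functions or the regularisation strength, so the cubic polynomial basis, the value of `lam`, and the synthetic per-hour thresholds below are assumptions, and the L2 constraint is written in its equivalent penalised (ridge) form. The helper names are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ridge_polyfit(t, y, degree=3, lam=1e-3):
    """Minimise sum((y - X b)^2) + lam * ||b||^2 via (X^T X + lam I) b = X^T y."""
    X = [[ti ** d for d in range(degree + 1)] for ti in t]
    m = degree + 1
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0) for j in range(m)]
         for i in range(m)]
    rhs = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(m)]
    return solve(A, rhs)

def predict(coef, ti):
    return sum(c * ti ** d for d, c in enumerate(coef))

def pearson(u, v):
    mu_u, mu_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu_u) ** 2 for a in u) * sum((b - mu_v) ** 2 for b in v))

# Synthetic per-hour anomaly thresholds for two data samples (illustrative only);
# time of day is scaled to [0, 1) to keep the polynomial basis well conditioned.
hours = [h / 24 for h in range(24)]
sample_a = [200 + 60 * math.sin(2 * math.pi * h) for h in hours]
sample_b = [220 + 50 * math.sin(2 * math.pi * h) + 5 * math.cos(6 * math.pi * h) for h in hours]
coef_a = ridge_polyfit(hours, sample_a)
coef_b = ridge_polyfit(hours, sample_b)
curve_a = [predict(coef_a, h) for h in hours]
curve_b = [predict(coef_b, h) for h in hours]
similarity = pearson(curve_a, curve_b)
```

Once fitted, `predict(coef_a, h)` returns a threshold for any time of day, so a trip can be flagged without refitting the mixture model for its period.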
Keywords/Search Tags: Travel time, Outlier detection, Log-normal distribution mixture model, Density distribution, Cluster analysis, Anomaly threshold, Regression analysis