| Advances in big data and mobile communications have driven progress in intelligent transportation, and a growing number of researchers now work on trajectory data mining. Upgraded mobile hardware supplies large volumes of spatio-temporal trajectory data for this work, but the quality of these data is often neglected, and results tend to be unreliable when the data are fed into experiments without prior analysis. Data must first be analyzed and evaluated systematically and then cleaned in a targeted way, dispelling the illusion that the data are intact, so that subsequent research is meaningful. This thesis analyzes quality problems in trajectory data, which fall into three main categories: duplicate data, noisy data, and missing data. For each category, it establishes specific models and algorithms for data cleaning or repair to address the quality problems of current vehicle trajectory data sets, and it establishes a general Vehicle trajectory Data Evaluation Model (VDEM). The main research work of this thesis is as follows: (1) To address the low efficiency of the existing sorted neighborhood method (SNM) for duplicate detection, a duplicate data cleaning model based on SNMW is proposed. Duplicate data are divided into levels, and the window size is adjusted dynamically according to the extracted spatio-temporal trajectory features and the data level. Combined with a weighted similarity measure, this overcomes the fixed window size of the SNM algorithm and improves the efficiency of duplicate detection. (2) To address the current lack of research on noise cleaning for vehicle trajectory data sets, a noise cleaning model based on a dual-filter joint algorithm (Kalman filter-Savitzky-Golay, K-S-G for short) is proposed. It fuses the Kalman filter with the Savitzky-Golay smoothing filter used in spectroscopic imaging to fit the original vehicle trajectory, and the fitted trajectory makes it possible to filter out noise spikes that are otherwise hard to find in the data set. Comparison experiments verify that the K-S-G algorithm gives the best fit. (3) To address the completion of missing vehicle trajectory data when the road network is unknown, a road-interpolation data completion model based on the S-G smoothing filter is proposed. The model treats the vehicle trajectory fitted by the Savitzky-Golay smoothing filter as a simulated road, combines it with the laws of kinematics to build a mathematical model, and then adjusts the data, which improves the accuracy of data completion. (4) To comprehensively evaluate the accuracy and completeness of vehicle trajectory data and form an effective quality evaluation system, the general quality evaluation model VDEM for vehicle trajectory data sets is established. Evaluation indicators are set for different trajectory data rules, and the data before and after cleaning are evaluated separately, verifying the effectiveness of the data cleaning models. |
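
Item (2) relies on Savitzky-Golay smoothing to fit a trajectory and expose noise spikes. As a rough illustration only, and not the thesis's K-S-G model (the Kalman filter stage is omitted entirely), the Python sketch below smooths one coordinate of a trajectory with the standard 5-point quadratic Savitzky-Golay kernel and flags samples that deviate strongly from the fitted curve. The function names `savgol5` and `flag_noise`, the residual-threshold rule, and the sample values are all hypothetical choices made for this example.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing coefficients.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(values):
    """Smooth a 1-D sequence with the 5-point Savitzky-Golay kernel.

    Simplification (assumption): the two samples at each end are copied
    unchanged instead of being fitted with a one-sided polynomial.
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        out[i] = sum(c * values[i + k - 2] for k, c in enumerate(SG5))
    return out


def flag_noise(values, threshold):
    """Return indices whose distance from the smoothed curve exceeds threshold.

    The thresholding rule is an assumption for illustration; the abstract
    does not say how the fitted trajectory is used to pick out noise.
    """
    smooth = savgol5(values)
    return [i for i, (v, s) in enumerate(zip(values, smooth)) if abs(v - s) > threshold]


# Hypothetical coordinate series: a straight line with one spike at index 4.
xs = [0, 1, 2, 3, 14, 5, 6, 7, 8]
print(flag_noise(xs, threshold=4.0))  # [4]
```

In this example the spike also pulls the smoothed values at its neighbors (their residuals are about 3.4), so the threshold has to sit between the neighbor residuals and the spike residual. A plain smoothing filter leaks outliers into nearby points in this way, which is why the threshold needs care in a sketch like this one.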