| Bus time series data are sequences of values collected by sensors to measure the state of the bus body; they are continuous and consistent in time. The correctness of bus time series data is of great value to research on the various application technologies in the bus. Threshold-based anomaly detection can only find data that fall outside the upper and lower bounds; it cannot find data that stay within the bounds but violate the expected patterns of change. A method for detecting abnormal bus time series data is therefore important, and this thesis studies such a method. According to the characteristics of bus time series data, this thesis proposes an LSTM-based anomaly detection algorithm for features with obvious historical patterns, and an anomaly detection algorithm based on the fusion of isolation forest and LOF (local outlier factor) for data with strong correlation between features. The specific research work is as follows: (1) This thesis analyzes the bus time series data, summarizes the types of anomalies that occur in them, and uses the Pearson correlation coefficient to divide the data into two types: features that show obvious historical patterns over time, and features that are strongly correlated with one another. (2) For the features with historical patterns, an anomaly detection algorithm based on LSTM is proposed. The LSTM is used to learn the existing historical patterns and produce a sequence of predicted values. Taking the absolute value of the difference between each predicted value and the corresponding actual value gives a sequence of differences. After the differences are sorted, outliers are identified with a threshold: a value above the threshold is considered an outlier. Compared with the traditional RNN and one-class SVM algorithms, the proposed algorithm performs better in accuracy, F1, and recall. (3) For the strongly correlated features, the degree of correlation is determined by the Pearson correlation coefficient to form a correlated feature group. Based on the correlated features, an anomaly detection algorithm based on the fusion of isolation forest and LOF is proposed. First, the isolation forest is used to detect abnormal data across all bus time series data sets. The data are divided into correct and abnormal according to the detection results, and the LOF algorithm is then used to screen the data judged correct. The proposed algorithm is tested on data sets alongside a detection algorithm that combines isolation forest with DBSCAN, and the results show that the proposed algorithm performs better in accuracy, F1, and recall. |
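The residual step described in (2) can be illustrated with a minimal sketch. It assumes the predicted values have already been produced by some forecasting model (the LSTM itself is not shown); the function name `flag_by_residual`, the toy values, and the threshold of 1.0 are hypothetical and not taken from the thesis.

```python
from typing import List, Sequence


def flag_by_residual(actual: Sequence[float],
                     predicted: Sequence[float],
                     threshold: float) -> List[int]:
    """Return indices whose absolute prediction error exceeds the threshold."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Absolute difference between each actual value and its predicted value.
    residuals = [abs(a - p) for a, p in zip(actual, predicted)]
    # Rank indices by residual, largest first, then keep those above the threshold.
    ranked = sorted(range(len(residuals)), key=lambda i: residuals[i], reverse=True)
    return [i for i in ranked if residuals[i] > threshold]


# Toy values for illustration only (assumed, not from the thesis).
actual = [10.0, 10.5, 11.0, 25.0, 11.5]
predicted = [10.1, 10.4, 11.2, 11.3, 11.4]
outliers = flag_by_residual(actual, predicted, threshold=1.0)
```

How the threshold is chosen is not stated in the abstract; in this sketch it is simply passed in by the caller.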
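The two-stage screening described in (3) might look roughly like the following sketch, written with scikit-learn's `IsolationForest` and `LocalOutlierFactor`. Several points are assumptions rather than details from the thesis: the function name `cascade_detect`, the default parameters, the synthetic data, and the choice to report the union of the rows flagged by either stage.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def cascade_detect(X: np.ndarray, n_neighbors: int = 20, seed: int = 0) -> np.ndarray:
    """Return a boolean mask that is True where a row is flagged by either stage."""
    # Stage 1: isolation forest over all rows; fit_predict returns -1 for outliers.
    forest = IsolationForest(random_state=seed)
    abnormal = forest.fit_predict(X) == -1

    # Stage 2: LOF re-screens only the rows that stage 1 judged normal.
    normal_idx = np.flatnonzero(~abnormal)
    if len(normal_idx) > 1:
        # LOF needs fewer neighbors than samples, so cap k accordingly.
        k = min(n_neighbors, len(normal_idx) - 1)
        lof = LocalOutlierFactor(n_neighbors=k)
        lof_flags = lof.fit_predict(X[normal_idx]) == -1
        abnormal[normal_idx[lof_flags]] = True
    return abnormal


# Synthetic data for illustration only: one cluster plus one distant row.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)), [[50.0, 50.0]]])
mask = cascade_detect(X)
```

In practice `X` would hold the columns of one correlated feature group, so that both detectors see the features whose joint behavior is expected to be consistent.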