| Network traffic data, among the most common data in computer and communication networks, varies over time and is therefore usually represented as time series. These data are driven by human activity and typically exhibit characteristics that mirror human behavioral patterns. To exploit the business value of large-scale network traffic data, data mining techniques are needed to extract the useful information the data contain. Time series anomaly detection is a key data mining technique whose main objective is to accurately identify data points that do not conform to the normal behavior of the series. Many time series anomaly detection models have been proposed in academia and industry, but existing algorithms generally suffer from the following problems: (1) the models rely on assumed data distributions; (2) model training lacks ground-truth labels; (3) the models are sensitive to parameter tuning and have high algorithmic complexity. To address these problems, this paper focuses on large-scale network traffic time series data and proposes two algorithms: an anomaly detection algorithm based on network traffic time series (AD-NTTS) and a change point detection algorithm based on density change of binary sequence (CPD-DCOBS). Both algorithms avoid assumptions about the data distribution and the use of data labels, and have better accuracy and robustness. In this paper, we first propose an anomaly detection algorithm based on network traffic time series, namely the AD-NTTS algorithm. The algorithm designs a process-oriented, modular anomaly detection framework based on the data characteristics of network traffic time series. The framework consists of three components: a data preprocessing component, a time series classification component, and an anomaly detection component. Specifically, the AD-NTTS algorithm first preprocesses the input time series using the data preprocessing component to unify the series and at the
same time improve the accuracy of the subsequent classification. Then, the time series classification component quickly identifies the category of each sequence, so that sequences assigned to the same category share similar data characteristics. Finally, the anomaly detection algorithm corresponding to the category of the sequence is selected for outlier detection. The AD-NTTS algorithm thus avoids applying a single model to time series with different patterns, which makes the model difficult to train and degrades its performance, and it achieves high detection accuracy. To further improve the accuracy and robustness of time series change point detection, this paper also proposes a change point detection method based on density change of binary sequence, i.e., the CPD-DCOBS algorithm. The algorithm determines, from a novel perspective, whether the mechanism generating the data has changed between the earlier and later parts of the sequence. Specifically, the CPD-DCOBS algorithm first converts the original time series into a binary sequence of 0s and 1s, and then decides whether the data has changed from the difference in the density of 1s between the windows before and after a point in the binary sequence. Because the algorithm analyzes the sequence in binary form, the influence of noise and outliers on the model is reduced and robustness improves. The algorithm also requires no complex modeling of the data and has high detection efficiency. |
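
The binary-density idea behind CPD-DCOBS can be illustrated with a minimal sketch. The abstract does not specify how the series is binarized, how the windows are sized, or how the density difference is turned into a decision, so everything concrete here is an assumption made for illustration: the median-threshold rule in `binarize`, the fixed `window` length, the `threshold` value, and all function names are hypothetical and are not taken from the paper.

```python
from statistics import median


def binarize(series):
    """Map each value to 1 if it exceeds the series median, else 0.

    Assumption: the paper's actual binarization rule is not given in the
    abstract; a median threshold is used here purely for illustration.
    """
    m = median(series)
    return [1 if x > m else 0 for x in series]


def density_differences(bits, window):
    """Return {t: |density of 1s in bits[t:t+window] - density in bits[t-window:t]|}."""
    diffs = {}
    for t in range(window, len(bits) - window + 1):
        before = sum(bits[t - window:t]) / window
        after = sum(bits[t:t + window]) / window
        diffs[t] = abs(after - before)
    return diffs


def detect_change_point(series, window=5, threshold=0.8):
    """Return the index with the largest density difference, or None.

    Assumption: a single change point and a fixed decision threshold;
    both are hypothetical simplifications.
    """
    diffs = density_differences(binarize(series), window)
    if not diffs:
        return None  # series too short for two full windows
    t_best = max(diffs, key=diffs.get)
    return t_best if diffs[t_best] >= threshold else None


# Example: the level shifts from about 10 to about 30 at index 10.
traffic = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
           30, 31, 29, 30, 32, 30, 29, 31, 30, 30]
change_at = detect_change_point(traffic)
```

In this sketch the decision depends only on counts of 0s and 1s inside each window, so the magnitude of an individual value plays no role once the series has been binarized. This is one way to read the abstract's robustness claim, although the threshold used for binarization can still be affected by the data.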