Research On The Detection Of Concept Drift In Time Series Based On Manifold Space

Posted on: 2024-06-08 | Degree: Master | Type: Thesis
Country: China | Candidate: S S Wang | Full Text: PDF
GTID: 2530307058477824 | Subject: Computer Science and Technology
Abstract/Summary:
Advances in information technology, and in data acquisition and storage in particular, have made it possible to preserve large-scale time series data, which provides the basis for data analysis and mining. Real-world time series, however, are subject to concept drift: changes in the internal and external environment cause the statistical distribution of the data to change over time. This phenomenon is found in a wide range of real-world application scenarios. Concept drift in time series stems from the non-stationarity of the series, and it degrades the performance of machine learning models, sometimes to the point of failure, and so reduces their generalizability. Detecting concept drift in time series data promptly and accurately has therefore become an active research topic in machine learning. This thesis focuses on concept drift detection in time series and proposes two different types of methods that detect drift in manifold space. The main research work is as follows:

1) To address concept drift detection in time series, this thesis proposes a drift detection method based on the Riemannian distance. The method first trains a prediction model on pre-processed data and uses it to make continuous predictions on the time series. Statistical process control is then used to track how the distance of the prediction error distribution changes in the manifold space, and this determines whether concept drift has occurred. Experimental results show that the method detects concept drift effectively on both synthetic and real data sets.

2) Drift detection methods based on model prediction errors lose detection performance when the model overfits. To address this problem, a method is proposed that identifies concept drift from the change in the distribution of statistical features of the time series in the manifold space. A feature covariance matrix is chosen as the reference for a known distribution, and the evolution of the distance in the manifold space between the current feature covariance matrix and this reference matrix is monitored, so that the occurrence of concept drift can be determined quickly and accurately. Experiments on synthetic and real data verify that the method improves concept drift detection performance compared with existing methods.

This work introduces information geometry into machine learning models and proposes two types of detection methods, one based on prediction model error and one based on data distribution. Both monitor distribution changes of data features in the manifold space using techniques such as the Riemannian metric and statistical process control, and both achieve better results. This research can provide new ideas for the problem of detecting concept drift in time series.
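The distribution-based idea (monitoring the distance from each window's feature covariance matrix to a reference matrix) can be illustrated with a minimal sketch. The abstract does not specify the metric, the features, or the control limits, so everything concrete here is an assumption: the affine-invariant Riemannian distance between symmetric positive-definite matrices, the per-window covariance of the channels of a multivariate series as the "feature covariance matrix", non-overlapping windows, and a simple mean-plus-k-sigma control limit. The function names `riemann_distance`, `window_covariances`, and `first_drift_window` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def riemann_distance(a, b):
    """Affine-invariant Riemannian distance between two SPD matrices.

    Assumed metric: sqrt(sum(log(lam_i)^2)), where lam_i are the
    eigenvalues of a^{-1} b, computed here via a Cholesky factor of a.
    """
    chol = np.linalg.cholesky(a)              # a = chol @ chol.T
    x = np.linalg.solve(chol, b)              # chol^{-1} b
    m = np.linalg.solve(chol, x.T).T          # chol^{-1} b chol^{-T}
    m = (m + m.T) / 2.0                       # remove round-off asymmetry
    eig = np.linalg.eigvalsh(m)
    return float(np.sqrt(np.sum(np.log(eig) ** 2)))


def window_covariances(series, window):
    """Covariance matrix of each non-overlapping window.

    `series` has shape (n_samples, n_channels) with n_channels >= 2.
    A small ridge keeps every matrix positive definite (an assumption).
    """
    covs = []
    for i in range(len(series) // window):
        chunk = series[i * window:(i + 1) * window]
        c = np.cov(chunk, rowvar=False)
        covs.append(c + 1e-6 * np.eye(c.shape[0]))
    return covs


def first_drift_window(series, window, n_calib=5, k=3.0):
    """Index of the first window flagged as drifted, or None.

    Window 0 is the reference. The next `n_calib` windows calibrate a
    mean + k * std control limit (an assumed, simple SPC rule); later
    windows are flagged when their distance exceeds that limit.
    """
    covs = window_covariances(series, window)
    ref = covs[0]
    dists = [riemann_distance(ref, c) for c in covs[1:]]
    calib = np.array(dists[:n_calib])
    limit = calib.mean() + k * calib.std(ddof=1)
    for j, d in enumerate(dists[n_calib:], start=n_calib + 1):
        if d > limit:
            return j
    return None
```

Calling `first_drift_window(series, window=100)` on an array of shape `(n_samples, n_channels)` returns the index of the first window whose covariance lies unusually far from the reference. The prediction-error variant described in item 1 could reuse the same distance and control limit on windows of model residuals instead of raw features, although the abstract does not say how the thesis constructs those matrices.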
Keywords/Search Tags: Time Series, Concept Drift, Manifold Space, Feature Covariance Matrix, Statistical Process Control