| With the continuous development of modern technology, many fields of production and daily life constantly generate data streams, for example in traffic monitoring, sensor networks, and network traffic detection. How to extract valuable information from data streams has attracted widespread attention. Data streams arrive in real time, at high speed, and in massive volume, so many algorithms designed for static datasets are not applicable in a data stream environment. Moreover, because data streams are dynamic and unstable, they can exhibit class imbalance and concept drift. Some traditional data stream algorithms for imbalanced classification and concept drift detection overlook memory usage in their design and require a large amount of memory to store historical samples. Given the limits of computer hardware, this can become a bottleneck that constrains algorithm performance. Therefore, this paper proposes two memory-friendly storage and classification algorithms, one for class imbalance and one for concept drift in data streams. The main contributions are as follows: (1) A storage and classification method for imbalanced data streams based on an ensemble of Online Sequential Extreme Learning Machines (OS-ELM) is proposed. A fixed-size matrix stores the feature information of historical minority-class samples, so only a small, constant amount of memory is required for the feature matrix. At the same time, the algorithm improves classification performance by applying random undersampling without replacement to the majority class and by using ensemble methods to construct the classification model. In the experiments, the algorithm is compared with several mainstream algorithms on multiple datasets, and it requires only 0.8906 KB of additional memory on all of them. Theoretical analysis and experimental results demonstrate the effectiveness of the algorithm. (2) A storage and classification algorithm for concept-drifting data streams based on OS-ELM is proposed. First, the algorithm is initialized by computing a feature matrix from the samples that arrive between the warning level and the drift level; this matrix stores the feature information of those samples. When needed, the classifier is retrained by accessing the feature matrix saved at the corresponding time point. While improving classification accuracy, the algorithm requires only a small, constant amount of memory to store the feature information of historical data. It thereby addresses the memory usage problem caused by the continuous storage of samples in the Drift Detection Method (DDM) and its derivative algorithms. Comparative experiments on both artificial and real-world data streams confirm the effectiveness of the proposed algorithm and show that the required memory is significantly reduced. |
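
The abstract does not give the exact form of the fixed-size feature matrix, so the following is only a minimal sketch of how such a store could work, not the paper's implementation. It assumes the matrix holds the accumulated products H^T H and H^T T of the ELM hidden-layer outputs H and targets T, which is a standard way to summarize samples for a least-squares output layer. The class name `FixedSizeFeatureStore`, the sigmoid activation, and the `ridge` parameter are illustrative choices that do not come from the source.

```python
import numpy as np


class FixedSizeFeatureStore:
    """Hypothetical store: summarizes samples in H^T H and H^T T.

    Memory is O(L^2 + L*m) for L hidden nodes and m outputs,
    independent of how many samples have been absorbed.
    """

    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = np.random.default_rng(seed)
        # Random, fixed hidden-layer parameters, as in an ELM.
        self.W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)
        # Fixed-size accumulators (the assumed "feature matrix").
        self.K = np.zeros((n_hidden, n_hidden))   # sum of H^T H
        self.C = np.zeros((n_hidden, n_outputs))  # sum of H^T T

    def hidden(self, X):
        # Sigmoid hidden-layer output for a batch of samples.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def update(self, X, T):
        # Absorb a chunk of samples; the raw samples can then be discarded.
        H = self.hidden(X)
        self.K += H.T @ H
        self.C += H.T @ T

    def solve(self, ridge=1e-3):
        # Retrain the output weights from the stored summaries only.
        L = self.K.shape[0]
        return np.linalg.solve(self.K + ridge * np.eye(L), self.C)

    def nbytes(self):
        # Memory held by the accumulators, in bytes.
        return self.K.nbytes + self.C.nbytes
```

Under these assumptions, feeding the same samples chunk by chunk or all at once yields the same accumulators, and `nbytes()` stays constant as the stream grows. That is the property the abstract relies on when it retrains a classifier from a saved feature matrix instead of from stored samples.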