Research On Classification Algorithms And Storage Models For Concept Drift And Imbalanced Data Streams

Posted on:2024-08-09

Degree:Master

Type:Thesis

Country:China

Candidate:C H Tang

Full Text:PDF

GTID:2558307115998789

Subject:Electronic Information (Computer Technology) (Professional Degree)

Abstract/Summary:

With the continuous development of modern technology, many fields of production and daily life constantly generate data streams, such as traffic monitoring, sensor data, and network traffic detection. How to extract valuable information from data streams has attracted widespread attention. Data streams arrive in real time, at high speed, and in massive volume, so many algorithms designed for static datasets are not applicable in the data stream environment. Moreover, because of their dynamic and unstable nature, data streams can exhibit class imbalance and concept drift. Some traditional data stream algorithms for imbalanced classification and concept drift detection overlook memory usage in their design and require a large amount of memory to store historical samples. Given the limits of computer hardware, this can become a bottleneck that constrains algorithm performance. Therefore, this thesis proposes two memory-friendly storage and classification algorithms, one for class imbalance and one for concept drift in data streams. The main contributions are as follows:

(1) A storage and classification method for imbalanced data streams based on an ensemble of Online Sequential Extreme Learning Machines (OS-ELM) is proposed. A fixed-size matrix stores the feature information of historical minority samples, so only a small and constant amount of memory is required to store the feature matrix. At the same time, the algorithm improves its classification performance by performing random undersampling without replacement on the majority class and by using ensemble methods to construct the classification model. In the experiments, the algorithm was compared with several mainstream algorithms on a number of datasets, and it required only 0.8906 KB of additional memory on all of them. Theoretical analysis and experimental results demonstrate the effectiveness of the algorithm.

(2) A storage and classification algorithm for data streams with concept drift, based on OS-ELM, is proposed. The algorithm is first initialized by calculating a feature matrix from the samples that arrive between the warning level and the drift level. This feature matrix stores the feature information of that portion of the samples. When needed, the classifier is retrained by accessing the feature matrix saved at the corresponding time point. While improving classification accuracy, the algorithm requires only a small and constant amount of memory to store the feature information of historical data. It thereby addresses the memory usage problem caused by the continuous storage of samples in the Drift Detection Method (DDM) and its derivative algorithms. Comparative experiments on both artificial and real-world data streams confirm the effectiveness of the proposed algorithm and show that the required memory is significantly reduced.
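To illustrate why an OS-ELM-based learner can keep a small and constant memory footprint, the following is a minimal sketch of the standard OS-ELM recursive least-squares update, not the thesis's own algorithm. The abstract does not define its feature matrix, so the sketch simply keeps the usual fixed-size state: an inverse-correlation matrix `P` and output weights `beta`. The class name `OSELM`, the methods `partial_fit` and `state_bytes`, and the ridge-style initialisation of `P` are assumptions made for this example.

```python
import numpy as np


class OSELM:
    """Illustrative OS-ELM with a sigmoid hidden layer (hypothetical helper class)."""

    def __init__(self, n_inputs, n_hidden, n_outputs, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Random hidden-layer parameters are drawn once and never trained.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # Assumption: start from P = I / ridge and beta = 0, which makes the
        # sequential updates equivalent to batch ridge regression.
        self.P = np.eye(n_hidden) / ridge
        self.beta = np.zeros((n_hidden, n_outputs))

    def _hidden(self, X):
        # Sigmoid activations of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def partial_fit(self, X, T):
        # Fold one chunk into P and beta; the raw samples are not stored.
        H = self._hidden(X)
        K = np.eye(len(X)) + H @ self.P @ H.T
        self.P -= self.P @ H.T @ np.linalg.solve(K, H @ self.P)
        self.beta += self.P @ H.T @ (T - H @ self.beta)

    def predict(self, X):
        return self._hidden(X) @ self.beta

    def state_bytes(self):
        # Size of the learned state; depends on n_hidden, not on stream length.
        return self.P.nbytes + self.beta.nbytes


# Usage: stream synthetic data in chunks and watch the state size stay fixed.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
T = (X[:, :1] + X[:, 1:2] > 0).astype(float)  # binary target as a column vector

model = OSELM(n_inputs=4, n_hidden=10, n_outputs=1)
for start in range(0, len(X), 20):
    model.partial_fit(X[start:start + 20], T[start:start + 20])
    print(start + 20, "samples seen, state bytes:", model.state_bytes())
```

In this sketch the state size is set by the number of hidden nodes alone, so it stays the same however many chunks are processed. That is the general property the abstract relies on when it replaces stored historical samples with a fixed-size matrix.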

Keywords/Search Tags:

Data stream, Class imbalance, Concept drift detection, OS-ELM, Storage

Related items

1	Research On Concept Drift Detection In Data Stream And Classification Algorithms For Imbalanced Data Stream
2	Research On Ensemble Classification Algorithms Of Data Stream Based On Concept Drift
3	Research On Classification Algorithms For Imbalanced Data Stream With Concept Drift
4	Research On Data Stream Classification Method Based On Active Learning And Micro-clusterin
5	Research On Data Stream Classification Method Based On Concept Drift Detection
6	Research On Online Active Learning Algorithms For Multiclass Imbalance Data Stream
7	Research On Classification Algorithms For Data Streams With Concept Drift
8	Research On Classification For Data Streams With Concept Drift
9	The Research On Frequent Pattern Mining Algorithm Of Data Stream Based On Concept Drift Detection
10	Research On Concept Drift Detection And Ensemble Classifier Based On Data Stream