Multi-source Online Transfer Learning Method And Application For Imbalanced Data

Posted on:2023-03-26

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Zhou

GTID:2568306794955109

Subject:Software engineering

Abstract/Summary:

Multi-source online transfer learning uses labeled data from multiple source domains to improve classification performance in a target domain whose data arrive online. By dynamically adjusting the weights of the source and target domains, each domain can be exploited adaptively, which gives good generalization performance and high learning efficiency. In many real-world scenarios, however, the data are imbalanced, and misclassifying minority-class samples can cause significant losses. To address this practical problem, this thesis proposes new multi-source online transfer learning algorithms for imbalanced data. The specific research work is as follows.

First, this thesis proposes a multi-source online transfer learning algorithm that oversamples the target domain. For the samples of the current batch, the algorithm finds the k-nearest neighbors among the samples of the previous batch; it first generates a small number of majority-class samples and then generates minority-class samples to balance the class distribution of the current batch. In each batch, the synthetic and real samples together train the target-domain function, which improves its classification performance. Oversampling methods are designed for both the input space and the feature space of the target domain, and comprehensive experiments on multiple real-world data sets demonstrate the effectiveness of the proposed algorithm.

Second, to handle imbalanced data in both the source domains and the target domain, this thesis further proposes a multi-source online transfer learning algorithm that oversamples in the feature space of the source and target domains. The algorithm consists of two parts: oversampling the multiple source domains and oversampling the online target domain. In the source-domain stage, minority-class samples are generated by oversampling in the feature space of the support vector machine classifier; the new samples are obtained by augmenting the original Gram matrix using neighborhood information in the source-domain feature space. In the online target-domain stage, the target-domain samples arrive in batches. The minority-class samples of the current batch find their k-nearest neighbors in the feature space among the previous batches, and the generated samples are used together with the original samples of the current batch to train the target-domain function. The source-domain and target-domain samples are mapped into the same feature space by a kernel function for oversampling, and the corresponding decision functions are trained on source-domain and target-domain data with relatively balanced class distributions, which improves the overall performance of the algorithm. Comprehensive experiments on four real-world data sets show that the proposed algorithm outperforms the compared algorithms in accuracy and G-mean.
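
The batch-wise oversampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes binary labels, works only in the input space, and generates only minority-class samples by SMOTE-style interpolation between a minority sample of the current batch and one of its k nearest minority neighbors in the previous batch. The function name `oversample_batch`, its parameters, and the toy data are all hypothetical.

```python
import math
import random


def oversample_batch(curr, prev, k=3, minority=1, seed=0):
    """Balance `curr` with synthetic minority samples (illustrative only).

    `curr` and `prev` are lists of (features, label) pairs, where features
    is a tuple of floats. Binary labels are assumed: every label other than
    `minority` is treated as the majority class.
    """
    rng = random.Random(seed)
    curr_min = [x for x, y in curr if y == minority]
    pool = [x for x, y in prev if y == minority]  # neighbor candidates
    n_needed = (len(curr) - len(curr_min)) - len(curr_min)
    if n_needed <= 0 or not curr_min or not pool:
        return list(curr)  # already balanced, or nothing to interpolate with

    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(curr_min)
        # k nearest minority neighbors of `base` in the previous batch
        neighbors = sorted(pool, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        point = tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        synthetic.append((point, minority))
    return list(curr) + synthetic


# Toy batches (made-up values): label 1 is the minority class.
prev_batch = [((0.0, 0.0), 0), ((1.0, 1.0), 1), ((1.2, 0.8), 1), ((0.1, 0.2), 0)]
curr_batch = [((0.0, 0.1), 0), ((0.2, 0.0), 0), ((0.1, 0.3), 0), ((1.1, 0.9), 1)]
balanced = oversample_batch(curr_batch, prev_batch, k=2)
```

In the abstract's setting, the balanced batch would then be used to update the target-domain function. The feature-space variant would replace the Euclidean distance with a kernel-induced distance, which this sketch does not attempt.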

Keywords/Search Tags:

Multi-source transfer learning, Online learning, Imbalanced data, Feature space, k-nearest neighbor

Related items

1	Transfer Learning Across Heterogeneous Feature Spaces
2	Transfer Learning From Multiple Source Domains
3	Imbalanced Classification Methods For Complex Distribution Characteristics
4	Applications Research Of Multi-view And Transfer Learning Based Classification Methods
5	Research On The Visual Group K-Nearest Neighbor And Group Inverse K-Nearest Neighbor Query Of Multi-Source Objects In Three-Dimensional Space
6	Research And Application Of Time-series Data Prediction Based On Deep Learning
7	Research On Online Transfer Learning Algorithms
8	Research On Imbalanced Classification Problems In The Framework Of Transfer Learning
9	Random K-Nearest Neighbor Algorithm With Application To Bankruptcy Prediction
10	Research On Online Learning For Incomplete And Imbalanced Data Streams