Multi-source online transfer learning uses labeled data from multiple source domains to improve classification performance in a target domain whose data arrive online. By dynamically adjusting the weights of the source and target domains, each domain is used adaptively, which yields good generalization performance and high learning efficiency. In many real-world scenarios, however, the data are imbalanced, and misclassifying minority-class samples can incur significant losses. To address this practical problem, this paper proposes new multi-source online transfer learning algorithms. The specific research work is as follows.

First, this paper proposes a multi-source online transfer learning algorithm that oversamples the samples in the target domain. For the current batch of samples, the algorithm finds the k nearest neighbors among the samples of the previous batch. It first generates a small number of majority-class samples and then generates minority-class samples to balance the class distribution of the current batch. The synthetic and real samples of each batch jointly train the target domain function, which improves its classification performance. Oversampling methods are designed for both the input space and the feature space of the target domain, and comprehensive experiments on multiple real-world data sets demonstrate the effectiveness of the proposed algorithm.

Second, to handle imbalanced data in both the source domains and the target domain, this paper further proposes a multi-source online transfer learning algorithm that oversamples in the feature space of the source and target domains. The algorithm consists of two parts: oversampling the multiple source domains and oversampling the online target domain. In the source domain oversampling stage, minority-class samples are generated in the feature space of the support vector machine classifier. The new samples are obtained by augmenting the original Gram matrix with neighborhood information from the feature space of the source domain. In the online oversampling stage of the target domain, the samples of the target domain arrive in batches. The minority-class samples of the current batch find their k nearest neighbors in the feature space among the previous batches, and the generated samples are used together with the original samples of the current batch to train the target domain function. The samples of the source and target domains are mapped to the same feature space by a kernel function for oversampling, and the corresponding decision functions are trained on source and target data with a relatively balanced class distribution, which improves the overall performance of the algorithm. Comprehensive experiments on four real-world data sets show that the proposed algorithm outperforms the compared algorithms in accuracy and G-mean.
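The batch-wise oversampling step and the G-mean metric mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: it works in the input space only, generates minority-class samples only (the preliminary majority-class generation is omitted), and assumes a SMOTE-style linear interpolation rule and binary labels. The function names `oversample_batch` and `g_mean`, and all parameters, are hypothetical.

```python
import numpy as np


def oversample_batch(X_cur, y_cur, X_prev, y_prev, k=3, minority=1, seed=0):
    """Balance the current batch with synthetic minority-class samples.

    Assumption: each synthetic point is a random interpolation between a
    minority sample of the current batch and one of its k nearest minority
    neighbours in the previous batch (SMOTE-style rule, for illustration).
    """
    rng = np.random.default_rng(seed)
    cur_min = X_cur[y_cur == minority]
    prev_min = X_prev[y_prev == minority]
    # Number of synthetic samples needed to match the majority count.
    n_needed = int(np.sum(y_cur != minority) - len(cur_min))
    if n_needed <= 0 or len(cur_min) == 0 or len(prev_min) == 0:
        return X_cur, y_cur
    k = min(k, len(prev_min))
    # Euclidean distances: current minority samples -> previous minority samples.
    dists = np.linalg.norm(cur_min[:, None, :] - prev_min[None, :, :], axis=2)
    neighbours = np.argsort(dists, axis=1)[:, :k]
    # Pick a random base sample and one of its k neighbours for each new point.
    base = rng.integers(0, len(cur_min), size=n_needed)
    pick = neighbours[base, rng.integers(0, k, size=n_needed)]
    gap = rng.random((n_needed, 1))
    synthetic = cur_min[base] + gap * (prev_min[pick] - cur_min[base])
    X_bal = np.vstack([X_cur, synthetic])
    y_bal = np.concatenate(
        [y_cur, np.full(n_needed, minority, dtype=y_cur.dtype)]
    )
    return X_bal, y_bal


def g_mean(y_true, y_pred, minority=1):
    """Geometric mean of minority-class recall and majority-class recall.

    Assumes a binary problem in which both classes occur in y_true.
    """
    pos = y_true == minority
    sensitivity = np.mean(y_pred[pos] == minority)
    specificity = np.mean(y_pred[~pos] != minority)
    return float(np.sqrt(sensitivity * specificity))
```

In an online setting, `oversample_batch` would be called once per arriving batch, with the preceding batch passed as `X_prev` and `y_prev`, before the target domain function is updated. A feature-space variant would replace the Euclidean distances with distances computed from kernel evaluations.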