Research On Method Of Active Learning Based On Coding

Posted on: 2020-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: S L Gu
Full Text: PDF
GTID: 2517306548990599
Subject: Master of Applied Statistics
Abstract/Summary:
In recent years, with the explosive growth of data, improving learning performance with a small amount of labeled data and a large amount of unlabeled data has become a hot topic in machine learning. Active learning is one of the most important approaches: it selects the most valuable samples for manual labeling so that a superior classifier can be built at minimal labeling cost. Labels and features are the most important components of data, so active learning algorithms mainly design their selection criteria from these two aspects. However, real-world data are often multi-class and high-dimensional, and most active learning algorithms have deficiencies in handling such data. To address these problems, this thesis proposes two coding-based active learning methods. The main work is as follows:

(1) Active learning method based on One-vs-All label encoding: It is difficult to accurately measure the uncertainty of multi-class data from multiple perspectives. By incorporating the error-correcting output codes framework, the original problem is transformed into multiple sub-problems, and effective uncertainty measurement criteria are designed from multiple perspectives, so that the algorithm can accurately select the most valuable samples for labeling. The effectiveness of the algorithm is verified by multiple experiments.

(2) Active learning method based on maximum margin feature coding: On high-dimensional data, active learning models predict inaccurately and incur high running-time and storage costs. By encoding the high-dimensional features, active learning and the low-dimensional representation of the data are optimized simultaneously under a unified framework, which addresses the high dimensionality of the features and selects the most representative samples in the low-dimensional space for labeling. The effectiveness of the algorithm is verified by experiments.
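As a rough illustration of the uncertainty-based selection described in item (1), the sketch below ranks unlabeled samples by the gap between their two highest one-vs-all scores and queries the samples with the smallest gap. This is generic margin sampling, not the criterion proposed in the thesis, which the abstract does not specify. The function names `ova_margin` and `select_queries` and the score values are hypothetical.

```python
from typing import List, Sequence


def ova_margin(scores: Sequence[float]) -> float:
    """Gap between the two highest one-vs-all scores; a small gap means high uncertainty."""
    top, second = sorted(scores, reverse=True)[:2]
    return top - second


def select_queries(score_rows: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Return the indices of the `budget` samples with the smallest margin."""
    ranked = sorted(range(len(score_rows)), key=lambda i: ova_margin(score_rows[i]))
    return ranked[:budget]


# Hypothetical one-vs-all scores for four unlabeled samples and three classes
# (assumed to come from one binary classifier per class).
unlabeled_scores = [
    [0.90, 0.05, 0.05],  # confident prediction for class 0
    [0.40, 0.38, 0.22],  # ambiguous between classes 0 and 1
    [0.34, 0.33, 0.33],  # ambiguous across all classes
    [0.10, 0.80, 0.10],  # confident prediction for class 1
]

# Indices of the two samples that would be sent for manual labeling.
queries = select_queries(unlabeled_scores, budget=2)
```

In a full active learning loop, the queried samples would be labeled, added to the training set, and the binary classifiers retrained before the next round of selection.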
Keywords/Search Tags: Active learning, High dimensional data, Feature extraction, Uncertainty sampling