
Research On ROC Convex Hull Maximization Algorithm Based On Evolutionary Multi-objective Optimization

Posted on: 2021-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Zhang
GTID: 2370330620965630
Subject: Software engineering
Abstract/Summary:
Data classification is one of the most fundamental research directions in machine learning. As a basic data processing method, binary classification is widely used in real-world intelligent data processing. Traditional binary classification methods usually assume that the class distribution is balanced and that every class has the same misclassification cost, but in many practical problems the minority class has a higher misclassification cost. When a traditional classification algorithm is applied to imbalanced data, the gap between the number of majority-class and minority-class samples means that maximizing overall accuracy biases the model toward the majority class and neglects the minority class, so minority-class accuracy is low. In addition, real data sets often contain a considerable number of noisy samples, among which label noise is the most common, and this seriously affects classifier training. As a meta-heuristic with good parallelism and strong global search ability, the evolutionary algorithm is well suited to training classification models. From the perspective of evolutionary multi-objective optimization, this thesis therefore proposes a receiver operating characteristic convex hull (ROCCH) maximization algorithm based on dynamic reference points, and designs and implements a robust classifier based on multi-objective optimization. The main work and results of this thesis are summarized as follows:

(1) The ROCCH (receiver operating characteristic convex hull) is a commonly used technique for analyzing classifier performance and is particularly effective for tasks with imbalanced data distributions. Maximizing the ROCCH is a bi-objective optimization problem that has been addressed with multi-objective evolutionary algorithms (MOEAs). However, existing MOEAs encounter difficulties in obtaining the ROCCH, because the ROCCH is always convex, while the front obtained through Pareto dominance in MOEAs may be concave. This thesis proposes an evolutionary multi-objective algorithm based on dynamic reference points to maximize the ROCCH. Selection is based on the distance from a solution to the reference point rather than on the Pareto dominance relation, so the algorithm obtains an actual ROCCH rather than a Pareto front. In addition, to improve the convergence rate, the reference point moves adaptively as the algorithm iterates. Experimental results show that the algorithm achieves better results than mainstream MOEAs in maximizing the ROCCH.

(2) Classification in machine learning relies on a large amount of labeled data, but real data often contains an unknown proportion of noisy labels, which directly affects the final classifier. To obtain a better classifier on data sets with noisy samples, this thesis proposes a robust ROC convex hull maximization algorithm. In each iteration, a clean training subset is first obtained by a clustering method, and one population is trained on this clean subset. A second population is then trained on the original training set, which contains noise. The center points of the two populations are computed, the direction vector between the two centers is found, and the perturbation step length is set approximately from the noise rate. The population trained on the original noisy training set is then perturbed toward the center of the population trained on the clean subset. Experimental results show that this method effectively improves classifier training and makes the classifier more resistant to noise and more robust. The method is also fairly general and can be embedded in most mainstream evolutionary algorithms for maximizing the ROCCH.
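The distinction between a Pareto front and the ROCCH can be made concrete with a small sketch. The code below is not the thesis's algorithm; it is a standard upper-convex-hull computation over a set of classifiers represented as (false positive rate, true positive rate) points, followed by the area under the hull. The function names `roc_convex_hull` and `hull_area` and the sample points are assumptions made for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors OA and OB; negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Return the upper convex hull of ROC points, from (0, 0) to (1, 1).

    The trivial classifiers (0, 0) and (1, 1) are always added, which is
    the usual convention for ROC convex hulls.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the segment
        # joining its predecessor to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_area(hull: List[Point]) -> float:
    """Area under the hull, computed with the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(hull, hull[1:])
    )


# Hypothetical classifiers. (0.3, 0.7) is Pareto non-dominated
# but lies below the hull, so it does not contribute to the ROCCH.
points = [(0.1, 0.6), (0.3, 0.7), (0.4, 0.9), (0.6, 0.8)]
hull = roc_convex_hull(points)
area = hull_area(hull)
```

In this example the point (0.3, 0.7) would survive Pareto-based selection, since no other point has both a lower false positive rate and a higher true positive rate, yet it sits below the segment joining (0.1, 0.6) and (0.4, 0.9) and adds nothing to the hull. This is the kind of mismatch between Pareto dominance and the convex hull that the abstract describes.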
Keywords/Search Tags:Unbalanced Data Classification, ROC, Evolutionary Algorithm, ROCCH, Label Noise, Reference Point, Multi-Objective Optimization, Robustness