| In machine learning, a dataset with a significant difference in the number of samples between categories is considered imbalanced. In other words, class imbalance occurs when the number of examples representing one class is much lower than that of the other classes, so one or more classes may be underrepresented in the dataset. Standard machine learning algorithms, however, are usually biased toward the majority class, since rules that correctly predict those instances are weighted favorably by the accuracy metric or the corresponding cost function. As a consequence, minority-class instances are misclassified more often than majority-class ones. One of the main issues in imbalanced problems is that the underrepresented class is usually the class of interest from the application point of view. Consequently, imbalanced learning has been studied extensively. Compared with the binary imbalanced problem, the multi-class imbalanced problem poses greater challenges, which are attributed to the diversity of class distributions and the insufficient performance of multi-class classifiers. Multi-class imbalanced problems have therefore received increasing attention in recent years. This paper studies the multi-class imbalanced problem and makes the following contributions: (1) An overview of multi-class imbalanced algorithms is presented. Current models and algorithms for multi-class imbalanced problems are classified, and the advantages and disadvantages of the various methods are analyzed. The performance metrics applicable to imbalanced problems and their respective evaluation biases are also discussed. (2) A multi-class imbalanced learning algorithm based on the one-versus-one decomposition strategy and spectral clustering is proposed. Whereas current sampling algorithms assume a uniform distribution of data, the spectral-clustering-based sampling proposed in this paper takes the distribution of the data into account and effectively avoids 
oversampling of outliers. Comparison and analysis against a variety of multi-class imbalanced algorithms (OVO-SMOTE, Improved A&O with SMOTE, OVO-UnderBagging, Multi-IM, DRCW_ASEG, OVO-EasyEnsemble, OVO-SMOTEBoost) confirm the effectiveness of the algorithm. (3) Based on the preceding research, a visual evaluation tool for imbalanced learning algorithms is designed and implemented. The tool incorporates a rich set of classic imbalanced learning algorithms. Its graphical interface is used to select datasets and algorithms and to display various performance metrics, which can simplify the process of imbalanced learning research and improve experimental efficiency. |
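
The accuracy bias and the one-versus-one (OVO) decomposition mentioned in the abstract can be illustrated with a minimal sketch. It is not the implementation described in this work: the function names and the toy label counts are hypothetical, and balanced accuracy and G-mean are chosen only as common examples of imbalance-aware metrics, since the abstract does not name the metrics it covers. The spectral-clustering sampling step is not shown.

```python
from collections import Counter
from itertools import combinations
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly, regardless of class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


def ovo_pairs(y):
    """List each OVO binary subproblem as (class_a, class_b, imbalance_ratio)."""
    counts = Counter(y)
    result = []
    for a, b in combinations(sorted(counts), 2):
        ratio = max(counts[a], counts[b]) / min(counts[a], counts[b])
        result.append((a, b, ratio))
    return result


# Hypothetical toy labels: 90 / 8 / 2 samples for classes "a", "b", "c".
y_true = ["a"] * 90 + ["b"] * 8 + ["c"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["a"] * len(y_true)

print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # about 0.333
print(g_mean(y_true, y_pred))             # 0.0
print(ovo_pairs(y_true))                  # three pairs, ratios 11.25, 45.0, 4.0
```

In this toy case the always-majority classifier scores 0.9 on accuracy while both imbalance-aware metrics expose that two of the three classes are never recognized. The pairwise ratios also show why each OVO subproblem may need its own resampling decision.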