| The arrival of the big data era brings not only more information to many fields but also many challenges. Classification helps extract useful information from vast amounts of data. One challenge in classification is selecting appropriate evaluation indexes for comparing the performance of different classifiers. In practice, many datasets are class-imbalanced, which makes the choice of evaluation index even more important: imbalance not only degrades the classification results but also prevents some commonly used performance metrics from truly reflecting classifier performance. This paper selects four indexes that are widely used on imbalanced datasets, namely GM (Geometric Mean), F1 score, MCC (Matthews Correlation Coefficient), and AUC (Area Under the ROC Curve), together with four newer indexes: RCI (Relative Classifier Information), MCEN (Modified Confusion Entropy), CBA (Class Balance Accuracy), and IAM (Imbalance Accuracy Metric). First, some of these indexes are analyzed theoretically using a set of constraints proposed by Mullick and Datta et al. Second, the CBA and IAM indexes, which do not satisfy the constraints, are improved by introducing the number of samples in each category, yielding MCBA (Modified Class Balance Accuracy) and MIAM (Modified Imbalance Accuracy Metric). Finally, the indexes are analyzed statistically in terms of consistency and discriminancy. Experiments further compare the indexes, in particular MCBA and MIAM against the others. The results of the theoretical analysis agree with those of the case analysis, which shows that this set of constraints gives reliable results when evaluating the applicability of indexes and that the evaluation method can be extended further. The statistical comparison shows that AUC is the most discriminant index for binary classification. Although MCBA and MIAM are less discriminant than AUC, they perform best on multi-class imbalanced classification problems. In general, both MCBA and MIAM can serve as general-purpose indexes for evaluating classifier performance. |
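
To illustrate why a commonly used metric such as accuracy can fail to reflect performance on an imbalanced dataset, the following minimal sketch computes accuracy, GM, F1, and MCC from binary confusion-matrix counts using their standard definitions. The function name `binary_metrics`, the example counts, and the convention of returning 0 when a denominator is zero are assumptions made for illustration; they are not taken from the paper, and the newer indexes (RCI, MCEN, CBA, IAM) are not covered here.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Return accuracy, GM, F1 and MCC from binary confusion-matrix counts.

    Assumption: any ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # GM: geometric mean of the per-class recalls.
    gm = math.sqrt(recall * specificity)

    # F1: harmonic mean of precision and recall.
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0

    # MCC: correlation between predicted and true labels.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "GM": gm, "F1": f1, "MCC": mcc}


# Hypothetical dataset: 10 positive and 990 negative samples.
# A trivial classifier that always predicts the negative class:
trivial = binary_metrics(tp=0, fn=10, fp=0, tn=990)

# A classifier that finds most positives at the cost of some false alarms:
useful = binary_metrics(tp=8, fn=2, fp=20, tn=970)

for name, scores in (("trivial", trivial), ("useful", useful)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

With these invented counts, the trivial classifier has the higher accuracy even though it never detects a positive sample, while GM, F1, and MCC all rank the second classifier higher.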