
Improvement Of Confidence Set Method And Simulation Studies In Binary Classification

Posted on: 2022-04-06    Degree: Master    Type: Thesis
Country: China    Candidate: Z D Meng    Full Text: PDF
GTID: 2517306491960249    Subject: Statistics
Abstract/Summary:
As one of the basic problems in statistical learning, classification is widely used in many fields. A classical machine learning classifier learns from training samples and assigns a sample whose true category is unknown to a single class. In general, such classifiers cannot guarantee the accuracy of the classification and may carry a high risk of error.

This thesis mainly studies the confidence set method proposed by Liu in 2019. Its advantage can be stated as follows: based on an observed training data set, one constructs confidence sets to predict the categories of all test samples and guarantees that at least a 1−α proportion of these confidence sets contain the true categories, and this claim holds with γ·100% confidence with respect to the randomness of the training data set. This thesis concentrates on binary classification, examines the difference between the two types of confidence set classifiers, and compares the confidence set methods with classical machine learning methods under balanced and unbalanced samples.

In the study of balanced samples with different class proportions (π1, π2), the conservative confidence set method achieves a higher coverage level for the true categories of the test samples than the exact confidence set method, while the exact confidence set method achieves a higher single classification level. Under unbalanced samples, the critical value of the conservative confidence set may explode as the sample imbalance coefficient increases, which causes its single classification level to drop significantly; the exact confidence set method not only maintains a relatively stable critical value but also keeps the coverage level and single classification level stable.

This thesis also carries out simulation experiments comparing the confidence set methods with single classification methods (classical machine learning classifiers that output a single class) under balanced and unbalanced samples. The confidence set methods achieve a higher and more stable coverage level than the single classifiers, both for the overall sample and for each category (the advantaged sample and the disadvantaged sample), which shows that, compared with single classifiers, the confidence set method is a higher-coverage and more stable classification approach. Finally, this thesis presents a secondary classification method based on the confidence set method, which obtains a larger single classification level at the expense of a somewhat lower coverage level.
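The coverage level and single classification level discussed above can be illustrated with a small simulation. The Python sketch below is only an illustration, not the thesis's exact or conservative confidence set construction: it builds a split-conformal style set-valued binary classifier with per-class critical values (the logistic regression score, the Gaussian data-generating model, the level alpha = 0.05 and all names are illustrative assumptions) and then reports how often the confidence set contains the true label (coverage level) and how often the set is a singleton (single classification level).

# Minimal sketch, assuming a split-conformal style set-valued binary classifier.
# This is NOT Liu's (2019) exact/conservative construction; it only illustrates
# per-class critical values, the coverage level and the single classification level.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def simulate(n, pi1=0.5):
    """Two Gaussian classes with mixing proportions (pi1, 1 - pi1)."""
    y = (rng.random(n) < pi1).astype(int)
    x = rng.normal(loc=2.0 * y[:, None], scale=1.5, size=(n, 2))
    return x, y

# Split the training data: one part fits the score, the other calibrates
# the per-class critical values.
x_fit, y_fit = simulate(2000)
x_cal, y_cal = simulate(2000)
x_test, y_test = simulate(5000)

model = LogisticRegression().fit(x_fit, y_fit)
alpha = 0.05  # target miscoverage proportion (illustrative choice)

def class_scores(x, k):
    # Nonconformity score: one minus the estimated probability of class k.
    return 1.0 - model.predict_proba(x)[:, k]

# Per-class critical value: (1 - alpha) empirical quantile of calibration scores.
crit = {k: np.quantile(class_scores(x_cal[y_cal == k], k), 1 - alpha) for k in (0, 1)}

# Confidence set for each test point: all classes whose score is below the critical value.
proba_test = model.predict_proba(x_test)
sets = [{k for k in (0, 1) if 1.0 - proba_test[i, k] <= crit[k]}
        for i in range(len(x_test))]

coverage = np.mean([y in s for y, s in zip(y_test, sets)])   # coverage level
single = np.mean([len(s) == 1 for s in sets])                # single classification level
print(f"coverage level: {coverage:.3f}, single classification level: {single:.3f}")

Increasing the class imbalance in simulate (for example pi1 = 0.95) makes it possible to observe, in this simplified setting, how the per-class critical values and the single classification level react as the sample imbalance coefficient grows.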
Keywords/Search Tags: confidence set, unbalanced samples, sample imbalance coefficient, disadvantaged sample, secondary classification method