
Kolmogorov-Smirnov Optimization And Variable Selection Based On A Surrogate Loss Function

Posted on: 2024-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: X F Lin
Full Text: PDF
GTID: 2530307067991489
Subject: Statistics
Abstract/Summary:
The Kolmogorov-Smirnov (KS) statistic is widely used to evaluate binary classification performance in areas such as credit scoring and customer relationship management, owing to its explicit business interpretation. Fang and Chen (Computational Statistics and Data Analysis, 2019, 133, 180-194) proposed a novel DMKS method that directly maximizes the KS statistic and compares favorably with popular existing methods in terms of KS. However, DMKS did not address the critical problems of variance estimation and variable selection, since the special form of the KS statistic makes it very challenging to establish the asymptotic distribution of the DMKS estimator. In practice, including redundant covariates in the score function seriously impairs predictive power and model interpretability, while the lack of a variance estimate hinders statistical inference.

In this paper, we handle the intractable issue that the KS statistic is neither continuous nor smooth by introducing a surrogate loss function, which yields a consistent estimator of the true parameter up to a non-zero multiplicative scalar. Variable selection can therefore be based on this estimator. We then combine it with the nonconcave SCAD penalty to achieve variable selection consistency and asymptotic normality with the oracle property. Results of simulation studies and real data analyses confirm the theoretical findings. The proposed method not only shows clear advantages in the KS index over the original DMKS method without variable selection, but also provides variance estimates for the coefficient estimates.
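As background, the KS statistic of a binary classifier is the maximum vertical gap between the empirical CDFs of the scores in the two classes. A minimal sketch of evaluating it (this is not the DMKS method itself; the synthetic scores and the class separation below are illustrative assumptions):

```python
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical scores from a fitted credit-scoring model:
# higher score = more likely "good" (label 1).
rng = np.random.default_rng(0)
scores_bad = rng.normal(loc=0.0, scale=1.0, size=500)   # label 0
scores_good = rng.normal(loc=1.0, scale=1.0, size=500)  # label 1

# KS = max gap between the two empirical score CDFs, one per class;
# larger values indicate better separation of the classes.
ks = ks_2samp(scores_good, scores_bad).statistic
print(f"KS = {ks:.3f}")
```

For two unit-variance normal score distributions one standard deviation apart, the population KS value is about 0.38, so the sample statistic should land near that.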
Keywords/Search Tags:Binary classification, Credit scoring, SCAD, Oracle property, Variable selection consistency