Research On The Improvement Of Sample Weighted Method In AdaBoost Algorithm

Posted on: 2019-09-27  Degree: Master  Type: Thesis
Country: China  Candidate: J Cheng  Full Text: PDF
GTID: 2417330548470792  Subject: Applied Statistics
Abstract/Summary:
Ensemble learning is currently a very effective machine learning framework. Within it, the Boosting algorithm combines weak learners serially into a strong learner, which can fit real models well and solve practical problems. The AdaBoost algorithm was proposed based on the idea of Boosting. It not only achieves high classification accuracy but is also easy to implement, so it has been widely used in pattern recognition and computer vision. In-depth study, however, has also revealed shortcomings. AdaBoost is prone to degradation when faced with complex samples, and its classification accuracy then decreases. It is also difficult for AdaBoost to classify unbalanced data sets accurately: the classification error rate on the minority class is very high, and the overall classification performance is poor. To address these shortcomings, this thesis improves the sample weighting method. First, regarding the degradation problem, AdaBoost increases the sample weights of easily confused overlapping samples and noise samples, which degrades the overall classification performance. To this end, a threshold is first specified; when the number of classification errors exceeds the threshold, the sample weight is increased, in order to curb the tilt of the algorithm. Second, for unbalanced data sets, minority-class samples are given larger sample weights when they are misclassified, and the weight of the classifier is not reduced even when the classification is correct, so that the algorithm pays more attention to minority-class samples during classification and the overall ability of AdaBoost to classify unbalanced data sets improves. Finally, a large number of experiments verify the stability and effectiveness of the improved AdaBoost.
Keywords/Search Tags: AdaBoost algorithm, degradation problem, unbalanced data set, sample weight
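The reweighting idea for unbalanced data described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual formula, which the abstract does not give. With `minority_label=None` the function performs the standard discrete AdaBoost sample-weight update for labels in {-1, +1}. The minority-class branch is an assumption: it reads the weight that "is not reduced" as the sample weight, and the names `update_weights`, `minority_label` and `boost` are hypothetical.

```python
import math

def update_weights(weights, labels, preds, minority_label=None, boost=1.0):
    """One round of AdaBoost sample re-weighting (labels in {-1, +1}).

    minority_label=None gives the standard discrete AdaBoost update.
    Otherwise (assumed variant, not taken from the thesis):
      - misclassified minority samples get an extra factor `boost`;
      - correctly classified minority samples keep their weight
        instead of being down-weighted.
    Returns (normalized_weights, alpha).
    """
    miss = [p != y for p, y in zip(preds, labels)]

    # Weighted error of the weak learner, clipped to avoid log(0).
    eps = sum(w for w, m in zip(weights, miss) if m) / sum(weights)
    eps = min(max(eps, 1e-10), 1 - 1e-10)

    # Standard AdaBoost learner weight.
    alpha = 0.5 * math.log((1 - eps) / eps)

    updated = []
    for w, y, m in zip(weights, labels, miss):
        is_minority = minority_label is not None and y == minority_label
        if m:
            factor = math.exp(alpha)      # standard increase on error
            if is_minority:
                factor *= boost           # assumed extra emphasis
        else:
            factor = math.exp(-alpha)     # standard decrease when correct
            if is_minority:
                factor = 1.0              # assumed: weight is held
        updated.append(w * factor)

    total = sum(updated)
    return [v / total for v in updated], alpha


if __name__ == "__main__":
    w0 = [0.25, 0.25, 0.25, 0.25]
    y = [1, 1, -1, -1]
    h = [1, -1, -1, -1]                   # second sample is misclassified
    print(update_weights(w0, y, h))
    print(update_weights(w0, y, h, minority_label=1, boost=2.0))
```

In the standard update the misclassified samples always end up with half of the total weight after normalization. The assumed variant shifts more weight to the minority class on both its misclassified and its correctly classified samples, which is the effect the abstract describes.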