With the rapid development of big data technology and the acceleration of medical and health informatization, medical big data has gradually become a research hotspot. More and more studies apply machine learning methods to medical data analysis to reduce the burden on doctors. In the medical field, however, sample imbalance is a persistent problem that degrades the predictive performance of models. Undersampling is a common way to address sample imbalance, but existing undersampling techniques cannot guarantee precision and recall at the same time. Designing a classification method for imbalanced medical samples is therefore of great theoretical and practical significance. A new ensemble classification method based on undersampling with iterative boosting (USIB) is proposed for imbalanced medical samples. First, the method undersamples the majority-class samples to construct a group of base classifiers. Each base classifier is then combined with the classifier output by the previous iteration to form a new set of ensemble classifiers. The ensemble classifier with the best classification performance is selected and compared with the previously output classifier; if the improvement meets the stopping threshold, the iteration stops. Otherwise, the sampling probabilities are updated according to how the current output classifier performs on the majority-class samples, and the next iteration begins. Through this targeted sampling of majority-class samples, the precision of the model is improved while recall is preserved. The effectiveness of the algorithm is verified on two imbalanced medical datasets. The experimental results show that the USIB method is superior to existing undersampling algorithms on multiple metrics for highly imbalanced medical datasets. Compared with the best-performing of the other six undersampling methods, the F1 score of USIB increases by 9.77% and 11.69% on the two datasets, and the AUC increases by 13.88% and 1.51%, respectively.
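
The following is a minimal sketch of the iterative undersampling-and-boosting loop described above, not the authors' implementation. It assumes decision trees as base learners, simple probability averaging as the ensemble combination, F1 on a held-out validation split as the "classification effect", a stop when the improvement falls below a tolerance, and doubling the sampling probability of misclassified majority samples as the update rule; all of these choices are illustrative assumptions.

```python
# Sketch of a USIB-style loop (assumed details, see lead-in above).
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

def usib_sketch(X_tr, y_tr, X_val, y_val, n_samplings=10, max_iter=20, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    maj_idx = np.where(y_tr == 0)[0]              # majority class assumed labeled 0
    min_idx = np.where(y_tr == 1)[0]              # minority class assumed labeled 1
    prob = np.full(len(maj_idx), 1 / len(maj_idx))  # sampling probabilities over majority samples
    ensemble, best_f1 = [], 0.0

    for _ in range(max_iter):
        candidates = []
        for _ in range(n_samplings):
            # Undersample the majority class according to the current probabilities.
            pick = rng.choice(maj_idx, size=len(min_idx), replace=False, p=prob)
            idx = np.concatenate([pick, min_idx])
            clf = DecisionTreeClassifier(random_state=seed).fit(X_tr[idx], y_tr[idx])
            # Integrate the new base classifier with the previously output ensemble.
            cand = ensemble + [clf]
            votes = np.mean([c.predict_proba(X_val)[:, 1] for c in cand], axis=0)
            candidates.append((f1_score(y_val, (votes >= 0.5).astype(int)), cand))

        # Keep the candidate ensemble with the best validation F1.
        f1, cand = max(candidates, key=lambda t: t[0])
        if f1 - best_f1 < tol:                    # improvement below threshold: stop (assumed criterion)
            break
        best_f1, ensemble = f1, cand
        # Raise the sampling probability of majority samples the ensemble misclassifies.
        votes = np.mean([c.predict_proba(X_tr[maj_idx])[:, 1] for c in ensemble], axis=0)
        prob = np.where(votes >= 0.5, prob * 2.0, prob)
        prob /= prob.sum()
    return ensemble
```

Predictions on new data could then be obtained by averaging `predict_proba` over the returned classifiers and thresholding at 0.5, mirroring the combination used inside the loop.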