Research And Implementation Of Key Technologies Of Disease Auxiliary Diagnosis Based On Ensemble Learning

Posted on: 2020-07-30  Degree: Master  Type: Thesis
Country: China  Candidate: M Ye  Full Text: PDF
GTID: 2404330575471649  Subject: Engineering
Abstract/Summary:
Traditional disease diagnosis is limited by the doctor's experience and the external environment, which can lead to low efficiency and a relatively high rate of misdiagnosis. A disease diagnosis model that combines machine learning techniques with medical experience can overcome these shortcomings of traditional diagnostic methods. A diagnostic model that ensembles multiple machine learning classifiers is likely to achieve higher accuracy than a model with only a single classifier. This thesis proposes a new algorithm named CELEDAT. The algorithm ensembles several classifiers, including K-Nearest Neighbor, Decision Tree, Support Vector Machine, and Logistic Regression, and incorporates dynamic weights and a threshold decision based on the data imbalance ratio. The main work is as follows. First, the dynamic weights method improves the fit between the datasets and the base classifiers: the test samples are clustered, and each test sample receives its own weights, computed from the similarity between the sample and the cluster and from the weights between the cluster and the corresponding base classifiers. Second, a threshold decision strategy based on the data imbalance ratio adjusts the decision thresholds according to the number of samples in each class, which improves prediction accuracy for the minority class. Several disease diagnosis datasets from the UCI Machine Learning Repository are used to demonstrate the effectiveness of the CELEDAT algorithm, and its performance is evaluated with G-mean, AUC, F1-score, overall accuracy, sensitivity, and specificity. Experimental results show that the method significantly improves on the performance of the traditional ensemble classifier and strengthens its ability to classify imbalanced data. In addition, visualizations of classifier performance on each dataset are presented in this thesis, which allow a more intuitive comparison of the strengths and weaknesses of each model. Finally, the CELEDAT algorithm is applied to disease diagnosis on real clinical data, where it improves the accuracy and efficiency of diagnosis to some extent.
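The threshold decision and the G-mean measure mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the CELEDAT formulas, so the threshold rule used here (the minority-class fraction of the training labels), the fixed voting weights, and all function names are assumptions made for illustration, not the method of the thesis. Only the G-mean definition, the square root of sensitivity times specificity, is standard.

```python
from math import sqrt

def imbalance_threshold(labels):
    """Assumed rule: use the fraction of positive (minority) labels
    in the training set as the decision threshold."""
    positives = sum(1 for y in labels if y == 1)
    return positives / len(labels)

def weighted_vote(probas, weights):
    """Weighted average of the base classifiers' positive-class probabilities."""
    return sum(p * w for p, w in zip(probas, weights)) / sum(weights)

def predict(probas, weights, threshold):
    """Predict the positive class when the weighted vote reaches the threshold."""
    return 1 if weighted_vote(probas, weights) >= threshold else 0

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity.
    Assumes both classes occur in y_true."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sqrt(sensitivity * specificity)

# Hypothetical data: 8 negative and 2 positive training labels.
train_labels = [0] * 8 + [1] * 2
threshold = imbalance_threshold(train_labels)

# Hypothetical positive-class probabilities from four base classifiers
# (e.g. KNN, decision tree, SVM, logistic regression) with equal weights.
sample_probas = [0.3, 0.2, 0.4, 0.1]
equal_weights = [1.0, 1.0, 1.0, 1.0]
label = predict(sample_probas, equal_weights, threshold)
```

With a fixed threshold of 0.5 the sample above would be assigned to the majority class; lowering the threshold in proportion to the minority-class share is one simple way to make the ensemble more sensitive to the minority class.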
Keywords/Search Tags: ensemble learning, dynamic weights, disease diagnosis, imbalanced data