Research On Application Of Imbalanced Medical Data Based On Balanced Sampling And Deep Learning

Posted on:2019-08-04

Degree:Master

Type:Thesis

Country:China

Candidate:Y W Zhao

Full Text:PDF

GTID:2404330593951099

Subject:Software engineering

Abstract/Summary:


With the development of science and technology, China is rapidly digitizing its medical data. As medical data grow exponentially and medical management systems mature, access to medical data is becoming increasingly convenient, but the useful information hidden in massive medical data has not yet been fully mined or effectively used. Medical data include physical examination data, electronic medical records, diagnostic imaging, and other medical data. At present, medical diagnosis relies mainly on doctors' professional knowledge and clinical experience. Extracting hidden, useful information from medical data to support doctors' treatment decisions is therefore of practical significance. To address imbalanced classification in medical data and the problem of disease modeling, this thesis applies data mining to build prediction models for medical data and to provide a reference for doctors diagnosing disease.

At the data level, a new imbalanced-data processing algorithm called KE-SMOTE is proposed to solve the class imbalance problem in medical data. For the majority class, KE-SMOTE runs K-Means repeatedly until the minimum clustering error no longer decreases or a specified number of iterations is reached, which yields multiple clustering results; a clustering ensemble method is then applied to these results to undersample the majority class. For the minority class, KE-SMOTE oversamples using the SMOTE algorithm. Combining the new majority class samples with the new minority class samples gives a new training data set. Experiments on UCI data sets show that the proposed algorithm outperforms traditional class-imbalance processing algorithms.

At the algorithm level, a deep belief network based on an autoencoder, called AE-DBN, is proposed. AE-DBN uses an autoencoder to extract features from the data set and a deep belief network to build the model. The optimal deep belief network model is constructed by adjusting the number of hidden layers and the number of nodes in each layer. The experiments in this thesis use hyperuricemia medical data provided by a hospital. They show that the proposed algorithm achieves higher classification accuracy than traditional machine learning algorithms. In addition, analysis of the features in the data set, together with modeling experiments on different feature combinations, identifies the factors that influence hyperuricemia, which can serve as a reference for doctors diagnosing the condition.
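The KE-SMOTE procedure summarized in the abstract can be illustrated with a minimal, dependency-free Python sketch. The abstract does not say which clustering ensemble rule the thesis uses, so everything specific here is an assumption: the ensemble step groups majority samples that share a cluster in more than half of the K-Means runs and keeps the real sample nearest each group's mean, the default number of clusters is twice the minority size, and all function names and parameters (`ke_smote`, `repeated_kmeans`, `ensemble_undersample`, `smote`, `k`, `max_runs`, `k_neighbors`) are hypothetical. It is not the author's implementation.

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda c: dist2(p, centers[c]))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's K-Means; returns (labels, sum of squared errors)."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = mean(members)
    labels = [nearest(p, centers) for p in points]
    sse = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
    return labels, sse

def repeated_kmeans(points, k, rng, max_runs=10):
    """Rerun K-Means until the best error stops improving or max_runs is hit."""
    runs, best = [], math.inf
    for _ in range(max_runs):
        labels, sse = kmeans(points, k, rng)
        runs.append(labels)
        if sse >= best:
            break
        best = sse
    return runs

def ensemble_undersample(points, runs):
    """ASSUMED ensemble rule: merge samples co-clustered in a majority of
    runs (union-find), then keep the real sample nearest each group mean."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if 2 * sum(r[i] == r[j] for r in runs) > len(runs):
                parent[find(i)] = find(j)
    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    kept = []
    for group in groups.values():
        center = mean(group)
        kept.append(min(group, key=lambda p: dist2(p, center)))
    return kept

def smote(minority, n_new, rng, k_neighbors=3):
    """Basic SMOTE: interpolate between a minority sample and one of its
    nearest minority neighbours. Needs at least two minority samples."""
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(minority)
        others = sorted((q for q in minority if q is not p),
                        key=lambda q: dist2(p, q))
        q = rng.choice(others[:k_neighbors])
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(p, q)])
    return synthetic

def ke_smote(majority, minority, k=None, seed=0):
    """Undersample the majority via a K-Means ensemble, then oversample the
    minority with SMOTE until it matches the reduced majority size."""
    rng = random.Random(seed)
    k = min(k or 2 * len(minority), len(majority))  # assumed default for k
    runs = repeated_kmeans(majority, k, rng)
    new_majority = ensemble_undersample(majority, runs)
    n_new = max(0, len(new_majority) - len(minority))
    return new_majority, minority + smote(minority, n_new, rng)
```

Calling `ke_smote(majority, minority)` with two lists of numeric feature vectors returns the reduced majority set and the augmented minority set, which together form the new training set. The pairwise co-association loop is quadratic in the number of majority samples, so this sketch suits small illustrative data only.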

Keywords/Search Tags:

Medical Data, Imbalanced Dataset, Clustering Ensemble, Autoencoder, Deep Belief Network


Related items

1	Research On Intelligent Diagnosis Based On Imbalanced Electronic Medical Record Dataset
2	Design And Implementation Of Deep Clustering System For Brain Functional Connectivity Data
3	Classification Techniques For Imbalanced Data And Applications In Intelligent Medical Decision Support
4	Research On Clustering Ensembles Based Classification Method For Imbalanced Data Sets And Its Application
5	Research On Feature Selection And Classification For Medical Imbalanced Data
6	Application Of The Deep Learning In The Diagnosis Of Heart Disease
7	Deep Reinforcement Learning Classification Prediction Model And Its Application In Stroke Risk Prediction
8	Research On Prediction Algorithm Of Thrombosis Risk Based On Imbalanced Data
9	Research On Medical Insurance Anomaly Detection Based On Deep Learning
10	Research On Diabetes Prediction Based On Clustering Undersampling Hybrid Ensemble Algorithm