
Research on Microbial Data Modeling and Disease Prediction Based on Ensemble Deep Learning

Posted on: 2024-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Shen
GTID: 2544307127953709
Subject: Software engineering
Abstract/Summary:
A growing body of research shows that the human microbiota plays a vital role in human health and can be a key factor in predicting some human diseases. Machine learning has been widely applied to human microbiota data to extract more information from it. However, clinical microbial data are typically high-dimensional and sparse, with small sample sizes, which poses a great challenge for traditional machine learning methods. It is therefore particularly important to explore the latent relationship between human microorganisms and disease effectively, and to use that relationship as a disease predictor that improves model performance. Ensemble deep learning integrates multiple neural networks to improve model robustness while handling high-dimensional sparse data. This thesis proposes an ensemble deep learning framework and demonstrates, from several angles, its advantages in metagenomic data modeling and clinical disease prediction. The main research work is as follows:

(1) To address the high dimensionality, sparsity and small sample size of metagenomic data, an ensemble deep learning framework for modeling and disease prediction is proposed that combines unsupervised and supervised learning. In this framework, an unsupervised deep learning method learns a latent representation of the samples, and a disease scoring strategy based on that deep representation provides the informative features for the ensemble analysis. To ensure the best integration, a score selection mechanism is constructed, and the score features are combined with the original sample features to form enhanced combined features. A gradient boosting classifier is then trained on the combined features to diagnose host disease. As a case study, the framework was validated on six published human gut metagenome datasets. The experimental results show that, compared with existing algorithms, the framework achieves better performance in microbial data modeling and disease prediction.

(2) Through meta-analysis, more microbial data from clinical patients and control groups can be obtained from existing studies. Even with a sufficient sample size, further improving the performance of deep learning models or ensemble deep learning models remains a critical challenge. As an ensemble method, the gradient boosting learner has been shown to outperform general models in many scenarios. Because it is a supervised model, it can, unlike an unsupervised model, fully learn the relationship between features and labels when the sample size is sufficient. For this reason, and to combine the advantages of neural networks and tree models, this thesis uses a neural network to learn the structural knowledge in the tree model. Furthermore, to alleviate the curse of dimensionality and data heterogeneity, a neural network ensemble strategy based on a two-stage stacked autoencoder is proposed. The first-stage autoencoder alleviates the curse of dimensionality, and the second stage reduces the influence of data heterogeneity. Finally, a fully connected neural network is built on the processed data to predict the health status of the host. Results on the published GMHI (Gut Microbiome Health Index) dataset show that this method has better disease diagnosis ability than the gradient boosting model and an ordinary neural network.

In summary, this thesis designs and implements a disease prediction model for metagenomic data using ensemble deep learning, and proposes different solutions for different data situations. The effectiveness of the proposed methods is verified by step-by-step experiments, which indicates that the work has both academic and practical value.
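The feature-enhancement step in (1) can be illustrated with a minimal sketch; it is not the thesis's implementation. It rests on several assumptions: a truncated SVD projection stands in for the unsupervised deep encoder, the disease score is taken to be a simple difference of distances to class centroids in the latent space, and the names `latent_representation`, `disease_score`, `augment_with_score` and `make_toy_data` are hypothetical. The abstract does not specify the actual scoring rule or the score selection mechanism, so neither is reproduced here.

```python
import numpy as np


def make_toy_data(n_per_class=10, n_features=50, seed=0):
    """Hypothetical sparse toy data: controls (0) and cases (1)."""
    rng = np.random.default_rng(seed)
    n = 2 * n_per_class
    # Roughly 80% of the entries are zero, mimicking sparse abundance tables.
    X = rng.random((n, n_features)) * (rng.random((n, n_features)) < 0.2)
    y = np.repeat([0, 1], n_per_class)
    # Assumption: cases differ from controls in the first five features.
    X[y == 1, :5] += 5.0
    return X, y


def latent_representation(X, k):
    """Project centered samples onto the top-k principal directions.

    This linear projection is a stand-in for a learned autoencoder encoder.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def disease_score(Z, y):
    """Assumed score: distance to control centroid minus distance to case centroid."""
    control_centroid = Z[y == 0].mean(axis=0)
    case_centroid = Z[y == 1].mean(axis=0)
    d_control = np.linalg.norm(Z - control_centroid, axis=1)
    d_case = np.linalg.norm(Z - case_centroid, axis=1)
    return d_control - d_case


def augment_with_score(X, y, k=2):
    """Append the latent-space score as one extra column of the feature matrix."""
    Z = latent_representation(X, k)
    score = disease_score(Z, y)
    return np.hstack([X, score[:, None]])


if __name__ == "__main__":
    X, y = make_toy_data()
    X_aug = augment_with_score(X, y, k=2)
    print(X.shape, "->", X_aug.shape)
```

In practice the centroids would be estimated on training samples only and then applied to held-out samples, and the augmented matrix would be passed to a gradient boosting classifier as described in the abstract.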
Keywords/Search Tags: Metagenomics, Ensemble deep learning, Disease prediction, Gradient boosting model