Prediction Of LncRNA And Major Human Disease Associations Via Multi-network Fusion

Posted on: 2022-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Chen
GTID: 2480306491455054
Subject: Computer application technology
Abstract/Summary:
Long non-coding RNAs (lncRNAs), a diverse group of ribonucleic acid molecules, have been a focus of recent oncogenomics studies. LncRNAs participate in a variety of biological processes, and their functions are concentrated mainly in two areas: assembling protein complexes and competing with other non-coding RNAs for binding. Given these critical functions and the related pathological experiments, mutations and dysregulation of lncRNAs can be confirmed to be strongly associated with many major human diseases. Predicting lncRNA-disease associations is therefore promising for the diagnosis and treatment of disease.

To date, numerous computational methods have been applied to lncRNA-disease association prediction. Some of them have difficulty integrating multiple heterogeneous data sources and predict associations by splicing and fusing feature vectors. They often ignore the potential associations and the noise between data sources, so feature fusion is inadequate and unbalanced, which ultimately leads to an explosion in feature-vector dimensionality and to overfitting or underfitting during model training. Moreover, their accuracy during model training needs to be improved, and the tuning of key hyperparameters is inefficient.

This paper proposes Bi-Aero LDA and Aero LDA, adaptive multi-network fusion learning frameworks for predicting associations between lncRNAs and diseases. First, we use disease Gaussian semantic similarity, lncRNA Gaussian functional similarity, and gene Jaccard similarity to fuse information from multiple data sources and construct, respectively, a disease similarity network, an lncRNA similarity network, a gene-fused disease similarity network, and a gene-fused lncRNA similarity network; the random walk with restart algorithm is then used to obtain the diffusion state of each node in each sub-network. Next, diffusion component analysis combined with singular value decomposition is applied for feature fusion and dimensionality reduction of the diffusion states. Finally, the fused network features are fed into an extreme gradient boosting model for iterative training, and the particle swarm algorithm is used to globally optimize the key hyperparameters of extreme gradient boosting, yielding the Bi-Aero LDA and Aero LDA prediction frameworks.

In the model evaluation experiments, our methods not only perform better than several classes of existing methods but also achieve superior results both in predicting lncRNA associations for known diseases and in predicting associations for new diseases. To further verify the prediction performance of the two methods, case studies of major human diseases such as breast cancer, ovarian cancer, prostate cancer, and colorectal cancer are analyzed to validate their effectiveness.
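The first two stages of the pipeline (building a similarity network, then computing diffusion states by random walk with restart) can be illustrated with a minimal sketch. This is not the thesis's implementation: the gene sets, the entity names `D1`..`D3`, the restart probability of 0.5, and all function names are hypothetical, and only the Jaccard-based network is shown. Each diffusion state is the stationary distribution of a walker that, at every step, either follows a similarity-weighted edge or jumps back to its start node.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(gene_sets):
    """Pairwise Jaccard similarity between entities described by gene sets."""
    names = sorted(gene_sets)
    matrix = [[jaccard(gene_sets[u], gene_sets[v]) for v in names] for u in names]
    return names, matrix


def row_normalize(matrix):
    """Convert non-negative weights into a row-stochastic transition matrix."""
    normalized = []
    for row in matrix:
        total = sum(row)
        n = len(row)
        # A node with no weight at all falls back to a uniform transition row.
        normalized.append([w / total for w in row] if total > 0 else [1.0 / n] * n)
    return normalized


def random_walk_with_restart(transition, start, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate s <- (1 - restart) * s @ P + restart * e_start until convergence."""
    n = len(transition)
    e = [0.0] * n
    e[start] = 1.0
    state = e[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(state[i] * transition[i][j] for i in range(n))
            + restart * e[j]
            for j in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    return state


# Hypothetical toy data: three diseases and the genes associated with each.
gene_sets = {
    "D1": {"g1", "g2", "g3"},
    "D2": {"g2", "g3", "g4"},
    "D3": {"g5"},
}
names, sim = similarity_matrix(gene_sets)
transition = row_normalize(sim)
# One diffusion state per node; row k is the walk restarted at node k.
diffusion_states = [random_walk_with_restart(transition, k) for k in range(len(names))]
```

In this toy network `D1` and `D2` share two of four genes, so the walk started at `D1` places some probability on `D2` and none on the disconnected `D3`. In the pipeline described above, diffusion states like these, one matrix per sub-network, would then be passed to the feature fusion and dimensionality reduction step.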
Keywords/Search Tags: Disease, lncRNA, Multiple Network Fusion, Extreme Gradient Boosting, Particle Swarm Algorithm