Prediction of LncRNA-miRNA Interaction and miRNA-Disease Association Based on Similarity

Posted on: 2021-02-20  Degree: Master  Type: Thesis
Country: China  Candidate: J Cui  Full Text: PDF
GTID: 2370330647461863  Subject: Engineering
Abstract/Summary:
Non-coding RNAs are RNAs that do not encode proteins, and for a long time they were regarded as the "dark matter" of the genome. As research accumulated, however, scientists found that only about 2% of the RNA in the human genome encodes proteins; the remaining 98% is not translated into protein but functions in various biological processes as non-coding RNA. MiRNA is a class of short non-coding RNA. Research shows that mutual regulation between lncRNA and miRNA may cause a variety of diseases, and that lncRNAs and miRNAs themselves participate in disease processes. Therefore, if more lncRNA-miRNA interactions and miRNA-disease associations can be discovered, disease mechanisms can be understood at the molecular level through miRNA regulatory behavior, which may help prevent disease. Some lncRNA-miRNA interaction data and disease-miRNA association data have already been obtained through biochemical experiments, but traditional experimental methods are time-consuming and labor-intensive, and they offer no guidance on which candidate pairs to test. In recent years, the rise of bioinformatics as an interdisciplinary field has brought machine learning methods into biological research, and many effective methods have been proposed for interaction prediction with good reported results. This thesis studies the application of machine learning to the interaction prediction problem. The main work is as follows:

(1) For the prediction of lncRNA-miRNA interactions, a prediction model based on heterogeneous graph inference (SNFHGILMI) is proposed. First, multiple similarity measures are calculated: lncRNA sequence similarity, miRNA sequence similarity, lncRNA Gaussian kernel similarity, and miRNA Gaussian kernel similarity. Second, the similarity network fusion method is used to integrate these measures into one similarity network for lncRNAs and one for miRNAs. Then, a bipartite network is constructed by combining the known interaction network with the lncRNA and miRNA similarity networks. Finally, heterogeneous graph inference is applied to the bipartite network to predict potential lncRNA-miRNA interactions.

(2) For the prediction of miRNA-disease associations, an ensemble learning method based on multi-feature combination (MFXGBMDA) is proposed. First, miRNA functional similarity, disease semantic similarity, and their Gaussian interaction profile kernel similarities are calculated as the original features. Second, principal component analysis (PCA) is used to reduce the feature dimension, yielding low-dimensional features from the original features while minimizing information loss. Then, disease-miRNA pair features are constructed from the disease features and the miRNA features. Finally, multiple sub-classifiers are trained with XGBoost on multiple sets of disease-miRNA pair samples, and these sub-classifiers are integrated into an ensemble model that predicts the probability of a disease-miRNA association.
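As a rough illustration of the Gaussian similarity and heterogeneous graph inference steps in part (1), the sketch below works on a toy interaction matrix. It is not the thesis implementation. It assumes the commonly used Gaussian interaction profile kernel, with the bandwidth normalized by the mean squared profile norm, and a commonly used iterative update of the form P <- alpha * S_l P S_m + (1 - alpha) * A. It also skips sequence similarity and similarity network fusion, using the Gaussian similarity alone. The function names, the toy matrix, and alpha = 0.4 are all hypothetical.

```python
import math

def gip_kernel(profiles):
    """Gaussian interaction profile kernel between the rows of a 0/1 matrix."""
    norms = [sum(v * v for v in row) for row in profiles]
    mean_norm = sum(norms) / len(norms)
    # Bandwidth normalized by the mean squared profile norm (assumed convention).
    gamma = 1.0 / mean_norm if mean_norm > 0 else 1.0
    n = len(profiles)
    return [[math.exp(-gamma * sum((a - b) ** 2
                                   for a, b in zip(profiles[i], profiles[j])))
             for j in range(n)] for i in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]

def normalize(sim):
    """Symmetric normalization S[i][j] / sqrt(rowsum_i * rowsum_j)."""
    deg = [sum(row) for row in sim]
    n = len(sim)
    return [[sim[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def hgi_scores(adj, sim_l, sim_m, alpha=0.4, tol=1e-6, max_iter=1000):
    """Iterate P <- alpha * S_l P S_m + (1 - alpha) * A until the change is small."""
    s_l, s_m = normalize(sim_l), normalize(sim_m)
    scores = [row[:] for row in adj]
    for _ in range(max_iter):
        prop = matmul(matmul(s_l, scores), s_m)
        new = [[alpha * p + (1 - alpha) * a for p, a in zip(prow, arow)]
               for prow, arow in zip(prop, adj)]
        change = max(abs(x - y)
                     for nrow, srow in zip(new, scores)
                     for x, y in zip(nrow, srow))
        scores = new
        if change < tol:
            break
    return scores

# Toy data (hypothetical): 3 lncRNAs (rows) x 4 miRNAs (columns), 1 = known interaction.
adj = [[1, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 1, 1]]
sim_l = gip_kernel(adj)             # lncRNA-lncRNA similarity
sim_m = gip_kernel(transpose(adj))  # miRNA-miRNA similarity
scores = hgi_scores(adj, sim_l, sim_m)
```

In this sketch, each entry of `scores` is a propagated score for one lncRNA-miRNA pair. Unknown pairs (zeros in `adj`) receive a score through their similar neighbors, and ranking them gives candidate interactions.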
Keywords/Search Tags: LncRNA-miRNA interaction, Disease-miRNA association, Heterogeneous graph inference, XGBoost, Bioinformatics