Research On Prediction Of Long Non-coding RNA And Protein Interactions Based On Matrix Factorization Methods

Posted on: 2019-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Zhang
GTID: 2370330551956844
Subject: Biomedical engineering
Abstract/Summary:
LncRNAs participate in complex cellular processes and play important roles in biological processes such as chromatin modification, splicing, transcription, and gene expression. LncRNAs are also reported to be closely related to a range of diseases. Studies demonstrate that most lncRNAs exert their functions by interacting with corresponding RNA-binding proteins, so identifying lncRNA-protein interactions is an important research problem. Approaches to identifying lncRNA-protein interactions can be divided into experimental methods and computational methods. Experimental methods detect specific lncRNAs and obtain accurate identification results. However, because experimental identification is costly, various computational approaches have been developed to discover lncRNA-protein associations.

In this work, the lncRNA similarity matrix is constructed from lncRNA expression profiles, the protein similarity matrix is constructed from the corresponding gene ontology information, and the lncRNA-protein interaction matrix is constructed from lncRNA-protein interaction data. A novel approach named LPGNMF (predicting lncRNA-protein interactions using graph regularized nonnegative matrix factorization) is developed to discover unobserved lncRNA-protein interactions. LPGNMF maps lncRNAs and proteins into a latent feature space by matrix factorization. Predictions for unobserved interactions are generated from the inner products of lncRNA-protein feature-vector pairs. Graph regularization terms for lncRNAs and proteins are adopted to guide the matrix factorization process and improve the prediction performance of the algorithm. Comparison with other methods using leave-one-out cross validation (LOOCV) indicates that LPGNMF can discover novel lncRNA-protein interactions efficiently.

To solve the cold start problem, a novel algorithm called LPKBMF (predicting lncRNA-protein interactions using kernelized Bayesian matrix factorization) is developed to deal with out-of-sample lncRNAs and out-of-sample proteins. In this algorithm, the similarity matrices of lncRNAs and proteins are used as kernel matrices to integrate side information, and interactions between lncRNAs and proteins are calculated within a Bayesian matrix factorization framework. Comparison with other methods using cross validation indicates that LPKBMF can effectively predict the interactions of out-of-sample lncRNAs and out-of-sample proteins.
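The LPGNMF idea described in the abstract (nonnegative latent factors, inner-product scores, and graph regularization from the two similarity matrices) can be illustrated with a minimal sketch. The abstract does not give the objective function or the update rules, so the code below assumes the standard graph regularized NMF objective, ||Y - U V^T||_F^2 + lam_l Tr(U^T L_l U) + lam_p Tr(V^T L_p V) with L = D - S, solved by multiplicative updates. The function name `gnmf_sketch`, the parameters `rank`, `lam_l`, `lam_p`, and the toy data are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def gnmf_sketch(Y, S_l, S_p, rank=2, lam_l=0.1, lam_p=0.1,
                n_iter=200, eps=1e-9, seed=0):
    """Graph regularized NMF with multiplicative updates (illustrative only).

    Y   : (n_lncRNA, n_protein) nonnegative interaction matrix
    S_l : (n_lncRNA, n_lncRNA) nonnegative symmetric lncRNA similarity
    S_p : (n_protein, n_protein) nonnegative symmetric protein similarity
    Returns nonnegative latent factors U (lncRNAs) and V (proteins).
    """
    rng = np.random.default_rng(seed)
    n_l, n_p = Y.shape
    U = rng.random((n_l, rank))
    V = rng.random((n_p, rank))
    # Degree matrices of the similarity graphs; the Laplacian is D - S.
    D_l = np.diag(S_l.sum(axis=1))
    D_p = np.diag(S_p.sum(axis=1))
    for _ in range(n_iter):
        # Numerators and denominators are nonnegative, so U and V stay >= 0.
        U *= (Y @ V + lam_l * S_l @ U) / (U @ (V.T @ V) + lam_l * D_l @ U + eps)
        V *= (Y.T @ U + lam_p * S_p @ V) / (V @ (U.T @ U) + lam_p * D_p @ V + eps)
    return U, V


# Toy data (made up): 4 lncRNAs, 3 proteins, 1 = known interaction.
Y = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]], dtype=float)
toy_rng = np.random.default_rng(1)
A = toy_rng.random((4, 4))
B = toy_rng.random((3, 3))
S_l = (A + A.T) / 2  # symmetric stand-in for expression-profile similarity
S_p = (B + B.T) / 2  # symmetric stand-in for gene-ontology similarity

U, V = gnmf_sketch(Y, S_l, S_p)
# Score for each lncRNA-protein pair is the inner product of its latent vectors.
scores = U @ V.T
```

Entries of `scores` at positions where `Y` is zero would serve as the ranking scores for unobserved pairs. In a LOOCV setting, one known interaction at a time would be set to zero before factorization and its recovered score compared against the unobserved pairs.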
Keywords/Search Tags: Long non-coding RNA, protein, interaction, matrix factorization