Method For Prediction Of LncRNA-disease Associations Based On Link Prediction

Posted on: 2016-09-02, Degree: Master, Type: Thesis
Country: China, Candidate: J L Zheng, Full Text: PDF
GTID: 2310330488474133, Subject: Computer application technology
Abstract/Summary:
A growing number of studies indicate that long non-coding RNAs (lncRNAs) play important roles in many biological processes, so their mutation or dysfunction can lead to a number of complex diseases. Research and data on lncRNA-related diseases remain relatively scarce, so predicting potential lncRNA-disease associations with bioinformatics methods has become the trend in this field. Such prediction is of great significance for exploring pathogenesis and for disease diagnosis, treatment, prognosis and prevention.

At present, there are two main kinds of methods for predicting potential lncRNA-disease associations: those based on computational models and those based on network propagation. Computational model methods predict associations either by integrating gene-disease association data with lncRNA expression profile data and performing enrichment analysis with the hypergeometric distribution, or by using a Gaussian kernel function to calculate similarity and constructing a Laplacian operator to solve for the optimal solution. These methods suffer from high model and computational complexity. Network propagation methods calculate the similarity between lncRNAs with a resource allocation algorithm and then diffuse this similarity information across the whole network with a propagation algorithm. This requires computing the n-th power of the adjacency matrix, or approximating it iteratively, which again means high computational complexity.

Link prediction based on local structure has the advantages of simple modeling, low complexity and high accuracy, and it is consistent with the biological hypothesis that similar diseases tend to be caused by the same or similar lncRNAs. Inspired by this, we introduce link prediction into lncRNA-disease association prediction. However, link prediction rests on the "triangle closing" model, which relies on common neighbors. It cannot be applied directly to a heterogeneous bipartite network, because there the neighbors of an lncRNA are all diseases and the neighbors of a disease are all lncRNAs, so an lncRNA and a disease never share a neighbor. To solve this problem, we propose the concept of "common neighbors" between nodes in the two different node sets of a bipartite graph and establish a "quadrilateral closing" model based on this concept. Using this model, we modify nine link prediction similarity indexes to make them suitable for bipartite networks and apply them to lncRNA-disease association prediction.

We perform leave-one-out cross-validation on the lncRNA-disease bipartite network. The BPA method reaches the highest AUC, 0.9377, a relative increase of nearly 19% over previous methods, whose highest AUC is 0.7881. In addition, with the BPA method 14 edges rank first and 81 edges rank in the top 1% among all 19,000 prediction rankings. Case studies on glioma and lung cancer also show the strong predictive ability of our algorithm.

These results show that our method predicts lncRNA-disease associations with high accuracy and that it improves on and supplements existing methods. Moreover, we provide a new way to approach the lncRNA-disease association prediction problem, based on link prediction with local structural similarity, which helps to simplify the modeling and reduce the computational complexity.
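
The "quadrilateral closing" idea can be illustrated with a minimal Python sketch. The abstract does not give the formulas for the nine modified indexes or for BPA, so the score below is only one plausible reading, assumed for illustration: for a candidate pair (lncRNA, disease), count the 3-step paths lncRNA - disease' - lncRNA' - disease that a new edge would close into a quadrilateral. The function names, the toy labels L0..L2 and D0..D2, and the tie-handling rule in the held-out ranking are all hypothetical.

```python
def build_neighbors(edges):
    """Index a bipartite edge set of (lncRNA, disease) pairs from both sides."""
    lnc_to_dis, dis_to_lnc = {}, {}
    for lnc, dis in edges:
        lnc_to_dis.setdefault(lnc, set()).add(dis)
        dis_to_lnc.setdefault(dis, set()).add(lnc)
    return lnc_to_dis, dis_to_lnc


def quadrilateral_score(edges, lnc, disease):
    """Assumed bipartite 'common neighbor' count (not the thesis's exact index).

    Counts paths lnc - d' - l' - disease, i.e. pairs (d', l') where l' is
    already linked to `disease` and shares another disease d' with `lnc`.
    """
    lnc_to_dis, dis_to_lnc = build_neighbors(edges)
    own = lnc_to_dis.get(lnc, set()) - {disease}
    score = 0
    for other in dis_to_lnc.get(disease, set()):
        if other != lnc:
            score += len(own & lnc_to_dis[other])
    return score


def held_out_rank(edges, lncs, diseases, held_out):
    """Leave-one-out step: hide one known edge and rank it among unknown pairs.

    Rank 1 is best. Ties are resolved optimistically (an assumption): only
    candidates with a strictly higher score are ranked ahead.
    """
    train = set(edges) - {held_out}
    candidates = [(l, d) for l in lncs for d in diseases if (l, d) not in train]
    target = quadrilateral_score(train, *held_out)
    better = sum(1 for l, d in candidates
                 if quadrilateral_score(train, l, d) > target)
    return better + 1


# Toy network with hypothetical labels; not data from the thesis.
lncs = ["L0", "L1", "L2"]
diseases = ["D0", "D1", "D2"]
edges = {("L0", "D0"), ("L0", "D1"),
         ("L1", "D0"), ("L1", "D1"), ("L1", "D2"),
         ("L2", "D2")}

# L0 and D2 are unlinked; L1 is linked to D2 and shares D0 and D1 with L0.
print(quadrilateral_score(edges, "L0", "D2"))  # 2
print(held_out_rank(edges, lncs, diseases, ("L0", "D1")))
```

Each score here needs only the immediate neighborhoods of the two nodes, which is the sense in which a local structural index avoids the matrix powers or iterations that network propagation requires.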
Keywords/Search Tags: long non-coding RNA, disease, association prediction, bipartite graph, link prediction