Research On Learning Of Complex Features For Prediction Of LncRNA-Disease Associations

Posted on: 2024-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: G S Cai
Full Text: PDF
GTID: 2530307139458424
Subject: Computer technology
Abstract/Summary:
Long non-coding RNAs (lncRNAs) are non-coding RNAs longer than 200 nt. LncRNAs play key regulatory roles in a variety of biological processes, and their abnormal expression is closely related to the occurrence and development of many human diseases. Accurate prediction of lncRNA-disease associations can therefore help clarify the pathogenesis of diseases and guide subsequent experimental validation. However, traditional biological experiments are time-consuming, costly, and inefficient, so using computational models to infer potential lncRNA-disease associations from validated ones has become an important and effective complementary approach. Existing computational models are still limited in their ability to learn the complex features of heterogeneous biological networks. To address this problem, this thesis proposes two new computational models for predicting lncRNA-disease associations. The main contents are as follows:

1. LncRNA-disease association prediction model based on a Heterogeneous information Fusion Graph Convolutional Network (HFGCNLDA): A graph convolutional network (GCN) is used to mine and fuse multi-level interaction information among nodes, improving the model's ability to learn complex features from heterogeneous networks. Firstly, a lncRNA-disease-miRNA heterogeneous network is constructed from node similarities and known associations. Secondly, a GCN aggregator and a heterogeneous feature fusion module are used to obtain comprehensive representations of the nodes. Finally, an inner product decoder is used to reconstruct the lncRNA-disease association probability matrix. In comparison experiments with five current state-of-the-art models, HFGCNLDA achieves the highest AUROC and AUPR values (0.9932 and 0.8997), confirming the effectiveness of fusing multi-source heterogeneous data and mining multi-level interaction information among heterogeneous network nodes. Case studies on lung cancer and colon cancer show that HFGCNLDA can predict potential lncRNA-disease associations.

2. LncRNA-disease association prediction model based on a Semantic and Global dual Attention mechanism (SGALDA): Building on HFGCNLDA, meta-paths and a dual attention mechanism are introduced to integrate semantic and neighborhood information, further improving the model's ability to learn complex features from heterogeneous networks. Firstly, a lncRNA-disease-miRNA heterogeneous network is constructed from node similarities and known associations, and local representations of the nodes are obtained using GCN aggregation and feature fusion. Secondly, multiple semantic sub-networks are extracted from the heterogeneous network using meta-paths, and a graph convolutional network is applied to each sub-network to learn semantic representations of the nodes. Thirdly, the dual attention mechanism fuses the semantic and local representations into a more comprehensive representation of each node. Finally, an inner product decoder is used to reconstruct the lncRNA-disease association probability matrix. In comparison experiments with six current state-of-the-art models, SGALDA achieves the highest AUROC and AUPR values (0.9945 and 0.9167). Case studies on lung cancer, colon cancer, breast cancer, and gastric cancer confirm that the model can infer potential lncRNA-disease associations, indicating that it is a reliable candidate predictive model.
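
The steps named in the abstract (GCN aggregation over a heterogeneous network, meta-path sub-network extraction, and inner product decoding) can be illustrated with a minimal sketch. This is not the thesis implementation: the toy network, the fixed weight matrix, all function names, and the choice of a standard symmetric-normalized GCN layer are assumptions made for illustration, and the feature fusion and dual attention modules are omitted because the abstract does not specify them. Plain Python lists are used to keep the example dependency-free.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def normalize_adjacency(adj):
    """Add self-loops, then apply the symmetric normalization D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj_norm, features, weight):
    """One GCN propagation step: ReLU(A_norm @ X @ W)."""
    h = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in h]

def metapath_adjacency(lnc_mi):
    """lncRNA-miRNA-lncRNA meta-path (assumed example): link lncRNAs sharing a miRNA."""
    counts = matmul(lnc_mi, transpose(lnc_mi))
    n = len(counts)
    return [[1.0 if counts[i][j] > 0 and i != j else 0.0 for j in range(n)] for i in range(n)]

def inner_product_decoder(lnc_emb, dis_emb):
    """Score every lncRNA-disease pair as sigmoid(z_lncRNA . z_disease)."""
    scores = matmul(lnc_emb, transpose(dis_emb))
    return [[1.0 / (1.0 + math.exp(-s)) for s in row] for row in scores]

# Toy heterogeneous network with made-up data:
# nodes 0-1 are lncRNAs, nodes 2-3 are diseases, node 4 is a miRNA.
ADJ = [
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 1, 0],
]
FEATURES = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]  # one-hot inputs
WEIGHT = [[0.5, -0.2], [0.1, 0.4], [0.3, 0.3], [-0.1, 0.6], [0.2, 0.2]]  # fixed, untrained

adj_norm = normalize_adjacency(ADJ)
embeddings = gcn_layer(adj_norm, FEATURES, WEIGHT)
lnc_emb, dis_emb = embeddings[:2], embeddings[2:4]
prob = inner_product_decoder(lnc_emb, dis_emb)  # 2 x 2 association probability matrix

LNC_MI = [[1.0], [1.0]]  # both lncRNAs are associated with the single miRNA
lml = metapath_adjacency(LNC_MI)  # semantic sub-network over lncRNAs
```

In a trained model the weight matrix would be learned by minimizing a reconstruction loss against the known association matrix, and each meta-path sub-network such as `lml` would be passed through its own GCN before the representations are fused.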
Keywords/Search Tags: association prediction, heterogeneous network, meta-path, attention mechanism, Graph Convolutional Network (GCN)