Research On The Prediction Method Of LncRNA And Disease Association For Heterogeneous Network

Posted on: 2022-12-29
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Zhan
GTID: 2514306614458384
Subject: Computer Software and Application of Computer
Abstract/Summary:
The abnormal expression of long non-coding RNAs (lncRNAs) is usually associated with various human diseases, so predicting disease-related lncRNAs helps to elucidate complex disease pathogenesis and provides a basis for the diagnosis and prevention of diseases. However, screening candidate lncRNAs for diseases through biological experiments has many disadvantages, such as demanding experimental conditions and high time cost. Combining computer science techniques with bioinformatics to predict lncRNA-disease associations could reduce experimental costs. Studying the relationship between lncRNAs and diseases is therefore a meaningful topic with high application potential. Extracting rich semantic information from multi-source data related to lncRNAs and diseases is critical for association prediction, but challenging. Modeling the multi-level node-pair attributes and the node-neighbor topological relationships of lncRNA-disease pairs is also challenging. To address these challenges, we build association prediction methods from three perspectives: the attribute level of lncRNA-disease node pairs, the topological level of node neighbors, and attribute and topological attention mechanisms. This thesis studies lncRNA-disease association prediction using deep learning methods. The main work and contributions are reflected in the following three aspects.

First, to study the multiple connection relationships between nodes, this thesis builds the lncRNA-disease association prediction model DSCNLDP, based on deep and shallow convolutional neural networks. Drawing on the biological premises that a lncRNA-disease pair is associated with, similar to, or interacts with each lncRNA, disease, and miRNA, the thesis proposes a novel node-pair-level attribute embedding mechanism, which establishes the attribute matrix of node pairs from multi-source data. The attribute representations of lncRNA-disease pairs are learned by a multi-layer convolutional neural network that fuses shallow detail features and deep representative features of node pairs. Finally, the fused attribute representations are fed into a fully connected neural network to reveal the associations between lncRNAs and diseases. To explore the contribution of deep representative attributes and shallow detail attributes to association prediction, we conduct ablation experiments. In addition, we design experiments on the model's hyperparameters and select the best set of experimental parameters. DSCNLDP is applied to public datasets. Multiple evaluation metrics and case studies on lung cancer, prostate cancer, and colon cancer show that the proposed method achieves good results.

Second, to study the neighbor topology of lncRNA and disease nodes, this thesis builds the lncRNA-disease association prediction model GATLDP, based on the graph attention network, to express the topology in heterogeneous networks. First, in the lncRNA similarity network and the disease similarity network, a new embedding mechanism is proposed at the neighbor-node level to construct the topological feature embedding of a lncRNA (or disease) and its most similar neighbors. Then, a framework built on the graph attention network projects the node topology vectors into a low-dimensional space to obtain denser node representations. A self-attention mechanism at the neighbor-node level is built to learn the importance of each neighbor. We extend the neighbor attention mechanism to multiple heads, which stabilizes the learning process of attention and captures neighbor topological information at multiple levels. Through experiments, we discuss the hyperparameter configuration with which GATLDP achieves its best performance. The experimental results show that the GATLDP model achieves better performance on both AUC and AUPR. Case studies on three common cancers further demonstrate the predictive performance of GATLDP.

Third, to study node attributes and neighbor topology together, this thesis builds the prediction model GTAN, based on a dual attention mechanism over attributes and topology, to infer the association propensity between lncRNA and disease nodes. The attention mechanism highlights the local information that is important to the task during the learning process of the neural network, and is effective for association prediction. An encoding module with an attribute-level attention mechanism is used to learn the attribute representation of node pairs. Attribute attention distinguishes how much each related attribute of a node contributes. After that, we build a topological encoding module with a topological attention mechanism, which learns contextual information about the interdependence among multiple local topological representations. The two encoding modules are trained separately, each producing an association score, and a hyperparameter then weighs the contribution of the two modules. GTAN outperforms 8 state-of-the-art prediction methods on AUC and AUPR. Furthermore, the improvement in recall of GTAN indicates that our model retrieves more true lncRNA-disease associations in the top-ranked list of prediction results. Case studies on lung, prostate, and colon cancers further confirm the ability of GTAN to discover potential lncRNA-disease associations.
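
The abstract states that GTAN's two encoding modules yield association scores, that a hyperparameter balances the two modules, and that recall over the top-ranked predictions is one of the measures reported. A minimal Python sketch of that final step follows. It rests on assumptions that are not taken from the thesis: the convex-combination form of the weighting, the names `fuse_scores`, `recall_at_k` and `lam`, and all of the toy numbers are purely illustrative.

```python
def fuse_scores(attr_scores, topo_scores, lam=0.5):
    """Blend two per-candidate score lists.

    Assumed form (not from the thesis): lam * attribute + (1 - lam) * topology.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(attr_scores) != len(topo_scores):
        raise ValueError("score lists must have equal length")
    return [lam * a + (1.0 - lam) * t for a, t in zip(attr_scores, topo_scores)]


def recall_at_k(scores, labels, k):
    """Fraction of all true associations found among the k highest-scored candidates."""
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    # Candidate indices ordered from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:k])
    return hits / total_pos


# Toy data: five hypothetical candidate lncRNAs for one disease (made-up numbers).
attr_scores = [0.9, 0.2, 0.6, 0.1, 0.4]   # stand-in for the attribute module's scores
topo_scores = [0.5, 0.8, 0.7, 0.2, 0.1]   # stand-in for the topology module's scores
labels = [1, 0, 1, 0, 1]                  # 1 marks a known association

fused = fuse_scores(attr_scores, topo_scores, lam=0.5)
# Here 2 of the 3 known associations rank in the top 2.
print(recall_at_k(fused, labels, k=2))
```

In this sketch, setting `lam` to 1.0 uses only the attribute scores and setting it to 0.0 uses only the topology scores. A value in between would normally be chosen on validation data.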
Keywords/Search Tags: LncRNA-disease association prediction, Multilayer convolutional neural networks, Graph neural networks, Neighbor-level self-attention mechanism, Attribute-level attention, Topology-level attention