| Long non-coding RNA (lncRNA) is a class of RNA that does not encode proteins but is biologically active. A large body of evidence shows that lncRNAs are closely related to a variety of human diseases. In particular, lncRNAs can serve as potential biomarkers for the clinical diagnosis and treatment of cancer. Thousands of lncRNAs have been discovered to date, but only a few hundred have a clear biological function and an experimentally verified lncRNA-disease association (LDA). Because verifying LDAs through biological experiments is costly and slow, predicting LDAs with bioinformatics methods is of great significance. Existing LDA prediction methods still have two problems. First, the data are sparse and traditional methods identify potential LDAs from only the local topology of the graph, which limits model performance. Second, prediction models consider only lncRNAs and diseases, so they rely on a single data source and leave out other relevant biological information. To address these problems, this paper carries out the following research. For the first problem, an LDA prediction model based on deep learning and a knowledge graph (CMLD-LDA) is proposed. The model collects multiple kinds of relationship data among circular RNA (circRNA), microRNA (miRNA), lncRNA and diseases and constructs a knowledge graph from them to alleviate data sparsity. On this basis, a graph attention network assigns weights to the different neighbor nodes to obtain embeddings of the knowledge graph that capture higher-order interaction information. A multi-layer perceptron then estimates the association score between an lncRNA and a disease from their embedding vectors. Experimental results show that CMLD-LDA outperforms current state-of-the-art methods. To further improve on CMLD-LDA, an LDA prediction model based on a multi-biomolecular network (LMPDN-LDA) is proposed for the second problem. It integrates a variety of biological data into a comprehensive association network and uses K-mer feature representation and MeSH descriptors to obtain the attribute features of nodes. A graph embedding model extracts the structural features of nodes in the association network, and a random forest classifier predicts potential LDAs. Experimental results show that LMPDN-LDA performs significantly better than CMLD-LDA and outperforms existing state-of-the-art methods. |
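
As an illustration of the K-mer feature representation mentioned for LMPDN-LDA, the following minimal Python sketch turns an RNA sequence into a fixed-length vector of k-mer frequencies. The abstract does not state the value of k, the alphabet, or the normalization, so the function name `kmer_features`, the default k = 3, the ACGU alphabet, and the frequency normalization are all assumptions made for illustration rather than the settings used in the paper.

```python
from itertools import product


def kmer_features(sequence, k=3, alphabet="ACGU"):
    """Return the normalized k-mer frequency vector of an RNA sequence.

    Hypothetical helper: k, the alphabet and the normalization are
    assumptions, not values taken from the paper.
    """
    # Treat DNA-style input as RNA so that T and U are counted together.
    seq = sequence.upper().replace("T", "U")

    # Enumerate all possible k-mers in a fixed order (4**k entries for ACGU).
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}

    # Slide a window of length k over the sequence and count each k-mer.
    counts = [0] * len(kmers)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in index:  # skip windows with ambiguous bases such as N
            counts[index[window]] += 1

    # Normalize counts to frequencies; an empty or too-short sequence
    # yields an all-zero vector.
    total = sum(counts)
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]
```

With k = 3 every sequence maps to a 64-dimensional vector regardless of its length, which is what makes such vectors usable as node attribute features alongside structural features in a downstream classifier.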