Prediction of ncRNA-Drug Association and Protein Function Based on Deep Learning

Posted on: 2023-08-13
Degree: Master
Type: Thesis
Country: China
Candidate: H L Xu
Full Text: PDF
GTID: 2544307070984369
Subject: Engineering
Abstract/Summary:
In recent years, a main reason for the relatively poor R&D productivity of the pharmaceutical industry has been the difficulty of drug-target selection. More than 99% of approved drugs target specific proteins, so the focus of target selection has shifted to other macromolecules, including non-coding RNA. The ability of non-coding RNA to control the expression of oncogenes and tumor suppressor genes makes it a potential target for cancer drug development. At the same time, proteins remain the main targets in drug discovery, and studying protein function gives a deeper understanding of target proteins and helps speed up the drug discovery process. With the rapid development of next-generation sequencing technology and genomics, sequence data for non-coding RNAs and proteins are growing exponentially. Analyzing and verifying these data through traditional biological experiments is resource-intensive and inefficient. Deep learning has been widely and successfully applied in bioinformatics because it can process large biological datasets and fit high-dimensional, sparse, complex data. In this thesis, deep learning is used to build models that predict associations between non-coding RNAs and drug sensitivity/resistance, and models that predict protein function. The main contents of this thesis are as follows:

1. To predict associations between miRNAs and drug sensitivity, a computational method based on a simplified graph convolutional neural network, named LGCMDS, is proposed. The model abandons two common designs of the standard graph convolutional network, feature transformation and nonlinear activation, and retains only the most essential component, neighborhood aggregation, which makes the model easier to train. By exploiting the high-order connectivity of the miRNA-drug bipartite graph, miRNA-drug interaction information is effectively integrated into the embedding process. Five-fold cross-validation gives an AUC of 0.8872 and an AUPR of 0.9026. Compared with five state-of-the-art models, LGCMDS achieves the best results. In addition, case studies of two common drugs, cisplatin and doxorubicin, further demonstrate the effectiveness of LGCMDS in predicting potential miRNA-drug sensitivity associations.

2. To further study the association between non-coding RNA and drug resistance, an adjacency matrix is constructed from known associations between non-coding RNAs and drug resistance, and a graph convolutional neural network method based on linear residuals is developed to predict such associations without introducing or defining additional data. The method first aggregates the latent features of adjacent nodes in each graph convolution layer. Information is then passed between layers by a linear function. Finally, the embeddings of all layers are combined to make the prediction. The experimental results show that this method performs more reliably than seven other state-of-the-art methods, with an average AUC of 0.8987. Case studies demonstrate that this method is an effective tool for predicting the association between non-coding RNA and drug resistance.

3. For the problem of protein function prediction, an integrated deep-learning model is constructed, drawing on ideas from natural language processing. The human protein sequence data are provided by CAFA. First, the corresponding protein interaction and domain data are collected and vectorized. Then multi-head attention and a stacked autoencoder are used to extract features from the protein sequences and the protein interaction network, respectively, and a convolutional neural network is used to capture protein domain features. Finally, these features are integrated and a deep neural network classifier is used to predict protein function. The model is compared with two other protein function prediction methods, and the results show that it achieves the highest score in the prediction of BP terms and CC terms, which demonstrates the feasibility of the model for protein function prediction.
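
The simplified graph convolution described in the first contribution (neighborhood aggregation with no feature transformation and no nonlinear activation) can be illustrated with a minimal sketch. This is not the thesis's LGCMDS implementation: the symmetric degree normalization, the averaging of layer embeddings, the inner-product score, and all function names and toy values below are assumptions made for illustration only.

```python
import math


def normalize_bipartite(adj):
    """Scale each edge by 1 / sqrt(deg(miRNA) * deg(drug)) (assumed normalization)."""
    n_rows, n_cols = len(adj), len(adj[0])
    row_deg = [sum(row) for row in adj]
    col_deg = [sum(adj[i][j] for i in range(n_rows)) for j in range(n_cols)]
    norm = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            if adj[i][j]:  # an existing edge implies both degrees are non-zero
                norm[i][j] = adj[i][j] / math.sqrt(row_deg[i] * col_deg[j])
    return norm


def propagate(norm, mirna_emb, drug_emb):
    """One layer of pure neighborhood aggregation: no weight matrix, no activation."""
    dim = len(mirna_emb[0])
    n_m, n_d = len(mirna_emb), len(drug_emb)
    new_mirna = [[sum(norm[i][j] * drug_emb[j][k] for j in range(n_d))
                  for k in range(dim)] for i in range(n_m)]
    new_drug = [[sum(norm[i][j] * mirna_emb[i][k] for i in range(n_m))
                 for k in range(dim)] for j in range(n_d)]
    return new_mirna, new_drug


def mean_layers(layers):
    """Average the embeddings produced at every layer (assumed combination rule)."""
    n = len(layers)
    return [[sum(layer[r][k] for layer in layers) / n
             for k in range(len(layers[0][0]))] for r in range(len(layers[0]))]


def embed(adj, mirna_emb, drug_emb, num_layers=2):
    """Stack propagation layers so that higher-order neighbors contribute."""
    norm = normalize_bipartite(adj)
    m_layers, d_layers = [mirna_emb], [drug_emb]
    for _ in range(num_layers):
        m, d = propagate(norm, m_layers[-1], d_layers[-1])
        m_layers.append(m)
        d_layers.append(d)
    return mean_layers(m_layers), mean_layers(d_layers)


def score(m_vec, d_vec):
    """Inner product as the predicted association strength (assumed scoring rule)."""
    return sum(a * b for a, b in zip(m_vec, d_vec))


# Toy bipartite graph: 2 miRNAs x 2 drugs; a 1 marks a known association.
toy_adj = [[1, 0],
           [1, 1]]
toy_mirna = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical initial embeddings
toy_drug = [[1.0, 1.0], [0.0, 2.0]]

final_mirna, final_drug = embed(toy_adj, toy_mirna, toy_drug, num_layers=2)
print(score(final_mirna[0], final_drug[1]))  # score for the unobserved pair
```

In a trained model the initial embeddings would be learned parameters rather than fixed values. The sketch only shows how stacking aggregation layers lets information from multi-hop neighbors in the bipartite graph reach each node's final embedding.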
Keywords/Search Tags:Deep Learning, Graph Neural Network, Non-coding RNA, Drug Sensitivity, Drug Resistance, Protein Function Prediction