Research On LncRNA-disease Association Prediction Based On Bayesian Generative Adversarial Networks

Posted on:2024-01-29

Degree:Master

Type:Thesis

Country:China

Candidate:H Zhong

Full Text:PDF

GTID:2544307121483854

Subject:Computer application technology

Abstract/Summary:

Numerous studies have shown that long non-coding RNA (lncRNA) is closely involved in biological processes, including gene expression, cell proliferation, genetic regulation, and the occurrence of many diseases. Predicting lncRNA-disease associations can therefore help researchers obtain relevant biological information, understand pathogenesis, and better diagnose and prevent diseases. Most currently known lncRNA-disease association pairs come from biological experimental validation. Although experimental validation is the most authoritative approach, it inevitably entails high costs and consumes substantial human resources. Numerous computational methods have therefore emerged that mine potential lncRNA-disease associations from existing lncRNA biological information. Because association data are limited, however, the accuracy of many of these methods is limited. The most distinctive characteristic of lncRNA-disease association data is that it contains only positive samples, which is unfriendly to many fully supervised models, and many computational methods suffer when positive samples are both the only samples available and few in number.

To address these problems, this thesis investigates two lncRNA-disease association prediction models: LDAF_GAN, based on an association-filtering generative adversarial network, and LDAF_VGAN, which incorporates variational Bayesian inference into LDAF_GAN.

LDAF_GAN consists of a generator and a discriminator but differs from a traditional GAN by adding a filtering operation and negative sampling. Filtering multiplies the generator output element-wise with the real data before it is passed to the discriminator, so that the generated results focus only on known associations (the entries of the association matrix equal to 1). Negative sampling draws some entries from the unknown associations (the entries of the association matrix equal to 0) and treats them as assumed unassociated negative samples. A regularization term added to the loss function then requires generated positive samples to be close to 1 and generated negative samples to be close to 0. This prevents the model from producing an all-1 output that would nevertheless fool the discriminator, and so yields a better fit.

To improve generalization and performance on small-sample datasets, LDAF_VGAN incorporates Bayesian inference into LDAF_GAN: each parameter of the generative adversarial network changes from a single value to a distribution, and the network parameters are selected by taking the mean of samples drawn from that distribution. This adds uncertainty to the model and improves its generalization.

In the model evaluation experiments, LDAF_GAN achieves better prediction results than BiGAN, CNNLDA, NBCLDA, TILDA, and LDAP. Its AUC values under five-fold cross-validation on two publicly available datasets are 0.976 and 0.914, respectively. In the case study, LDAF_GAN gives the top ten predicted disease associations for each of six lncRNAs (H19, MALAT1, XIST, ZFAS1, UCA1, and ZEB1-AS1), and for H19 and UCA1, 100% of the predicted associations are confirmed by data proven in biological experiments. LDAF_VGAN likewise achieves strong results on the two publicly available datasets, with five-fold cross-validation AUC values of 0.981 and 0.898, respectively. In addition, LDAF_VGAN outperforms LDAF_GAN on the small-sample dataset and the new dataset, which shows that Bayesian inference improves the model's generalization and gives an advantage on small-sample datasets. Overall, the experimental results show that both models can mine potential lncRNA-disease associations, both for lncRNAs with known associations and for lncRNAs that have sequences but no known associations, and achieve excellent prediction results.
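
The filtering and negative-sampling steps described in the abstract can be illustrated with a small NumPy sketch. This is not the thesis implementation: the function names, the toy association matrix, the squared-error form of the penalty, and the weight `alpha` are all assumptions made for illustration, and a random array stands in for the generator output.

```python
import numpy as np

def filter_generated(generated, real):
    """Filtering: element-wise product, so only known associations (real == 1) survive."""
    return generated * real

def sample_negatives(real, n_neg, rng):
    """Negative sampling: mark n_neg randomly chosen unknown entries (real == 0)
    as assumed negatives and return them as a 0/1 mask."""
    zero_idx = np.flatnonzero(real == 0)
    chosen = rng.choice(zero_idx, size=min(n_neg, zero_idx.size), replace=False)
    mask = np.zeros(real.size, dtype=real.dtype)
    mask[chosen] = 1
    return mask.reshape(real.shape)

def negative_penalty(generated, neg_mask, alpha=0.1):
    """Assumed regularization term: squared score on the sampled negatives,
    which pushes those generated values toward 0."""
    return alpha * np.sum((generated * neg_mask) ** 2)

rng = np.random.default_rng(0)

# Toy lncRNA x disease association matrix (1 = known association, 0 = unknown).
real = np.array([[1, 0, 0, 1],
                 [0, 1, 0, 0]], dtype=float)

# Stand-in for the generator output, with scores in [0, 1).
generated = rng.uniform(size=real.shape)

filtered = filter_generated(generated, real)      # what the discriminator would receive
neg_mask = sample_negatives(real, n_neg=3, rng=rng)
penalty = negative_penalty(generated, neg_mask)   # added to the generator loss
```

In this sketch, a generator that outputs all ones would produce a filtered matrix identical to `real`, which is the degenerate solution the abstract warns about. The penalty on the sampled negatives is what makes that output costly.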

Keywords/Search Tags:

lncRNA-disease association prediction, generative adversarial network, lncRNA sequence characterization, filtering unknown associations, variational Bayesian inference

Related items

1	A Research On Prediction Method Of Disease Related-lncRNA Based On Probability Model
2	Research On LncRNA-disease Association Prediction Method Based On Heterogeneous Network Mining
3	Study On Disease-related LncRNA Association Prediction Based On Multi-layer Heterogeneous Networ
4	Prediction Of LncRNA-disease Association Based On Manifold-regularized Matrix Factorization
5	Research On LncRNA-disease Association Prediction Method Based On Biological Networ
6	Algorithms For Identifying Potential Associations Between Non-coding RNA And Disease/Drug Sensitivity
7	Research On LncRNA-disease Association Prediction Based On Biomolecular Association Network
8	Prediction Of LncRNA-disease Association Based On Internetwork Random Walk Algorithm
9	Generative Adversarial Networks For Heart Disease Prediction
10	Research On MRI Reconstruction Based On Generative Adversarial Network