Learning Augmented Graph Random Walk Based Bioinformatics Entities Association Prediction

Posted on: 2022-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: C X Ning
GTID: 2480306335458354
Subject: Automation Technology
Abstract/Summary:
Bioinformatics entity association prediction is a crucial part of bioinformatics analysis. It helps researchers explore complex diseases at the molecular level and improves the diagnosis, prediction, and prevention of diseases, so it is of great significance to the development of human medicine. Exploring associations between entities through biological experiments has the advantage of high accuracy but the disadvantages of high cost and slow speed. In recent years, computational methods have provided a new approach to predicting associations between bioinformatics entities: based on existing bioinformatics databases, suitable computational methods can predict these associations quickly and efficiently.

Based on bioinformatics entity data, this thesis proposes an association prediction method for bioinformatics entities based on learning augmented graph random walk (LAGRW-BEA), with the aim of improving the accuracy of association prediction between two types of bioinformatics entities. The method is essentially a supervised random walk with restart in which the transition probability matrix is optimized. First, the two known types of bioinformatics entities are used to build similarity networks through supervised learning. The networks are then mapped into matrix form, giving the association matrix and the similarity matrices. The association matrix and the learned similarity matrices are joined and normalized to build the transition probability matrix over the two types of entities. Finally, the random walk with restart is performed, combining a supervised learning algorithm for the restart probability distribution with negative example sampling, and the association scores between a specified class A entity and each class B entity are calculated.

Many studies have indicated that non-coding RNAs, as bioinformatics entities, are crucial for cell biological processes in humans such as differentiation, response, and cell cycle regulation. In this thesis, association prediction for two types of non-coding RNA and disease, namely lncRNA-disease and miRNA-disease associations, is used as the research case to verify the effectiveness of the LAGRW-BEA method. A variety of validation methods are applied to multiple datasets to demonstrate the method's effectiveness. The AUC under leave-one-out cross-validation (LOOCV) on the three kinds of lncRNA datasets is 0.9145, 0.9328 and 0.9534; 0.9277, 0.9151, 0.9302; 0.9522 respectively, which is 1%-6.57% higher than the traditional random walk with restart method. On the four kinds of miRNA datasets, the LOOCV AUC is 0.9533, 0.9320, 0.9500 and 0.9365, respectively, and the 5-fold cross-validation AUC is 0.9822, 0.9817, 0.9676 and 0.9807, all of which are satisfactory results. Among them, the fifth dataset improved by up to 9.49% under 5-fold cross-validation over the traditional method. In short, LAGRW-BEA achieves the highest prediction accuracy among methods published in recent years that use the same datasets. In addition, case studies on colon cancer, lung cancer and gastric cancer were conducted: the prediction accuracy reached 96% for both colon cancer and lung cancer and 82% for gastric cancer, which further demonstrates the effectiveness of the method. In conclusion, comparative experiments on different datasets show that the method presented here performs better than several other strong methods.
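
The pipeline named in the abstract (join the association and similarity matrices, normalize them into a transition probability matrix, then run a random walk with restart from one class A entity) can be illustrated with a minimal Python sketch. It shows only the plain, unsupervised random walk with restart that the abstract calls the traditional method. The supervised learning of the similarity matrices and of the restart distribution, and the negative example sampling, are not reproduced here. The function names, the block layout of the joined matrix, row normalization, the restart probability of 0.5, and the toy matrices are all assumptions made for illustration, not values from the thesis.

```python
def build_transition(sim_a, sim_b, assoc):
    """Join [[S_A, Y], [Y^T, S_B]] and row-normalize it (assumed layout)."""
    n_a, n_b = len(sim_a), len(sim_b)
    # Top block rows: class A similarities followed by A-to-B associations.
    joined = [list(sim_a[i]) + list(assoc[i]) for i in range(n_a)]
    # Bottom block rows: B-to-A associations followed by class B similarities.
    joined += [[assoc[i][j] for i in range(n_a)] + list(sim_b[j])
               for j in range(n_b)]
    transition = []
    for row in joined:
        total = sum(row)
        # Rows with no outgoing weight stay all-zero instead of dividing by 0.
        transition.append([v / total for v in row] if total > 0
                          else [0.0] * len(row))
    return transition


def random_walk_with_restart(transition, seed, restart=0.5,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * P^T p + restart * p0 until it settles."""
    n = len(transition)
    p0 = [0.0] * n
    p0[seed] = 1.0  # one-hot restart distribution on the query node
    p = p0[:]
    for _ in range(max_iter):
        nxt = [restart * p0[k] for k in range(n)]
        for i in range(n):
            if p[i] == 0.0:
                continue
            for k in range(n):
                nxt[k] += (1.0 - restart) * p[i] * transition[i][k]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def score_b_entities(sim_a, sim_b, assoc, a_index, restart=0.5):
    """Return the walk's stationary scores for every class B entity."""
    transition = build_transition(sim_a, sim_b, assoc)
    p = random_walk_with_restart(transition, a_index, restart)
    return p[len(sim_a):]


# Hypothetical toy data: 3 class A entities (e.g. lncRNAs), 2 class B (diseases).
SIM_A = [[1.0, 0.8, 0.1],
         [0.8, 1.0, 0.2],
         [0.1, 0.2, 1.0]]
SIM_B = [[1.0, 0.3],
         [0.3, 1.0]]
ASSOC = [[1, 0],   # A0 is known to be associated with B0
         [0, 0],   # A1 has no known association (the query)
         [0, 1]]   # A2 is known to be associated with B1

scores = score_b_entities(SIM_A, SIM_B, ASSOC, a_index=1)
print(scores)
```

In this toy network the query entity A1 has no known association, so its scores come entirely from its similarity to A0 and A2. Because A1 is far more similar to A0, the walk ranks B0 above B1.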
Keywords/Search Tags:Bioinformatics entity, Random walk with restart, Transition probability matrix, Supervised learning