
Research And Application Of Statistical Learning Methods In RNAi

Posted on: 2021-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: G Liu
Full Text: PDF
GTID: 2510306494491144
Subject: Software engineering
Abstract/Summary:
RNA interference (RNAi) technology provides a feasible strategy for sequence-specific silencing of disease-causing genes and has broad application prospects. Small interfering RNA (siRNA) is the effector molecule that triggers RNAi, and two problems must be solved before siRNA can serve as a therapeutic drug. The first is to estimate the amount of siRNA escape, which guides the design of reagents that can effectively deliver siRNA into the cytoplasm of the target cells at a clinical dose. The second is to design efficient siRNA so that the silencing effect on the target gene is maximized. This thesis addresses both problems.

For the estimation of the siRNA escape amount, the thesis first applies Approximate Bayesian Computation rejection sampling (ABC-REJ). Building on ABC-REJ, it then proposes an approximate Bayesian method based on a pseudo-prior and uses it to estimate the siRNA escape amount. A comparison with ABC-REJ, ABC based on Markov chain Monte Carlo, and ABC based on sequential Monte Carlo shows that the pseudo-prior method preserves the accuracy of the estimates while greatly improving sampling efficiency.

For the design of high-efficiency siRNA, the thesis takes features of the siRNA sequence as input and uses a stochastic gradient boosting regression tree (SGBRT) model to predict the silencing efficiency of siRNA sequences. The Pearson correlation coefficient between the predicted and true values is 0.743, which is higher than that of prediction models such as Biopredsi, DSIR, CNN, and SVR. The most important features are then selected by computing the relative importance of each feature, and their effect on silencing efficiency is analyzed on the dataset to summarize common characteristics of high-efficiency siRNA.
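
As an illustration of the ABC-REJ baseline named in the abstract, the following is a minimal sketch of plain ABC rejection sampling. The abstract does not describe the thesis's siRNA escape model, prior, summary statistic, or tolerance, so everything concrete here is an assumption made for illustration: a toy Gaussian model with unknown mean, a uniform prior, the sample mean as summary statistic, and the hypothetical helper names `simulate`, `sample_prior`, and `distance`.

```python
import random
import statistics


def abc_rejection(observed, simulate, sample_prior, distance,
                  eps, n_accept, rng, max_draws=200_000):
    """Plain ABC rejection: draw a parameter from the prior, simulate data,
    and keep the draw only if the simulated data lie within eps of the
    observed data under the chosen distance."""
    accepted = []
    draws = 0
    while len(accepted) < n_accept and draws < max_draws:
        theta = sample_prior(rng)
        simulated = simulate(theta, rng)
        draws += 1
        if distance(simulated, observed) <= eps:
            accepted.append(theta)
    return accepted, draws


# Toy stand-in model (hypothetical, NOT the thesis's siRNA escape model):
# 50 Gaussian observations with unknown mean theta and known sd 1.
def simulate(theta, rng, n=50):
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def sample_prior(rng):
    # Assumed prior: uniform on [0, 10].
    return rng.uniform(0.0, 10.0)


def distance(a, b):
    # Assumed summary statistic: absolute difference of sample means.
    return abs(statistics.fmean(a) - statistics.fmean(b))


rng = random.Random(0)
observed = simulate(3.0, rng)  # synthetic "observed" data, true theta = 3
posterior, draws = abc_rejection(observed, simulate, sample_prior,
                                 distance, eps=0.1, n_accept=200, rng=rng)
posterior_mean = statistics.fmean(posterior)
acceptance_rate = len(posterior) / draws
```

In this toy setup most prior draws are rejected, so `acceptance_rate` is small. That sampling cost is the kind of inefficiency the abstract's comparison between ABC-REJ and the other ABC variants is concerned with. The pseudo-prior method itself is not shown, since the abstract does not specify it.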
Keywords/Search Tags: RNA interference, siRNA escape amount, Approximate Bayesian Computation, siRNA design, Stochastic gradient boosting regression tree