Analysis And Prediction Of Translation Rate Based On The Characteristics Of Sequence In Yeast Genome

Posted on: 2021-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: Z Liang
GTID: 2370330614960648
Subject: Statistics
Abstract/Summary:
The central dogma, in which genetic information flows from DNA to RNA to protein, is a basic principle for understanding life. Translation from RNA to protein is a core step of the central dogma, and translational regulation plays a key role in the regulation of gene expression. In recent years, with the rapid development of ribosome footprinting and other translation-related measurement techniques, qualitative and quantitative studies of translation have become increasingly detailed and are now a hotspot in molecular biology. Previous studies have shown that changes in translation rate have an important influence on the abundance, structure and function of proteins. Studying the translation process therefore helps in analyzing the mechanisms of gene expression regulation at the translational level. A quantitative study of the central dogma is a key step toward tracing the flow of gene information from transcripts to functional proteins, and may also give new insight into the relationship between translational regulation and protein folding.

In this work, the sequence factors that affect translation efficiency and translation elongation rate in the Saccharomyces cerevisiae genome were analyzed statistically. On the basis of correlation analysis, factors that determine translation efficiency and translation elongation rate were extracted, and bioinformatics methods were used to predict both quantities theoretically. The main conclusions are as follows:

1) Translation efficiency was positively correlated with protein abundance, codon adaptation index, fraction of optimal codons, tRNA adaptation index, the content of some codons, and the content of negatively charged amino acids, but negatively correlated with sequence length, the content of other codons, and the content of polar and positively charged amino acids.

2) The dependence of translation elongation rate and translation efficiency on codon and amino acid content was basically the same, but their dependence on the physicochemical properties of the protein was quite different. The protein isoelectric point and instability index were negatively correlated with the elongation rate, but not with translation efficiency. The hydrophobicity score was weakly negatively correlated with translation efficiency, but weakly positively correlated with the elongation rate.

3) Because translation efficiency was significantly related to sequence length, the prediction of translation efficiency was significantly better than that of translation elongation rate. When experimental error is taken into account, the translation efficiency values predicted by the support vector regression model explain 70% to 94% of the experimental results.

4) The prediction accuracy for translation elongation rate was only about 30%, which may have several causes. At present, translation elongation rate cannot be measured directly in experiments; it can only be estimated from experimental data on ribosome density and protein synthesis rate, and how to estimate it remains controversial. This is the main reason for the poor prediction accuracy for translation elongation rate.

After drawing the above conclusions, each gene was further divided into a translation initiation region and a translation extension region, and each region was further divided into high ribosome density segments and low ribosome density segments. The high-density and low-density sequences of the translation initiation region and the translation extension region were analyzed separately. Based on the results, a support vector machine was used to classify and predict high and low ribosome density segments in the translation extension region. The main conclusions are as follows:

5) A study of codon usage in the translation initiation and extension regions found that the codon GAA was consistently used more frequently, while the three rare codons (CGG, CGC, CGA) were used least frequently. However, the usage of these codons did not differ significantly between high and low ribosome density segments.

6) In the translation initiation region, among the charged amino acids, lysine and glutamic acid are used more frequently in high ribosome density segments; among the hydrophobic amino acids, leucine, isoleucine and alanine are used more frequently in high ribosome density segments. In the translation extension region, among the charged amino acids, lysine, glutamic acid and aspartic acid are used more frequently in high ribosome density segments; among the hydrophobic amino acids, leucine, isoleucine and alanine are used more frequently in high ribosome density segments.

7) In both the translation initiation region and the translation extension region, the G/C content at the third codon position, the local codon adaptation index, and the local tRNA adaptation index differ significantly between high and low ribosome density segments.

8) In predicting high and low ribosome density segments in the translation extension region, when the critical value r0 of the relative ribosome read number is 40, the prediction accuracy reaches 76.8%, and the classification effect is significant.
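
As a rough illustration of the sequence features mentioned in conclusions 5), 7) and 8), the following Python sketch computes codon usage frequencies and the third-position G/C fraction for a segment, and labels a segment as high or low density using a threshold on its relative ribosome read number. It is a minimal sketch and not the thesis's implementation: the function names, the example sequence, and the choice to treat a value equal to r0 as "high" are all assumptions made here for illustration.

```python
from collections import Counter


def split_codons(cds):
    """Split a coding sequence into complete codons.

    Any trailing partial codon (1 or 2 bases) is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def codon_frequencies(cds):
    """Return the relative usage frequency of each codon in the sequence."""
    codons = split_codons(cds)
    if not codons:
        return {}
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def gc3_fraction(cds):
    """Return the fraction of codons whose third base is G or C."""
    codons = split_codons(cds)
    if not codons:
        return 0.0
    return sum(codon[2] in "GC" for codon in codons) / len(codons)


def label_segment(relative_reads, r0=40.0):
    """Label a segment by its relative ribosome read number.

    Assumption: values at or above r0 count as "high"; the abstract does
    not say how a value exactly equal to r0 is treated.
    """
    return "high" if relative_reads >= r0 else "low"


# Hypothetical example segment (not taken from the thesis data).
segment = "GAAGAACGGAAATTG"
frequencies = codon_frequencies(segment)
gc3 = gc3_fraction(segment)
label = label_segment(55.0)
```

Features such as these could then be passed to a classifier, for example a support vector machine, to separate high-density from low-density segments. The abstract does not specify the exact feature set or kernel.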
Keywords/Search Tags: Saccharomyces cerevisiae genome, translation efficiency, translation elongation rate, codon, amino acid