The Protein Prediction Of Folding Structure Based On The Content Of Pyrimidine Nucleotides In The mRNA

Posted on: 2013-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Fu
Full Text: PDF
GTID: 2230330371987817
Subject: Biochemical Engineering
Abstract/Summary:
In this thesis, nucleotide statistics of mature mRNA showed that the contents of the two types of bases, purines and pyrimidines, differ significantly among the mRNA regions coding for different protein secondary structures. The folded structure in particular showed the opposite pattern to the helix and coil structures: in the folded structure, pyrimidine bases outnumbered purine bases, whereas in the helix and coil structures pyrimidine bases were fewer than purine bases.

Based on this finding, the base statistics were combined with the propensity factors of the Chou-Fasman rules. For different species, the Chou-Fasman propensity factors of the synonymous codons of each amino acid were optimized for the different secondary structures, and preference parameters for the 16 different base pairs were also studied. An mRNA model for amino acid segments of different lengths was then created, and every codon and base pair was identified by traversing the model. Different statistical rules were designed by assuming different contributions of base pairs and codons. Finally, it was found that the tendency toward the folded structure was higher than toward the other structures when the coding sequence contained more pyrimidine bases than purine bases; conversely, when it contained fewer, the tendency toward the folded structure was lower than toward the other structures. A protein secondary structure prediction algorithm was designed on this principle, and it was more accurate than the Chou-Fasman method in predicting the folding structure of proteins.

Because of the particular nature of the statistical object, existing software could not meet the data mining and analysis requirements. Software was therefore designed to the requirements of this project. All biological data for the project came from the CSAndS Database (coding sequence and structure database). The software can process the source data and provide coding sequence and secondary structure information for species and proteins.

Through the base statistics of mRNA and the length-based statistical model of amino acids, this work is an effective supplement to secondary structure prediction from primary sequence, and it is important for exploring the biological meaning of genetic data.
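
The counting step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis software or its prediction algorithm: the function names are hypothetical, the "16 base pairs" are interpreted here as the 16 possible adjacent-base dinucleotides (an assumption), and no Chou-Fasman propensity factors or optimized parameters are included because the abstract does not give them.

```python
from collections import Counter

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CU")


def normalize(seq):
    """Upper-case the sequence and write T as U so DNA-style input also works."""
    return seq.upper().replace("T", "U")


def base_class_counts(seq):
    """Return (purine_count, pyrimidine_count) for a coding segment."""
    seq = normalize(seq)
    purines = sum(1 for base in seq if base in PURINES)
    pyrimidines = sum(1 for base in seq if base in PYRIMIDINES)
    return purines, pyrimidines


def codons(seq):
    """Split a coding segment into complete codons, dropping a trailing partial codon."""
    seq = normalize(seq)
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def adjacent_pairs(seq):
    """Count overlapping adjacent-base pairs (assumed meaning of the 16 base pairs)."""
    seq = normalize(seq)
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def pyrimidine_bias(seq):
    """Label a segment by which base class is more frequent (illustrative only)."""
    purines, pyrimidines = base_class_counts(seq)
    if pyrimidines > purines:
        return "pyrimidine-rich"
    if pyrimidines < purines:
        return "purine-rich"
    return "balanced"


if __name__ == "__main__":
    segment = "UUCGUCAUCUAC"  # made-up example segment, not data from the thesis
    print(base_class_counts(segment))
    print(codons(segment))
    print(adjacent_pairs(segment).most_common(3))
    print(pyrimidine_bias(segment))
```

Under the abstract's finding, a pyrimidine-rich coding segment would lean toward the folded structure and a purine-rich one toward helix or coil. A real predictor would combine such counts with the optimized codon and base-pair preference factors, which are not reproduced here.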
Keywords/Search Tags: bioinformatics, nucleic acid, base, protein, secondary structure prediction, preference factor