
The Analysis Of Genetic Data Structure Based On Mathematical Modeling

Posted on: 2015-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: C J Tan
GTID: 2180330467461854
Subject: Applied Mathematics
Abstract/Summary:
The analysis of genetic data structure is an important part of bioinformatics; its main objects of study are DNA sequences and protein sequences. The study of DNA sequences is important for understanding spatial structure, RNA sequences, mRNA, and proteins. Using mathematical modeling, novel methods can extract the biological information contained in these data structures. Such results play a key role in predicting the spatial structure and function of proteins, provide theoretical guidance for molecular evolution and structure-based drug design, and can bring substantial economic benefits.

This paper addresses the structure of single bases, base pairs, and triplet bases (codons). Combining theoretical analysis with data processing, it systematically builds mathematical models of genetic data structure. The studies are as follows.

1. Based on relative synonymous codon usage (RSCU), the parameters that influence codon usage are calculated and analyzed statistically. Then, using QRSCU, a coding method, the codon bias of p53 is analyzed. The results show that the QRSCU-based method clearly reveals a consistent bias among synonymous codons, and that p53 genes prefer codons ending in C/G. Finally, the genetic mutations of 6 gene sequences are predicted.

2. The distribution of codon usage reflects certain biological characteristics and can therefore serve as a genomic signature. A genomic signature based on the chaos game representation (CGR) is used to study these characteristics and the evolutionary relationships of alien invasive plants; the method reveals the usage of codons and of the T base in their sequences. A clustering dendrogram of the gene sequences of 6 alien invasive plants is also obtained.

3. Using similarity indices such as the average power spectrum and the digitized sequence, with the Hamming distance as a constraint factor, a multi-index similarity analysis method based on structural clustering is proposed. It is applied to 7 tumor protein p53 mRNA sequences (complete cds) to analyze the clustering structure and determine the optimal clustering.

4. Combining the studies of digitized single bases, base pairs, and triplet bases, a novel 4D graphical representation of DNA sequences is put forward. The method reflects the biological information of a DNA sequence more comprehensively and effectively, without any loss. Based on this method, the paper uses the geometric center of the 4D graph of a DNA sequence as the eigenvalue for sequence analysis, which preserves the features of the original data, and uses the Euclidean distances and included angles between the vectors' terminal points for similarity analysis. A hierarchical clustering graph is also obtained.

The innovations of the paper are as follows:

1. The QRSCU-based method shows the consistent synonymous codon bias of p53.

2. The multi-index similarity analysis method based on structural clustering preserves the completeness of the genetic information and avoids the errors caused by relying on a single similarity index. It also clusters consistently, and the clustering results accord with biological taxonomy.

3. Building on the research on digitized bases and introducing the geometric center as the eigenvalue of DNA sequences, a mathematical model of the 4D graphical representation is established. It reflects the biological information of DNA sequences more comprehensively and effectively, without any loss. The resulting hierarchical cluster analysis graph is almost entirely in accord with biological taxonomy.
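
Item 1 of the abstract builds on RSCU. As a rough illustration only, the sketch below computes RSCU under its usual definition: the observed count of a codon divided by the mean count of all codons that encode the same amino acid. It assumes the standard genetic code and an in-frame coding sequence. The function name `rscu` and the example sequence are hypothetical, and the thesis's QRSCU coding method is not described in the abstract, so it is not reproduced here.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the sequences use the standard code; "*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for an in-frame coding sequence (illustrative helper)."""
    seq = seq.upper().replace("U", "T")  # accept mRNA as well as DNA
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in families.items():
        if aa == "*":
            continue  # stop codons are skipped in this sketch
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in family:
            # observed count / (family total / family size)
            result[c] = counts[c] * len(family) / total
    return result


# Hypothetical toy sequence: four leucine codons, three of them CTG.
values = rscu("CTGCTGCTGCTT")
print(values["CTG"], values["CTT"])
```

An RSCU value above 1 means the codon is used more often than expected under uniform synonymous usage, and a value below 1 means it is used less often. A preference for codons ending in C/G, as reported for p53, would show up as values above 1 concentrated on those codons.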
Keywords/Search Tags:Bias, CGR, Structural Clustering, Similarity Analysis, Graphical Representation