
The Research On Analysis Methods Of DNA Sequences And RNA Secondary Structures Based On New Representation Models

Posted on: 2011-08-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Cao
Full Text: PDF
GTID: 1100360308968946
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Human Genome Project (HGP) and advances in the study of gene sequence, structure, and function, more and more bioinformatics data are being generated. Such large volumes of data are typically processed with automated analysis methods. Similarity analysis provides sequence and structure information from which structure, function, and evolutionary relationships can be inferred or estimated, and it has therefore become a fundamental topic in bioinformatics. Sequence and structure analysis comprises similarity analysis, mutation analysis, phylogenetic analysis, and function analysis, all of which rest on the similarity analysis of sequences and structures. This dissertation therefore proposes methods for the similarity analysis of DNA sequences and RNA secondary structures based on new representation models of both, together with methods for mutation analysis and phylogenetic tree construction.

The dissertation first reviews recent advances in sequence and structure analysis. It then studies the graphical representation of DNA sequences based on dual nucleotides, a numerical coding method for RNA secondary structure, mutation analysis and structure alignment based on the coded sequences of RNA secondary structures, and sequence similarity analysis based on graphical representation, along with phylogenetic tree construction.

The main contents are as follows:

a) The author proposes a 3D graphical representation of DNA sequences based on the physical and chemical properties of dual nucleotides, and uses it for similarity analysis. The dependency and interaction between bases are known to be very important in determining the structure and function of a sequence. To give a simple and intrinsic visualization of gene sequences, the dissertation proposes a 3D curve representation of DNA sequences, with a dissimilarity measure between sequences based on the covariance matrices of the curves' geometric centers. The experiments show that the proposed approach measures sequence similarity precisely, which helps infer the relationships between species, especially those between humans and other species. It may also help uncover mechanisms in humans from studies of other species.

b) The author proposes a sequence comparison method and a similarity analysis method based on a coding scheme. Following DNA coding principles, the dissertation proposes a method that addresses four basic problems and supports similarity analysis between DNA sequences. The coding method represents sequences efficiently and makes mutations visible, which helps in investigating the mechanisms of diseases. In addition, it provides a better mathematical model for quantifying the similarity or dissimilarity between DNA sequences, and in that sense it supports genetic testing and the prediction of gene function.

c) The author proposes a coding scheme for RNA secondary structure, and performs mutation analysis and structure comparison based on the coding and the XOR operator. Representations of RNA secondary structure are very complex and prone to degeneracy. The proposed coding method and its extension clearly separate free bases from base pairs and distinguish different structures, including pseudoknots. Based on a three-digit coding, the dissertation presents a comparison method and a mutation analysis method for RNA secondary structures. It also proposes a novel structure comparison method based on the coding rules. The experiments show that the method performs well.

d) The author proposes two novel phylogenetic tree construction methods, one based on fuzzy clustering and one based on the minimum spanning tree, both of which make use of the proposed similarity and dissimilarity matrices.
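
The minimum-spanning-tree idea in item d) can be illustrated with a small sketch. The abstract does not say which algorithm or which dissimilarity values the dissertation uses, so everything here is an assumption made for illustration: Prim's algorithm is swapped in as one standard way to build a minimum spanning tree, and the labels and the 4x4 matrix are hypothetical placeholders rather than results from the dissertation.

```python
import heapq


def minimum_spanning_tree(dist):
    """Return MST edges (i, j, weight) for a symmetric dissimilarity matrix.

    Uses Prim's algorithm, starting from item 0 (an assumed choice; the
    abstract does not name the algorithm used in the dissertation).
    """
    n = len(dist)
    if n == 0:
        return []
    visited = [False] * n
    visited[0] = True
    # Candidate edges leaving the tree, ordered by dissimilarity.
    heap = [(dist[0][j], 0, j) for j in range(1, n)]
    heapq.heapify(heap)
    edges = []
    while heap and len(edges) < n - 1:
        weight, i, j = heapq.heappop(heap)
        if visited[j]:
            continue  # j was already attached through a cheaper edge
        visited[j] = True
        edges.append((i, j, weight))
        for k in range(n):
            if not visited[k]:
                heapq.heappush(heap, (dist[j][k], j, k))
    return edges


# Hypothetical labels and dissimilarities, for illustration only.
labels = ["seq_A", "seq_B", "seq_C", "seq_D"]
toy_dissimilarity = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.6],
    [0.9, 0.7, 0.0, 0.3],
    [0.8, 0.6, 0.3, 0.0],
]

tree = minimum_spanning_tree(toy_dissimilarity)
for i, j, weight in tree:
    print(f"{labels[i]} - {labels[j]}: {weight}")
```

In this toy input the tree links the two closest pairs first and joins them through the cheapest remaining edge, which is the sense in which a spanning tree over a dissimilarity matrix can be read as a rough grouping of sequences.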
Keywords/Search Tags: Sequence similarity analysis, Clustering analysis, Phylogenetic tree, DNA sequence, RNA secondary structure, Mutation analysis, Structure alignment