On a Tentative Evolution Model for K-Strings and the Unique Sequence Reconstruction Problem

Posted on:2010-06-10

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Q Li

Full Text:PDF

GTID:1220330395951551

Subject:Theoretical Physics

Abstract/Summary:

A heuristic probabilistic model for the evolution of the string composition of biological sequences is proposed. It relates the relaxation of correlations between string compositions to sequence divergence. The model explains the effectiveness of phylogenetic methods based on string composition, and it is used to calibrate the results of CVTree and to estimate the working range of its parameter K, the string length. It suggests that the CVTree approach can be generalized to a large family of methods, so that larger values of K can be used. Two independent sets of distance-estimation methods, based solely on the presence or absence of K-strings, are developed; they yield phylogenetic trees that agree surprisingly well with current taxonomy.

The justification of K-string methods inspired the problem of uniquely reconstructing a sequence from its constituent K-strings, which is equivalent to the uniqueness of an Eulerian path in a directed graph. The uniquely reconstructible sequences form a factorial regular language. This language is thoroughly characterized by its forbidden words, and a deterministic finite automaton (DFA) accepting it is constructed. The DFA provides an efficient on-line algorithm for testing the unique reconstructibility of sequences, which has been applied to a real protein database.
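The reconstruction problem in the abstract can be illustrated with a small brute-force sketch in Python. This is not the dissertation's DFA-based on-line algorithm; it only enumerates Eulerian paths directly, and it takes exponential time in the worst case. The function names (`kstrings`, `reconstructions`, `is_uniquely_reconstructible`) are hypothetical. The sketch also assumes that K-strings are counted with multiplicity, that K is at least 2, and that the starting (K-1)-string is fixed to the sequence's own prefix.

```python
from collections import Counter


def kstrings(seq, k):
    """Return the multiset of overlapping K-strings of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def reconstructions(seq, k, limit=2):
    """Collect up to `limit` sequences with seq's K-string multiset and prefix.

    Each K-string is treated as a directed edge from its (K-1)-prefix to its
    (K-1)-suffix, so every result spells out one Eulerian path.
    Assumption: the path must start at seq's own first (K-1)-string.
    """
    remaining = kstrings(seq, k)
    total = sum(remaining.values())
    found = set()

    def extend(current, used):
        if len(found) >= limit:
            return  # enough distinct reconstructions already collected
        if used == total:
            found.add(current)  # every edge consumed: an Eulerian path
            return
        node = current[len(current) - (k - 1):]
        # Try every still-available edge leaving the current node.
        candidates = [e for e in remaining
                      if remaining[e] > 0 and e[:k - 1] == node]
        for edge in candidates:
            remaining[edge] -= 1
            extend(current + edge[-1], used + 1)
            remaining[edge] += 1

    extend(seq[:k - 1], 0)
    return found


def is_uniquely_reconstructible(seq, k):
    """True if seq is the only sequence found for its K-string multiset."""
    return len(reconstructions(seq, k)) == 1
```

For example, with K = 2 the sequence `ABACAD` shares its K-strings with `ACABAD`, so it is not uniquely reconstructible, while with K = 3 the ambiguity disappears. This is consistent with the abstract's interest in how the choice of K affects K-string methods.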

Keywords/Search Tags:

Compositional vector (CVTree), Evolutionary distance, Evolutionary tree, Sequence reconstruction, Eulerian path, Factorial language, Minimal forbidden words

Related items

1	The use of alignment-free statistics for the evolutionary study of 5' cis-regulatory sequences
2	Phylogenetic Analysis Based On Selected Evolutionary Distance Of DNA Sequence
3	Studying The Evolution Of Gene Sequences By The Phylogenetic Tree
4	Pattern Avoidance In Matchings And Involutions
5	Properties Of Some Words
6	Study And Improvement Of Methods Of Constructing Phylogenetic Tree
7	Classification Of Gene And Protein Sequences
8	Reconstruction Of Evolutionary Relationship Of Avian Genome And Classical Evolutionary Equation
9	The Road Grid Arrangement With The Prohibitions
10	Minimal Path Cover Problem. Qt-graph