Coronavirus Phylogeny Analyses

Posted on: 2006-02-02  Degree: Master  Type: Thesis
Country: China  Candidate: W X Zheng  Full Text: PDF
GTID: 2120360182475905  Subject: Biophysics
Abstract/Summary:
With the rapid development of genome-sequencing projects for human and other model organisms, more and more whole-genome sequences are becoming available. It is reasonable to believe that this large amount of sequence data will lead to great theoretical breakthroughs. But data are not the same as knowledge, and how to extract useful information from them is a central problem. The Z-curve is a powerful tool for analysing genomic sequences intuitively. According to Z-curve theory, a genome can be uniquely represented by a three-dimensional space curve, so that its global and local features can be grasped at a glance. The Z-curve is the theoretical foundation of this thesis and is introduced briefly. We propose a geometric method to analyse coronavirus phylogeny based on the Z-curves of whole genomes.
Phylogenetic relationships among organisms can be analysed at the molecular level. Compared with analyses based on morphological features, analyses based on genomic sequences are more natural. Phylogenetic analyses based on sequence alignment and on the composition vector method applied to whole proteomes are introduced briefly. They all show that life on this planet divides into three domains: Archaea, Bacteria and Eucarya. Instead of sequence alignment, we propose a new geometric method to analyse coronavirus phylogeny based on the 3-D Z-curves of whole genomes. The method has the merits of simplicity and intuitiveness. Similarity between Z-curves implies an evolutionary relationship between the organisms. Evolutionary distances are obtained by comparing the distribution patterns of the 3-D Z-curves. Each distribution pattern is approximately described by its geometric center and three eigenvectors, which resolves the problem of differing genome lengths. Much of the information carried by the Z-curve is lost in this description, yet the result is still satisfactory. It is reasonable to believe that more accurate results can be obtained if more of the information in the Z-curve is used. In this sense, the current method has great potential for further development. The phylogenetic analysis is done at the nucleic acid level using whole-genome sequences, so one avoids having to choose which gene to align, and the evolutionary characters of the whole genome are reflected rather than those of a few individual genes. The result is therefore more accurate and objective than one based on selected genes. Though the method has some merits, it is still at an immature stage.
Additionally, we propose an algorithm to predict the cleavage sites of coronavirus polyproteins, taking full account of the high conservation of the cleavage sites and the lengths of the nonstructural proteins cleaved by the 3C-like proteinase. The phylogenetic relationship obtained from the predicted 3C-like proteinase is consistent with the established taxonomic groups of coronaviruses.
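As an illustration of the geometric description mentioned above, the following is a minimal Python sketch, not the thesis's actual implementation. It uses the standard Z-curve transform, x_n = (A_n + G_n) - (C_n + T_n), y_n = (A_n + C_n) - (G_n + T_n), z_n = (A_n + T_n) - (C_n + G_n), where A_n, C_n, G_n, T_n are cumulative base counts up to position n. The abstract does not say which matrix the three eigenvectors come from; taking them from the covariance matrix of the curve points is an assumption made here, and all function names (`z_curve`, `geometric_center`, `covariance`, `jacobi_eigen`, `descriptor`) are hypothetical.

```python
import math

# Per-base steps of the standard Z-curve transform.
STEPS = {"A": (1, 1, 1), "C": (-1, 1, -1), "G": (1, -1, -1), "T": (-1, -1, 1)}
STEPS["U"] = STEPS["T"]  # RNA genomes: treat U like T


def z_curve(seq):
    """Return the cumulative Z-curve points (x_n, y_n, z_n) of a sequence."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        if base not in STEPS:
            continue  # assumption: ambiguous symbols such as N are skipped
        dx, dy, dz = STEPS[base]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def geometric_center(points):
    """Mean of the curve points (requires at least one point)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def covariance(points):
    """3x3 covariance matrix of the points about their geometric center."""
    c = geometric_center(points)
    n = len(points)
    return [[sum((p[i] - c[i]) * (p[j] - c[j]) for p in points) / n
             for j in range(3)] for i in range(3)]


def jacobi_eigen(m, sweeps=50, tol=1e-12):
    """Eigenpairs of a symmetric 3x3 matrix via cyclic Jacobi rotations.

    Returns a list of (eigenvalue, eigenvector) sorted by decreasing eigenvalue.
    """
    a = [[float(x) for x in row] for row in m]
    v = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(3) for j in range(3) if i != j)
        if off < tol:
            break
        for p in range(2):
            for q in range(p + 1, 3):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(3):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(3):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(3):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    pairs = [(a[i][i], tuple(v[k][i] for k in range(3))) for i in range(3)]
    return sorted(pairs, reverse=True)


def descriptor(seq):
    """Fixed-size summary of a genome: geometric center plus three eigenpairs."""
    pts = z_curve(seq)
    return geometric_center(pts), jacobi_eigen(covariance(pts))
```

For example, `descriptor(genome)` returns a center and three (eigenvalue, eigenvector) pairs whatever the length of `genome`, which is one way a fixed-size description can make genomes of different lengths comparable. How the thesis turns such descriptions into evolutionary distances is not stated in the abstract and is not reproduced here.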
Keywords/Search Tags: Sequence alignment, K-string composition vector, Z-curve, Phylogenetic tree, Coronavirus, Severe acute respiratory syndrome, SARS-CoV, Polyprotein, Cleavage sites