
IC-PIC Matrix Method And Its Applications In Phylogeny Of DNA Viruses

Posted on: 2013-01-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Gao
Full Text: PDF
GTID: 1220330374970667
Subject: Theoretical Physics
Abstract/Summary:
Phylogenetic studies based on marker genes have dramatically accelerated our understanding of microbial diversity. Viruses, however, do not contain a ribosomal gene, and genetic markers such as polymerase and capsid genes are rare and difficult to identify. For this and other reasons, the phylogeny of viruses is not yet ripe for Linnaean christening. Recently, phylogenetic studies using information from the whole genome have received extensive attention. However, sequence alignment methods are not directly applicable to whole genomes, and how to compare two genomes has become a serious problem in phylogeny.

An IC-PIC matrix method is proposed to infer phylogenetic relationships from complete genomes. The method is based on the nucleotide correlation properties of DNA sequences and requires no sequence alignment, so it avoids the problems that alignment-based methods face in phylogenetic studies. The method uses few parameters and is insensitive to sequence length. A key point in the approach is the selection of optimal PICs for phylogeny inference, which highlights the shaping role of natural selection. The applications of the IC-PIC matrix method in phylogeny are discussed in this work, and the results are compared with previous studies.

IC-PIC matrices are constructed based on the results of statistical tests and are used to study the mitochondrial genome phylogeny of mammals. With marsupials and monotremes as the outgroup, our work reconfirms the hypothesis of (Ferungulates, (Primates, Rodents)) and provides another piece of evidence for the non-monophyly of rodents. At an even finer resolution, the species in each order form their own monophyletic group, in agreement with the systematics used by biologists.

IC-PIC matrices constructed from FA(k)A, FA(k)T, FA(k)C, FT(k)A, FC(k)A, FG(k)A and Dk+2 are used to recover the whole-genome phylogeny of 218 double-stranded (ds) DNA viruses from 13 viral families and 59 papillomaviruses.
The IC-PIC trees classify the viruses into clades that agree remarkably well with the International Committee on Taxonomy of Viruses (ICTV) systematics, with only four exceptions: Cercopithecine herpesvirus-5 (CeHV-5) falls outside the clade of the Cytomegalovirus genus; the equid herpesviruses (EHV-1 and EHV-4) are separated from the other members of the Varicellovirus genus; RFV and MYXV are not related phylogenetically; and Acanthocystis turfacea virus (ATCV-1) is separated from the Paramecium bursaria viruses. The IC-PIC trees suggest potential evolutionary relationships among some viral families. In particular, lipothrixviruses and rudiviruses are positioned next to each other, in support of the proposal to assign Lipothrixviridae and Rudiviridae to the order Ligamenvirales. MSEV is tentatively assigned to the genus Betaentomopoxvirus by ICTV, and our result confirms the assignment. Moreover, the IC-PIC tree predicts the taxonomic positions of the "unclassified" viruses TuHV-1, CavHV-2 and NeabNPV.

IC-PIC matrices are also used to recover the whole-genome phylogeny of single-stranded (ss) DNA viruses from 6 viral families and Parvoviridae. The IC-PIC trees classify the viruses into clades that agree with ICTV systematics, with only two outliers: BFDV falls outside the group of circoviruses, and Chp2 falls outside the chlamydiamicrovirus clade. The virus-host linkage topology of dependoviruses and begomoviruses indicates a history of virus-host coevolution. In addition, the non-monophyly of tomato viruses is observed in the IC-PIC tree. The IC-PIC trees suggest potential evolutionary relationships among ssDNA viral families. The branching of nanoviruses, circoviruses and geminiviruses is consistent with the suggestion that these viruses evolved from a common ancestor. Currently, MV-L1, SVTS2, B5, BPV-3, BPV-2, AAV-7 and AAV-8 are tentatively assigned at the genus level by ICTV, and our results confirm the corresponding assignments.
Moreover, the IC-PIC trees predict the taxonomic positions of the "unclassified" viruses DpDNV, AAAVa, CFDV, BgDNV, MpDNV, AdDNV, PcDNV, CpDNV and PmDNV.

The unique aspects of our work include: (i) representing genomes using information correlation (IC) and partial information correlation (PIC), and revealing the phylogenetic signal property of IC and PIC; (ii) a statistical study of the species-specific property of the PICs; (iii) a modified bootstrap support analysis of the branching orders in the IC-PIC tree; and (iv) the phylogeny of 6 ssDNA viral families.
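
The abstract does not state the formulas for IC and PIC, so the following Python sketch is only a rough illustration of how two genomes might be compared through nucleotide correlations without any alignment. It assumes a simple statistic, p_ij(d) - p_i * p_j, for bases i and j found d positions apart. This statistic, the function names, and the Euclidean distance are all assumptions made for illustration; they are not the dissertation's definitions.

```python
from itertools import product
from math import sqrt

BASES = "ACGT"

def offset_pair_correlations(seq, offset):
    """Map each base pair (i, j) to p_ij(offset) - p_i * p_j.

    Hypothetical statistic for illustration, not the thesis's IC/PIC.
    p_ij(offset) is the frequency of base i at position t with base j
    at position t + offset; p_i is the overall frequency of base i.
    """
    seq = seq.upper()
    n = len(seq)
    pairs = n - offset
    if offset < 1 or pairs <= 0:
        raise ValueError("offset must be >= 1 and shorter than the sequence")
    base_freq = {b: seq.count(b) / n for b in BASES}
    counts = dict.fromkeys(product(BASES, repeat=2), 0)
    for pos in range(pairs):
        pair = (seq[pos], seq[pos + offset])
        if pair in counts:  # pairs with ambiguous symbols such as N are skipped
            counts[pair] += 1
    return {
        (i, j): counts[(i, j)] / pairs - base_freq[i] * base_freq[j]
        for (i, j) in counts
    }

def correlation_profile(seq, max_offset):
    """Concatenate the 16 pair correlations for offsets 1..max_offset."""
    profile = []
    for offset in range(1, max_offset + 1):
        corr = offset_pair_correlations(seq, offset)
        profile.extend(corr[pair] for pair in sorted(corr))
    return profile

def profile_distance(seq_a, seq_b, max_offset=3):
    """Euclidean distance between two correlation profiles (an assumed choice)."""
    pa = correlation_profile(seq_a, max_offset)
    pb = correlation_profile(seq_b, max_offset)
    return sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))
```

Because each genome is reduced to a fixed-length vector of frequencies, genomes of different lengths can be compared directly, and a matrix of pairwise distances from `profile_distance` could then be passed to any standard tree-building procedure.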
Keywords/Search Tags: phylogeny, IC-PIC matrix method, information correlation, partial information correlation, DNA viruses, genome