
Compositional And Comparative Analysis Of Microsatellites In Organism Genomes

Posted on: 2015-12-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Zhao
GTID: 1223330431950333
Subject: Analytical Chemistry
Abstract/Summary:
Microsatellites first became well known as the cause of many human neurological diseases, and they were subsequently widely applied in biology as molecular markers. However, both experimental and computational research on microsatellites has mainly targeted organisms with large genomes, such as eukaryotes and prokaryotes. Viruses are microscopic particles that cannot survive on their own in the natural environment and instead parasitize living cells. So far, reports on virus microsatellites have been rare. In this dissertation, the author analyzed and compared the microsatellites in eukaryotic, prokaryotic and virus genomes, with an emphasis on virus genomes, using existing public genome and nucleic acid sequence databases together with mathematical, statistical and computational methods. Additionally, the author discussed the possible roles microsatellites play in the mechanisms of pathogen infection and host defense.

This dissertation mainly completed the following work:

1. Comparative analysis of microsatellites in organisms ranging from virus to human (Chapter 2)

Thirty-two segmental sequences, including Homo sapiens, Oryza sativa, Streptococcus sanguinis, Ranid herpesvirus 1 and others, covering animals, plants, fungi, protists, bacteria, archaea and viruses, were randomly selected. The author analyzed the microsatellite distribution in these organisms and discussed the possible roles microsatellites play in maintaining and expanding genome size. The author found that, after a very long evolutionary history, microsatellites still make up approximately 60% of the genome sequence across all species, regardless of whether the organism is higher or lower; moreover, microsatellites account for 75% of the D. discoideum genome. This suggests that microsatellites contribute greatly to genome organization. It also seems that there is a driving force that promotes the formation and fixation of abundant repeat sequences, especially short repeats, in genomes.
However, the number of repeat iterations cannot increase without limit. Therefore, the author introduced the concept of a pressure border, aimed mainly at animals and plants: a length barrier that a repeat must break through as its iterations increase. The selective pressure border was found to be approximately 12 bp in animal and plant species. Plasmodium vivax and Leishmania infantum, which are protozoa whose genomes are relatively large compared with those of fungi, other protists, bacteria, archaea and viruses, showed a pressure border similar to that of animals and plants. Apart from these two species, the remaining protists and fungi showed a pressure border of approximately 10 bp, and bacteria one of about 8 bp. In the author's opinion, repetition is another important factor driving genome variation.

2. Analytical research on microsatellites in virus genomes (Chapters 3-6)

Taking the viruses as a whole, the author selected 257 type species from 1,938 virus species. Genome sequences of diverse lengths make it possible to investigate the relationship between genome size and the accumulation of microsatellites. Genome size is an important factor affecting microsatellite level, and the two are positively correlated. Overall, the larger the genome, the greater its capacity to tolerate long repeats. Hosts are also responsible, to a certain degree, for variation in microsatellite content. For example, among viruses of similar genome size, those infecting vertebrates and invertebrates tend to have higher microsatellite content than those attacking bacteria. The author inferred that viruses may incorporate partial host genome sequences during infection, resulting in relatively large genomes and high microsatellite content. Evolutionarily speaking, this is the result of selection during the interaction between virus infection and host defense. In other words, it is the result of coevolution between repeat level and genome size, and between virus and host.
Viruses are parasites, so studying microsatellites in viruses is helpful for research into the etiology and pathogenesis of many diseases of their hosts. (Chapter 3)

Chapter 4 uses the same data as Chapter 3 but analyzes it from another perspective. Principal component analysis showed that the shorter repeat units (mono- to tetranucleotide SSRs, denoted SSRs1-4) are the factor that most affects microsatellite diversity among viruses, followed by the long repeat units (mainly hexanucleotide SSRs, denoted SSRs6). From the literature, the author learned that SSRs1-4 were often found in host-adapted prokaryotic pathogens with reduced genomes, which are not known to survive readily in a natural environment outside the host. In contrast, SSRs5-11 were found mostly in nonpathogens and opportunistic prokaryotic pathogens with large genomes. Virus genomes are usually smaller than prokaryotic genomes, and the author accordingly found many SSRs1-4 in viruses and few SSRs>5. Presumably, three factors are needed to explain this phenomenon: (i) relatively small genomes are incapable of holding microsatellites with long repeat units; (ii) a mutational bias promotes expansion of SSRs>5 in a few viruses; (iii) there is a lack of strong negative selection against SSRs>5. (Chapter 4)

The author selected 85 subtypes of influenza A virus as data material, based on the completeness of their genome sequences, in order to investigate their microsatellites. The author estimated the difference in simple repeats between the original sequences and the corresponding randomly generated sequences using the paired-samples t test and statistical plots, and found that the differences were all extremely significant for both the paired whole genomes and the eight segmental sequences; moreover, simple repeats are underrepresented in segment 7 but overrepresented in the other segments. Regression analysis showed a cubic relationship between the number of iterations and the content of repeats.
The author also compared the results with those of others and discussed the possible role of simple repeats in genome variation. The results showed that repeats are widespread in the influenza A virus genome and evenly distributed within the individual segments, without local accumulation. The genetic diversity of influenza A virus is characterized by a complex interplay between frequent reassortment and variation of individual segments, and simple repeats have contributed greatly to the latter. The discovery of overrepresentation and/or underrepresentation of simple repeats in different segments may provide new clues to the reassortment mechanism, the infection process and evolutionary analysis. (Chapter 5)

In this study, the author surveyed the relative density, relative abundance and base composition of microsatellites in 45 Potyvirus species, a group of plant viruses. The results were also compared with those for human immunodeficiency virus (HIV). The results indicated that the repeat densities and relative abundances are very similar across the surveyed sequences. Interestingly, both values also show great similarity among HIV genomes. Both Potyvirus and HIV are pathogenic RNA viruses, and they are similar in genome size, genome structure, SSR density, SSR abundance and SSR composition. One difference is the type of host they infect: Potyvirus species usually infect plants, while HIV mainly attacks humans. Although they infect different types of hosts, the similarity in the aforementioned aspects suggests that the occurrence of SSRs is a non-random event, which indicates that Potyvirus and HIV genomes may share a similar mode and a parallel level of evolution.

RNA is the genetic material of Potyvirus species; it encodes products including structural and functional proteins and ultimately determines the phenotype and pathogenicity of the viruses.
Tiny differences in the base composition of the RNA may drive the viruses to evolve in different directions. This study demonstrated the genetic diversity of Potyvirus species at the genome level. (Chapter 6)...
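As a minimal illustration of the quantities surveyed in the Potyvirus analysis, the following Python sketch locates perfect microsatellites in a sequence and computes their relative abundance and relative density. It is not the dissertation's own pipeline. The function names, the motif length range (1 to 6 bases), the minimum of three copies, and the per-kb definitions (number of SSRs per kb, and bases covered by SSRs per kb) are all assumptions chosen for illustration.

```python
import re


def is_primitive(unit):
    """True if the motif is not itself a repetition of a shorter motif
    (for example 'AT' is primitive, 'AA' and 'ATAT' are not)."""
    return unit not in (unit + unit)[1:-1]


def find_ssrs(seq, max_unit=6, min_copies=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Assumed thresholds: motifs of 1..max_unit bases, at least min_copies copies.
    Only A/C/G/T are recognised; other symbols simply break a repeat.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        # One motif of length k followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs such as 'AA' that are already counted under a shorter unit.
            if is_primitive(unit):
                hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)


def relative_abundance(hits, seq_len):
    """Number of SSRs per kb of sequence (assumed definition)."""
    return len(hits) / (seq_len / 1000.0)


def relative_density(hits, seq_len):
    """Bases covered by SSRs per kb of sequence (assumed definition)."""
    covered = sum(len(unit) * copies for _, unit, copies in hits)
    return covered / (seq_len / 1000.0)


if __name__ == "__main__":
    # Toy sequence, not real genome data.
    toy = "GGAAAAACATATATATCCAGCAGCAGTT"
    hits = find_ssrs(toy)
    print(hits)  # [(2, 'A', 5), (8, 'AT', 4), (17, 'CAG', 3)]
    print(relative_abundance(hits, len(toy)), relative_density(hits, len(toy)))
```

Different SSR surveys use different minimum copy numbers for each motif length, so the thresholds above would need to be matched to the criteria of any study being reproduced.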
Keywords/Search Tags: Microsatellites, Virus, Genome, Chemical bases, Short tandem repeat, Repeat, Iterations, Comparative genomics, Evolution