
A Study of the Degree of Genomic Difference Among Type Strains Belonging to Different Taxa of the Bacterial Domain

Posted on: 2022-12-29 | Degree: Master | Type: Thesis
Country: China | Candidate: G H Liu
GTID: 2480306755971869 | Subject: Biology
Abstract/Summary:
Since Woese put forward the three-domain theory in 1987, the taxonomy of prokaryotes has used the phylogenetic relationships and sequence identity of 16S rRNA genes as the classification criteria for different taxa. With the development of sequencing technology and the increasing maturity of bioinformatics analysis, the number of bacterial genome sequences in public databases is growing explosively, and whole-genome sequence analysis has become an indispensable tool in prokaryotic systematics. Average nucleotide identity (ANI, 95%–96%), calculated from genomic nucleotide sequences, and the in silico DNA-DNA hybridization value (DDH value, 70%) have become the gold standard for delineating taxa at the species level. However, there are few reports on the degree of genomic divergence between genera and higher taxa, and there is currently no unified standard. Quantifying the degree of difference between bacterial genomes among higher taxa will effectively promote the development of prokaryotic systematics and genome-based taxonomy.

On this basis, in September 2020 the whole-genome sequences of the bacterial type strains then available in GenBank (9229 species) were downloaded and their quality was evaluated with CheckM, yielding 8578 high-quality genomes (completeness > 95% and contamination < 5%). The current taxonomic status of the corresponding strains was then checked against the List of Prokaryotic names with Standing in Nomenclature (LPSN, https://www.bacterio.net). The strains were distributed across 37 phyla, 71 classes, 170 orders, 423 families and 2191 genera. Statistical analysis of the whole-genome sequence features of the type strains showed that genome size in the bacterial domain ranged from 0.58 to 13.61 Mbp and G+C content from 22.41% to 75.86%. Statistical analysis of all complete genomes showed that the number of bacterial replicons ranged from 1 to 9. The difference in G+C content was 0%–16.77% inter-species, 0.01%–26.86% inter-genera, 0%–32.79% inter-families and 0.02%–38.2% inter-orders. The difference in genome size was 0%–55.49% inter-species, 0.08%–85.40% inter-genera, 0.04%–87.03% inter-families and 0.92%–82.76% inter-orders.

The average nucleotide identity (ANI) between taxa (inter-orders, inter-families, inter-genera and inter-species) was calculated with OrthoANI. Across genomes in the bacterial domain, inter-species ANI values ranged from 61.39% to 98.23%, with a median of 75.12% and a mean of 75.1%; inter-genera values ranged from 61.64% to 88.05%, with a median of 68.57% and a mean of 69.2%; inter-families values ranged from 60.13% to 81.24%, with a median of 65.49% and a mean of 66.31%; and inter-orders values ranged from 59.23% to 79.42%, with a median of 66.05% and a mean of 65.94%. The average amino acid identity (AAI) between taxa was calculated with CompareM. Inter-species AAI values ranged from 43.77% to 98.57%, with a median of 71.57% and a mean of 71.80%; inter-genera values ranged from 45.23% to 91.03%, with a median of 61.70% and a mean of 62.02%; inter-families values ranged from 42.57% to 74.18%, with a median of 54.43% and a mean of 55.02%; and inter-orders values ranged from 41.88% to 60.88%, with a median of 51.97% and a mean of 52.07%.

A comparison of the ANI and AAI results shows that inter-species ANI and AAI values fluctuate over the widest range, while the inter-orders data are the most concentrated. Both ANI and AAI overlap to some degree between taxonomic levels, but the overlap of AAI among the inter-genera, inter-families and inter-orders comparisons is significantly lower than that of ANI, so AAI has higher resolution for higher taxa. Correspondence analysis of ANI and AAI values shows that, as taxonomic rank increases, the linear correlation between the two weakens and the ratio between them shows a downward trend.

Genomic diversity was further analysed among the taxa of three classes with high species diversity: Alphaproteobacteria, Actinobacteria and Bacilli. In the class Alphaproteobacteria, the average ANI values for inter-orders, inter-families, inter-genera and inter-species comparisons were 65.98%, 68.08%, 70.95% and 77.45% respectively, and the average AAI values were 51.76%, 56.79%, 64.42% and 74.49% respectively. In the class Actinobacteria, the average ANI values were 72.20%, 69.21%, 67.91% and 76.76% respectively, and the average AAI values were 52.61%, 56.57%, 62.41% and 71.7% respectively. In the class Bacilli, the average ANI values were 63.33%, 64.50%, 66.34% and 70.91% respectively, and the average AAI values were 49.87%, 53.36%, 58.42% and 67.55% respectively. Further analysis showed that the difference in G+C content was most pronounced within genera. In Alphaproteobacteria, AAI values showed clear boundaries at the family (60%) and order (54%) levels. In Actinobacteria and Bacilli, the ranges of ANI and AAI values for inter-families and inter-orders comparisons overlapped extensively. In the genome identity analysis of Alphaproteobacteria and Bacilli, inter-species ANI and AAI values showed a strong linear correlation (R2?0.88). By combining the species tree, the genome relatedness indices (ANI and AAI values), 16S rRNA gene phylogeny and similarity values, and genome information, reclassification was proposed for species whose current taxonomic status is problematic.

In summary, this study systematically analysed the differences in whole-genome sequences among strains of different taxa in the bacterial domain at the nucleotide and amino acid levels. It provides a basis for the genomic classification of different taxa in the bacterial domain and a taxonomic reference for species whose taxonomic status is difficult to determine in metagenome research.
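
The quality filter and the per-rank summaries reported in the abstract (range, median and mean of identity values for each comparison level) can be illustrated with a minimal Python sketch. It is not the thesis's actual pipeline: the function names, the data layout and the toy genomes are hypothetical, the identity values are assumed to have been computed beforehand by a tool such as OrthoANI or CompareM, and the rule that a pair is labelled by the highest rank at which its lineages differ is an assumption about how the comparison levels were defined.

```python
from collections import defaultdict
from statistics import mean, median

# Labels follow the abstract's wording; the rank order runs from high to low.
LEVELS = [("order", "inter-orders"), ("family", "inter-families"),
          ("genus", "inter-genera"), ("species", "inter-species")]


def passes_quality(completeness, contamination):
    """Thresholds stated in the abstract: completeness > 95%, contamination < 5%."""
    return completeness > 95.0 and contamination < 5.0


def comparison_level(tax_a, tax_b):
    """Label a genome pair by the highest rank at which the two lineages differ.

    Assumption: this is how "inter-species", "inter-genera", etc. are assigned.
    Returns None when both genomes belong to the same species.
    """
    for rank, label in LEVELS:
        if tax_a[rank] != tax_b[rank]:
            return label
    return None


def summarise_by_level(genomes, identities):
    """Group pairwise identity values (in %) by comparison level and summarise them."""
    kept = {name for name, g in genomes.items()
            if passes_quality(g["completeness"], g["contamination"])}
    grouped = defaultdict(list)
    for (a, b), value in identities.items():
        if a in kept and b in kept:
            level = comparison_level(genomes[a]["taxonomy"], genomes[b]["taxonomy"])
            if level is not None:
                grouped[level].append(value)
    return {level: {"min": min(v), "max": max(v), "median": median(v), "mean": mean(v)}
            for level, v in grouped.items()}


def lineage(order, family, genus, species):
    return {"order": order, "family": family, "genus": genus, "species": species}


# Toy, made-up data purely for illustration (not values from the thesis).
genomes = {
    "A1": {"completeness": 99.0, "contamination": 0.5,
           "taxonomy": lineage("O1", "F1", "G1", "G1 a")},
    "A2": {"completeness": 98.0, "contamination": 1.0,
           "taxonomy": lineage("O1", "F1", "G1", "G1 b")},
    "B1": {"completeness": 97.5, "contamination": 2.0,
           "taxonomy": lineage("O1", "F1", "G2", "G2 a")},
    "C1": {"completeness": 90.0, "contamination": 1.0,  # fails the completeness cut-off
           "taxonomy": lineage("O1", "F2", "G3", "G3 a")},
}
identities = {("A1", "A2"): 80.0, ("A1", "B1"): 70.0,
              ("A2", "B1"): 72.0, ("A1", "C1"): 65.0}

summary = summarise_by_level(genomes, identities)
for level, stats in summary.items():
    print(level, stats)
```

In this toy run the genome C1 is dropped by the quality filter, so no inter-families comparison is produced; the remaining pairs fall into the inter-species and inter-genera groups.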
Keywords/Search Tags: bacterial domain, type strain, whole genome sequence, average nucleotide identity, average amino acid identity