| Klebsiella pneumoniae is a rod-shaped, nonmotile, Gram-negative bacterium that was first isolated by Friedländer in 1882 from the lung tissue of a patient with lobar pneumonia. It is an opportunistic pathogen that readily colonizes the respiratory and intestinal tracts of hospitalized patients and can cause pneumonia, meningitis, liver abscess, endophthalmitis, urinary tract infection, wound infection, and systemic sepsis, all of which complicate clinical treatment. The recent spread of multidrug-resistant K. pneumoniae has made these infections much harder to treat, and research interest in the species has grown accordingly. K. pneumoniae is also used industrially to produce important platform chemicals, such as 1,3-propanediol and 2,3-butanediol, by fermenting glycerol; because of its high yield and productivity, it is regarded as the most promising wild-type microorganism for industrial 1,3-propanediol production. K. pneumoniae comprises multiple lineages of diverse phylogenetic origin, and the connection between its phylogeny and pathogenicity is still under investigation and remains an active research topic. With the development of second-generation high-throughput sequencing, the number of sequenced K. pneumoniae genomes is growing rapidly, which makes it possible to analyze the phylogeny of the species with large-scale genomic data and to resolve the evolutionary relationships among its lineages more comprehensively. However, the phylogenetic analysis methods currently in use cannot meet the demands of data at this scale, so a reliable and accurate method for inferring the phylogeny of K. pneumoniae is urgently needed. Based on an analysis of the composition and characteristics of the core genes of K. pneumoniae, we established a new
phylogenetic analysis method and compared it with current methods to verify its reliability and accuracy. We then used this method to perform a phylogenetic analysis of K. pneumoniae based on large-scale genomic data. The main results are as follows: (1) The core genome of K. pneumoniae is about 3.0 Mbp in size and contains 2,735 core genes. Analysis of their composition showed that most core genes are endogenous and only a minority are of foreign origin. Most core genes are also conserved, with low evolutionary rates; most are between 500 bp and 1,500 bp long, and only a small fraction are virulence genes. (2) Based on the analysis of core genes and core-genome SNPs, we devised four new phylogenetic analysis methods, each of which selects 50 core genes: the random selection method, the GC content deviation rate selection method, the mutation rate selection method, and the comprehensive selection method. (3) Comparing the phylogenetic trees produced by these methods, we found that the trees based on the random, GC content deviation rate, and mutation rate selection methods are less accurate than the tree based on the SNP matrix, which indicates that these three methods lack sufficient resolution and are unsuitable for the subsequent phylogenetic analysis. In contrast, the tree based on the comprehensive selection method is more accurate than the SNP-matrix tree and has very high bootstrap values, so we infer that the comprehensive selection method is more reliable and accurate than the methods currently in use. (4) We confirmed that the new method is not only accurate and reliable but also convenient and applicable to large-scale data, and we used it to perform a phylogenetic analysis of 950 K. pneumoniae genomes. In addition, the new
method was shown to be applicable to species delineation within the genus Klebsiella: by including the type strains of the species and subspecies of the genus, we identified the taxonomic affiliation of the non-K. pneumoniae clades in the phylogenetic tree. (5) We confirmed that all lineages of K. pneumoniae except CG258 are evolving independently. We also showed that ST395 isolates not only belong to CG258 but may have originated from early ST11 isolates, and that ST11 is likely the common ancestor of CG258. In addition, we identified three new CG258 members (ST340, ST437, and ST855). (6) We investigated the epidemiology of K. pneumoniae and found that ST258 is the fastest-spreading and most widely distributed lineage, whereas hypervirulent lineages spread relatively slowly, and that the evolution of these lineages is relatively independent of the virulence of K. pneumoniae. |