| Vibrio parahaemolyticus is a halophilic, Gram-negative bacterium that is widely distributed in water, sediments and aquatic organisms. It is an important foodborne pathogen that frequently causes food poisoning worldwide, mostly in coastal areas during summer. Thermostable direct haemolysin (TDH) and TDH-related haemolysin (TRH) are the key virulence factors of V. parahaemolyticus. V. parahaemolyticus can cause gastroenteritis, the main clinical manifestation, which is self-limiting, as well as wound infection and sepsis. V. parahaemolyticus serotype O4:K12 mainly causes outbreaks on the Pacific coast of the northwestern United States. In 2012, this serotype also caused outbreaks in Spain. The sequence type is ST36. V. parahaemolyticus serotype O4:K12 has also caused outbreaks in China, and outbreaks of this serotype have been reported in Japan, Vietnam and Chile. However, the genetic characteristics of V. parahaemolyticus serotype O4:K12 in China, and the relationship between O4:K12 strains isolated in China and those from other regions of the world, especially the United States, have not been investigated. In this study, 25 strains of V. parahaemolyticus serotype O4:K12 isolated in China were selected for whole-genome sequencing on the Illumina HiSeq 2000 platform. The reads were assembled with SPAdes. To study the genetic diversity of V. parahaemolyticus serotype O4:K12, the genome sequences of 4 O4:K12 strains were obtained from a public database and added to the newly sequenced genomes for core-genome and pan-genome analysis. Core genome analysis: The sequences of seven housekeeping genes were extracted from the genome data of the 29 V. parahaemolyticus serotype O4:K12 strains for MLST analysis. A total of 6 STs were detected among the 29 strains. These STs were dispersed in the minimum-spanning tree constructed with GrapeTree and formed 4 clusters. Using V. parahaemolyticus RIMD2210633 as the reference strain, MUMmer was 
used to detect single nucleotide polymorphisms (SNPs) in the core genome of the strains, and phylogenetic trees constructed with RAxML/NJtree were used to study the population structure. Approximately 250,000 SNPs, detected in the core genome of 183 V. parahaemolyticus strains of various serotypes isolated worldwide, were used to construct a neighbor-joining phylogeny. In this phylogeny, the 29 V. parahaemolyticus serotype O4:K12 strains in this study formed 5 distant clusters. To further investigate the relationships among these 29 strains, we constructed a maximum-likelihood phylogeny from about 70,000 SNPs. Five lineages were observed. The V. parahaemolyticus serotype O4:K12 strains isolated in America and Asia belonged to different lineages and were distantly related. To investigate the diversity of the O4:K12 strains in Lineage 5, we constructed a maximum-likelihood phylogeny of these strains from 552 SNPs using the method described above, and the strains in this lineage were further divided into 4 sub-lineages. We used ClonalFrameML to analyze the clonal relationships and the effect of recombination in V. parahaemolyticus serotype O4:K12. The ClonalFrameML analysis showed that on the long branches of the phylogeny more than 50% of sites were often recombinant, whereas within the cluster of 22 closely related genomes fewer than 5% of sites were recombinant. The relative effect of recombination versus mutation was r/m = (R/theta) × delta × nu = 1.74. Pan genome analysis: Gene prediction and annotation were performed with Prokka, and pan-genome analysis was performed with Roary. The pan-genome analysis showed that strain-specific and cluster-specific genes were mostly located in the genomic islands found in the 29 V. parahaemolyticus serotype O4:K12 strains, and that the vast majority of the accessory genes were acquired from within the genus Vibrio. The American strains carried specific genomic islands. The PCA 
analysis on accessory gene content clearly distinguished the strains isolated in America from those isolated in Asia. The distribution of virulence genes, type III secretion systems and virulence islands in the 29 V. parahaemolyticus serotype O4:K12 strains was determined using BLAST and visualized using a Perl script and EasyFig. All 29 strains were tlh-positive; vp100057 and vp110079 were tdh-negative and trh-negative, vp060467 was tdh-positive and trh-negative, and the remaining 26 strains were tdh-positive and trh-positive. These 26 strains (all except vp060467, vp100057 and vp110079) contained the urease gene cluster, consistent with the distribution of trh among the strains. VPaI-1, VPaI-4 (except in vp060467), VPaI-5 and VPaI-6 were absent from the 29 strains; 5 variant types of VPaI-2 and 7 variant types of VPaI-3 were identified; VPaI-7 was present in vp060467, vp100057 and vp110079, and MAVP-QPI was present in the remaining 26 strains. Multiple distinct genetic lineages exist within V. parahaemolyticus serotype O4:K12. Recombination is frequent between different genetic lineages. V. parahaemolyticus serotype O4:K12 strains isolated in America and Asia are genetically isolated from each other. The genes of the accessory genome are mostly from within the genus Vibrio. |
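
The relative effect of recombination versus mutation quoted above, r/m = (R/theta) × delta × nu, can be illustrated with a minimal Python sketch. It assumes the usual ClonalFrameML reading of the three terms: R/theta is the ratio of recombination to mutation rates, delta is the mean length of an imported fragment, and nu is the mean divergence of imported DNA. The function name and the input values are hypothetical and chosen only for illustration; they are not the estimates obtained in this study, which reports only the product (1.74).

```python
def relative_effect_of_recombination(r_over_theta: float, delta: float, nu: float) -> float:
    """Return r/m = (R/theta) * delta * nu.

    r_over_theta: ratio of recombination rate to mutation rate
    delta: mean length (bp) of a recombinant import
    nu: mean divergence (substitutions per site) of imported DNA
    """
    if min(r_over_theta, delta, nu) < 0:
        raise ValueError("all parameters must be non-negative")
    return r_over_theta * delta * nu


if __name__ == "__main__":
    # Hypothetical parameter values, NOT the ones estimated in the study.
    r_m = relative_effect_of_recombination(r_over_theta=0.5, delta=100.0, nu=0.02)
    print(f"r/m = {r_m:.2f}")
```

A value of r/m above 1, such as the 1.74 reported here, means that recombination introduced more substitutions than point mutation did.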