| Yersinia pestis is the causative agent of plague. In recorded history, three plague pandemics have killed millions of people and brought about major social changes. To date, Y. pestis has been found in more than 200 species of wild rodents inhabiting plague foci on every continent except Australia and Antarctica. Because of its characteristics, Y. pestis is included in the select list of bioterrorism-related agents. Human plague has been well controlled in China since the 1950s, but at least 12 types of natural plague foci still exist, covering 241 counties in 15 provinces. It is therefore important to gain further insight into Y. pestis.

Experiments for investigating the genomic polymorphisms of Y. pestis

The high degree of polymorphism in bacterial genomes arises from interactions among the environment, the hosts, the vectors, and the pathogen, with natural selection acting on the accumulation of small changes within the genome. The principal molecular mechanisms behind bacterial genomic polymorphism are single nucleotide mutation, lateral gene transfer and deletion, genome rearrangement, and short repeat polymorphism. Whole-genome sequencing and comparative genomics can reveal the diversity between strains in full detail. However, it is impractical to sequence the whole genome of every isolate of interest. We therefore used SNP (single nucleotide polymorphism), IS (insertion sequence), DFR (different region), MLVA (multiple-locus VNTR analysis), and CRISPR (clustered regularly interspaced short palindromic repeats) methods to scan the genomic polymorphisms of Y. pestis, choosing the methods according to the mechanisms by which the polymorphisms arise. The experimental methods used in this research were PCR amplification, agarose gel electrophoresis, sequencing, and bioinformatics analysis.

In this research, we obtained data for 45 SNP loci in 126 Y. pestis isolates, 39 IS loci in 84 isolates, 23 DFRs in 909 isolates, 23 MLVA loci in 366 isolates, and 3 CRISPR loci in 131 isolates. Together these provide the fundamental genomic polymorphism data for Chinese isolates. A comparison of discriminatory power, reproducibility, and operating time showed that MLVA was the most suitable method for rapid identification and source tracking of Y. pestis. In addition to the Chinese isolates, we obtained CRISPR data for isolates from the former Soviet Union and Mongolia through international cooperation. The distribution of spacers and their arrays in Y. pestis strains proved to be clearly region-specific, which led us to construct a hypothetical evolutionary model of Y. pestis. This model suggests that the main transmission route of Y. pestis encircled the Takla Makan Desert and the ZhunGer Basin.

Design and construction of the genomic polymorphism database and associated software

Large amounts of genomic polymorphism data can be used efficiently once they are stored in a purpose-built database. Taking into account the different data formats of the SNP, IS, DFR, MLVA, and CRISPR results, we designed and built a multi-module database that holds the five groups of polymorphism data together with the background information of the associated strains. The modules are linked and can invoke one another. The DBMS (database management system) supports adding, editing, deleting, and searching records. The database is easy to maintain and update, and it can supply data to other programs with good compatibility.

The experimental polymorphism results, such as the nucleic acid sequences produced by SNP analysis, cannot be analyzed directly without conversion. Processing these data manually takes a great deal of time, often more than the experiments themselves. We therefore wrote a series of software programs to deal with these problems. 
With this software, the experimental results can be quickly converted into standard-format data for easy comparison, analysis, and storage. The software can also identify differences at polymorphic loci between unknown isolates and the reference isolates stored in the database.

Based on the characteristics of VNTR loci, we designed the core algorithms for clustering analysis of MLVA results. On the basis of these algorithms and a machine-learning approach, we then built the Y. pestis source-tracking system, which can draw phylogenetic trees and compute the SI (similarity index). To ensure accurate source tracking, a standard experimental operating guide was integrated into the system. The tree-building function can readily be used for typing and evolutionary analysis of any pathogen.

The database and all the software were designed and built for simplicity. The software uses standard Windows interfaces, and most functions follow a "one-button" principle.

Conclusion

The genomic polymorphism data of Chinese Y. pestis isolates were acquired and stored in a purpose-built database. This information, together with the rapid source-tracking system, will help to trace the source of outbreaks and to implement effective prevention and treatment countermeasures during disease epidemics or bioterrorism attacks. The database and associated software are highly compatible and can easily be extended to other pathogens, providing a new approach to the identification, typing, and evolutionary study of bacteria. The strains collected in this research are geographically distributed across all counties, and the fundamental data stored in the database cover almost all important polymorphic loci in the Y. pestis genome. These data will strongly support research on the transmission and adaptive evolution of this bacterium. |
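
The abstract does not define the SI or describe the clustering algorithms, so the following is only a minimal sketch of how an MLVA-based comparison against stored reference isolates might look. It assumes, purely for illustration, that the SI is the fraction of VNTR loci at which two isolates carry the same repeat copy number. The function names, isolate names, and four-locus profiles are hypothetical and are not data from this study.

```python
from typing import Dict, List, Tuple

# Repeat copy number at each VNTR locus, listed in a fixed locus order.
Profile = Tuple[int, ...]


def similarity_index(a: Profile, b: Profile) -> float:
    """Fraction of loci with identical copy numbers (an assumed definition of SI)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and cover the same loci")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def rank_references(
    unknown: Profile, references: Dict[str, Profile]
) -> List[Tuple[str, float]]:
    """Return reference isolates sorted from most to least similar to the unknown."""
    scored = [
        (name, similarity_index(unknown, profile))
        for name, profile in references.items()
    ]
    # Sort by descending SI; break ties alphabetically for a stable result.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical four-locus profiles, for illustration only.
references = {
    "ref_A": (5, 3, 7, 2),
    "ref_B": (5, 3, 6, 2),
    "ref_C": (4, 1, 6, 9),
}
unknown = (5, 3, 7, 9)
ranking = rank_references(unknown, references)
```

Here `ranking` lists the reference isolates by decreasing similarity to the unknown isolate, which is the kind of lookup a source-tracking query would perform. A real system would use all 23 MLVA loci and would probably weight or cluster the profiles rather than rely on a simple match count.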