Tuberculosis (TB) is one of the leading infectious diseases threatening human health. According to WHO reports, there were 8.7 million new TB cases in 2011, of which 310 thousand were multidrug-resistant tuberculosis (MDR-TB), and TB killed 1.4 million people globally. China is one of the 22 countries with the highest TB burden, with the second-largest population of TB patients and the largest number of MDR-TB cases. The Mycobacterium tuberculosis (Mtb) Beijing family is the major pathogen, causing most TB cases in China. This family, found mainly in East and South Asia, has become the most widely expanded clade over the last century owing to the migration of Asian populations around the world.

In a previous study, our group further divided the Beijing family into 7 subgroups based on the SNP genotypes of 3R (Replication, Recombination, and Repair) genes. One of the recently evolved subgroups, Bmyc10, is the predominant subgroup, causing about 80% of TB cases in China, while its phylogenetically adjacent counterpart, Bmyc26, is found in less than 1% of cases. Why these two closely related subgroups pose such disparate epidemic threats is the first question that drew our interest.

To reveal the molecular mechanism underlying the epidemic of the predominant subgroup, we performed high-throughput genome sequencing on 35 representative strains of 5 Beijing subgroups, selected on the basis of the 3R gene SNPs and 16-VNTR genotypes of 1448 strains from 6 Chinese counties. Combining these with 4 published Beijing genomes of worldwide origin and the standard strain Mycobacterium tuberculosis H37Rv, we constructed a phylogenetic tree of the Beijing subgroups and identified Bmyc10-specific SNPs and indels. Functional similarity analysis of the mutated genes revealed a predilection for cellular locations in the cell membrane, the cell wall, and other regions that interact frequently with the host.
Through conserved-site and positive-selection analyses, we identified 11 genetic mutations with potentially substantial consequences. With reference to transcriptome data of Mtb under 134 stresses, we found that the mutated genes were induced by free fatty acids and reactive oxygen/nitrogen damage, conditions that resemble the intracellular niche of host macrophages. We therefore concluded that Bmyc10-specific mutations might act collaboratively to enhance the fitness of Mtb under stresses from host macrophages. By accumulating these adaptive mutations, Bmyc10 paved its way to expansion.

Epidemiological studies suggest that acquired drug resistance drives TB epidemics in some areas, although the mechanism is not comprehensively understood. In the Second Chapter of the present thesis, we studied the in vivo micro-evolution of Bmyc10 strains during clinical treatment, with emphasis on the evolution of acquired drug resistance.

To study the development of Mtb from pan-sensitive to MDR strains under treatment stress, we selected 7 sputum samples from 3 patients at the Shanghai Municipal Center of Disease Prevention and Control (Shanghai CDC). Using a likelihood ratio test (LRT)-based analytical protocol, we found 8 to 41 unfixed mutations with frequencies over 5% in each sequenced sample, suggesting considerable genetic diversity in in vivo Mtb populations. More comprehensive analysis of drug resistance genes showed that as many as 4 to 5 mutations emerged during the development of drug resistance, indicating that Mtb can adopt multiple strategies to cope with multi-drug treatment. As treatment went on, only 1 resistance-conferring mutation was selected, which suggests competition (clonal interference) among different resistant mutants. The discovery of clonal interference in Mtb emphasizes the importance of early diagnosis and treatment in preventing acquired drug resistance.
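The LRT-based screening for unfixed mutations described above can be illustrated with a minimal sketch. It is not the protocol used in the thesis, whose details are not given here. It assumes a simple binomial model in which the null hypothesis is that all variant reads come from sequencing error. The function names, the 1% error rate, the 95% upper frequency bound for "unfixed", and the chi-square cutoff are all illustrative assumptions; only the 5% lower frequency bound comes from the text.

```python
import math

# Chi-square critical value (1 degree of freedom, alpha = 0.05); assumed cutoff.
CHI2_1DF_CRIT_05 = 3.841


def binom_loglik(k, n, p):
    """Binomial log-likelihood of k variant reads out of n, up to a constant."""
    p = min(max(p, 1e-12), 1 - 1e-12)  # clamp to avoid log(0)
    return k * math.log(p) + (n - k) * math.log(1 - p)


def lrt_statistic(alt_reads, depth, error_rate=0.01):
    """2 * (logL at observed frequency - logL at the assumed error rate)."""
    p_hat = alt_reads / depth
    if p_hat <= error_rate:
        return 0.0  # no evidence beyond sequencing error
    return 2.0 * (binom_loglik(alt_reads, depth, p_hat)
                  - binom_loglik(alt_reads, depth, error_rate))


def is_unfixed_variant(alt_reads, depth, error_rate=0.01,
                       min_freq=0.05, max_freq=0.95):
    """Flag a site as an unfixed variant.

    min_freq follows the 5% threshold in the text; error_rate and max_freq
    are hypothetical parameters chosen for this sketch.
    """
    freq = alt_reads / depth
    if not (min_freq <= freq <= max_freq):
        return False
    return lrt_statistic(alt_reads, depth, error_rate) > CHI2_1DF_CRIT_05


if __name__ == "__main__":
    # Hypothetical site: 10 variant reads at 100-fold depth.
    print(is_unfixed_variant(10, 100))
```

A real pipeline would typically also account for base quality, strand bias, and multiple testing across the genome, none of which this sketch attempts.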
In addition, we found 19 mutations with significantly changed frequencies, 14 of which exhibited in vivo adaptive potential for drug-resistant Mtb.

The genetic diversity of in vivo Mtb populations is of great importance for theoretical studies. Its clinical relevance, however, remains to be explored. In the study of Chapter Two, we also found that effective treatment may reduce the number of characteristic SNPs in sputum while ineffective treatment does the opposite, suggesting a correlation between sputum genetic diversity and the effectiveness of clinical treatment. As costs fall, whole-genome sequencing might replace current sputum tests and become a routine test for planning treatment schedules and assessing their effectiveness. Therefore, in the Third Chapter we focused on the quantitative assessment of Mtb genetic diversity and its relevance to treatment outcome.

To explore the connection between treatment effectiveness and sputum genetic diversity, we selected a retreatment case of MDR-TB from the database of Henan Chest Hospital. The sputa collected at Weeks 0, 2, 4, 6, 8 and 16 were subjected to ultra-deep genome sequencing at more than 1,000-fold coverage. The medical record indicated that the patient received ineffective treatment in the first 4 weeks, when the drug resistance profile was unclear. During the fifth week, the administration of amikacin, a second-line drug, successfully lowered the bacterial burden. Consistent with the changes in treatment schedule, we found a steadily rising number of characteristic mutations, from 33 to 50 SNPs per 1,000-fold depth, in the 3 sputa from the first 4 weeks, followed by a decrease to 25 SNPs per 1,000-fold depth in the sixth week, a trend that lasted to Week 16. This result suggests that the correlation between treatment effectiveness and sputum genetic diversity might be valid.
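The depth-normalized diversity measure used above (SNPs per 1,000-fold depth) makes samples sequenced at different depths comparable. The sketch below shows one plausible reading of that normalization, dividing the raw SNP count by the mean depth expressed in thousands. The exact definition used in the thesis is not stated here, and the function name and the example counts and depths are hypothetical.

```python
def snps_per_kfold_depth(snp_count, mean_depth):
    """Normalize a raw SNP count to SNPs per 1,000-fold sequencing depth.

    Assumes a simple linear normalization; the thesis may define it differently.
    """
    if mean_depth <= 0:
        raise ValueError("mean_depth must be positive")
    return snp_count / (mean_depth / 1000.0)


if __name__ == "__main__":
    # Hypothetical samples: (raw SNP count, mean sequencing depth).
    samples = {"sample_a": (66, 2000), "sample_b": (75, 1500)}
    for name, (count, depth) in samples.items():
        print(name, snps_per_kfold_depth(count, depth))
```

A linear normalization is the simplest choice; because the number of detectable low-frequency variants need not grow linearly with depth, downsampling all samples to a common depth would be an alternative design.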
By identifying genomic mutations with significantly changed frequencies, we also found a synonymous mutation whose frequency increased steadily. While leaving the protein unchanged, this mutation increased the codon preference for aspartate twofold, indicating that it may affect protein expression. We also found that the bactericidal effect of amikacin lasted for less than a week. Since no resistance-conferring mutations emerged during the treatment, we conclude that phenotypic resistance was the leading cause of treatment failure. This result also suggests the utility of high-throughput sequencing for detecting and assessing phenotypic resistance, as well as for guiding personalized medicine.

Using high-throughput sequencing, we studied the micro-evolution of the Bmyc10 subgroup on different time scales. We also explored the feasibility of genome sequencing for guiding treatment schedules and assessing their effectiveness. The conclusions of our study may not only advance the study of Mtb micro-evolution but also open a new path for the clinical application of high-throughput sequencing.