| Background: Syphilis, caused by Treponema pallidum, is a chronic systemic sexually transmitted disease. The mechanism of persistent infection has not yet been established. The highly variable tprK gene of T. pallidum has been acknowledged as one of the mechanisms that cause persistent infection. Early studies on the heterogeneity of the tprK gene used rabbit-derived strains as research objects, and few studies have focused on the actual variable characteristics of the tprK gene during natural human infection. Moreover, previous studies adopted a clone-based Sanger sequencing approach, a traditional method that is insufficient for analyzing this highly diverse gene. Methods: This study collected 28 lesions from patients with primary and secondary syphilis who were admitted to the outpatient clinic of Zhongshan Hospital, School of Medicine, Xiamen University. We then employed a more sensitive and reliable approach, next-generation sequencing (NGS), to explore the tprK gene of T. pallidum directly from clinical syphilis patient samples instead of rabbit-derived samples and to gain better insight into the profile of tprK in the context of human infection. Results: During natural human infection, each V region of tprK contained a mixture of distinct sequences, consisting of a predominant sequence and numerous minor variants. Most variants occurred at low frequencies of 1-5%, far below the detection limit of Sanger sequencing. Compared with the strains from primary syphilis samples, the strains from secondary syphilis samples carried more variants within the seven variable regions of tprK, and the frequencies of the predominant sequences within each variable region were generally lower, with an increased distribution of minor variants at frequencies of 10-60%. Interestingly, when the lengths of the distinct sequences obtained for each variable region were analyzed, the sequence lengths within a region differed by only 3 bp or multiples of 3 bp. In addition, amino acid sequence consistency within each variable region was found among the 28 strains. Among the regions, the predominant amino acid sequences IASDGGAIKH and IASEDGSAGNLKH (frequency above 80%) in V1 were relatively stable and were shared among a high proportion of patients. Conversely, the amino acid sequences in V6 demonstrated remarkable variability at both the intra- and inter-patient levels. Conclusion: The seven variable regions of the tprK gene demonstrated high diversity during the course of syphilis infection; they generally contained one high-proportion sequence and numerous low-frequency minor variants, most of which are far below the detection limit of Sanger sequencing. The characteristic profile of tprK differed between primary and secondary infection, demonstrating that throughout the development of the disease, T. pallidum constantly varies its tprK gene to obtain the best adaptation to the host. The rampant variation in each variable region was regulated by a strict gene conversion mechanism that kept length differences to 3 bp or multiples of 3 bp to ensure expression of a normal TprK. Interestingly, the tprK gene always maintains one feature throughout these ongoing variations in the course of infection: a relatively conserved region (V1) and a highly diverse region (V6). These findings could provide important information for unveiling the role of tprK in persistent syphilis infection and for further exploring potential vaccine components. |