| Objective: Acquired Immune Deficiency Syndrome (AIDS) is a severe epidemic disease caused by Human Immunodeficiency Virus (HIV) infection; it is incurable and has no vaccine. In 2014, UNAIDS proposed "90-90-90" as the goal of global AIDS control. As of the end of 2019, China’s "90-90-90" figures for AIDS prevention and treatment were 75.7%, 89.7%, and 95.3%, respectively, a remarkable result. The main transmission route of HIV has shifted to sexual transmission, and the epidemic among Men Who Have Sex with Men (MSM) is severe. About one-third of infected people are undetected, and the prevention and treatment situation for key populations remains serious. Currently, HIV molecular transmission network analysis is mainly applied to (1) reveal the key populations in the network to guide precise interventions; (2) reveal the direction of virus transmission between regions and the key factors driving the spread of the epidemic; (3) evaluate the effectiveness of interventions to control virus transmission; and (4) predict the future trend of the HIV epidemic. Molecular transmission network analysis can evaluate the transmission risk of infected individuals with various indicators: number of Links, degree, Transmission Network Score (TNS), number of newly infected individuals in the network, etc. A Link is the connection between two nodes, and degree is the number of edges connecting a node to other nodes in the network; the greater the degree of an individual, the more potential transmission partners that individual has. The TNS is calculated from the degree and ranges from 0 to 1; the higher the score, the higher the potential transmission risk. Newly infected individuals have higher viral loads and therefore a greater likelihood of transmission, so the higher the number of newly infected individuals, the more active HIV transmission is in the network. In addition, the regeneration index of molecular clusters, and the regeneration index combined with the proportional detection rate and
other network parameters can also be used to determine the risk of transmission in the network. Epidemiological surveys are used to study the distribution of diseases, health status and health events and their determinants. Information collection is the key to epidemiological surveys; questionnaires and statistical analysis of their results provide accurate and reliable information for research. Epidemiological analysis of transmission risk also has limitations: information exposure carries a risk of privacy leakage; influenced by traditional beliefs and social discrimination, infected people may conceal important personal information such as exposure history; and risk factors obtained from epidemiological surveys ignore the heterogeneity of infected individuals (such as subtype and time of infection), using the mean of the study population to obtain overall rather than individual-specific risk factors. It is worth noting that previous molecular network studies focused mainly on subtype B strains, whereas the HIV strains prevalent in China differ from those in other countries: genotypes in China include CRF07_BC (41.3%), CRF01_AE (32.7%), CRF08_BC (11.3%), subtype B (4.0%) and multiple CRFs and URFs. The epidemiological characteristics of the clustered population in the molecular network are not clear, and there is no simple, easy-to-use model for assessing the risk of HIV transmission. In this study, a risk model was built by combining the molecular transmission network with epidemiological factors, making full use of individuals' sequence information; the included variables covered the socio-demographic information and sexual behavior of the study subjects. The risk model was used to characterize the risk of MSM clustering, to understand factors associated with clustering, to predict the likelihood of clustering in infected individuals and to guide targeted interventions. Methods: 1. Subjects:
Newly diagnosed HIV-infected patients in Shenyang from 2016 to 2018. Plasma samples and baseline information were collected from patients before antiretroviral therapy. Questionnaires were used to collect patient demographics (gender, age, educational status, etc.) and behavioral information (number of sexual partners, condom use, etc.). The study was approved by the Ethics Committee of the First Hospital of China Medical University. 2. Nested PCR amplification of the HIV-1 pol gene sequence: The HIV-1 pol region gene fragment (HXB2: 2352-3352) was amplified by nested PCR from 140 μl of HIV-1-positive plasma, processed according to the standard protocol of the QIAamp RNA Mini Kit. 3. Sequencing and sequence alignment: The sequences obtained were assembled using the ContigExpress component of the Vector NTI 8.0 software package. The sequences were aligned with the HIV Database online tool HIValign and edited manually. Sequences covering amino acids 1-99 of the protease region and amino acids 1-234 of the reverse transcriptase region, with a length of 1000 bp (HXB2: 2253-3252), were retained. 4. Viral sequence subtype identification and classification: FastTree 3.0 was used to construct a Maximum Likelihood phylogenetic tree (ML tree) with subtype N as the outgroup, using the GTR+G+I nucleotide substitution model. The Shimodaira-Hasegawa-like test (SH test) was used to calculate the support values of the tree nodes, and a support value >90 was used as the criterion for determining the viral genotype. 5. Construction of the molecular transmission network and transmission risk evaluation: A genetic distance threshold sensitivity analysis was performed to obtain the genetic distance thresholds for CRF01_AE, CRF07_BC and subtype B. The online tool HIV-TRACE (http://hivtrace.datamonkey.org/hivtrace) was applied to construct the molecular transmission network of the main prevalent strains
CRF01_AE, CRF07_BC and subtype B in Shenyang. 6. Collection of epidemiological data and transmission risk evaluation: A self-designed questionnaire was used, covering socio-demographic and sexual behavior characteristics. Survey respondents were identified by a unique patient identification number (PID). Questionnaire data collection, summarization and quality control were performed according to the study protocol. 7. Establishment of the model: SPSS 26.0 was used to analyze the data. Rates and composition ratios were used for description, and the chi-square test was used to compare differences in distribution; the number of Links within molecular clusters (Link=2) was used as the dependent variable, and the model was built by univariate logistic regression followed by multivariate logistic regression using the Enter method. 8. Evaluation of the model: The goodness of fit of the model was evaluated by the Hosmer-Lemeshow (H-L) test, and its discrimination was evaluated using the C-statistic. The AUC was used to judge the discriminative ability of the model, and the diagnostic odds ratio (DOR) was used to judge its discriminative effect; the Bootstrap resampling technique was used for internal validation of the model. Results: 1. Composition of the molecular transmission network of newly diagnosed HIV-infected patients in Shenyang from 2016 to 2018: The main prevalent strains in Shenyang (2016-2018 data) were CRF01_AE (70.9%, 1542/2174), CRF07_BC (18.1%, 394/2174) and subtype B (4.5%, 97/2174), as well as 13 subtypes and recombinant forms. 2. Composition of the molecular transmission network of HIV-infected patients in the Shenyang region from 2016 to 2018: Sensitivity analysis yielded an optimal genetic distance threshold of 0.007 substitutions/site for CRF01_AE and CRF07_BC, and 0.013 substitutions/site for subtype B. A total of 239 molecular clusters were formed by 861 of the 2032 sequences, with a cluster formation rate of 42.4% (861/2032) and
a molecular cluster size range of 2 to 77 individuals. CRF01_AE, CRF07_BC and subtype B had cluster formation rates of 40.0% (617/1541), 49.2% (194/394) and 51.6% (50/97), respectively. 3. Model building and evaluation: 3.1 Basic characteristics of modeling subjects: A total of 385 risk questionnaires were completed. Among the respondents, 68.6% (264/385) were individuals within molecular clusters and 31.4% (121/385) were individuals outside molecular clusters. Of the population participating in the questionnaire survey, 85.2% (328/385) were MSM, and these 328 MSM were analyzed. 3.2 Risk factors in the MSM transmission risk prediction model: Factors associated with the risk of transmission among MSM were: having male casual/commercial sex partners (vs. no male casual/commercial sex partners, OR=7.939, 95%CI=4.369~14.427, P<0.001), CD4 count ≥350 cells/μl (vs. CD4 count <350, OR=2.123, 95%CI=1.255~3.594, P=0.005), ≥5 HIV tests before diagnosis (vs. <5 HIV tests before diagnosis, OR=2.03, 95%CI=1.042~3.954, P=0.038), and no knowledge of partner infection status (vs. knowledge of partner infection status, OR=1.709, 95%CI=1.015~2.877, P=0.044). 3.3 Evaluation and validation of the model: The chi-square value of the H-L test was 4.6 (P=0.799 > 0.05), indicating good model fit; the C-statistic of the model was 0.763, indicating good discrimination, and the AUC of the risk model was 0.763. Different probability values of the HIV transmission risk model for the MSM population can therefore represent different risk levels. The frequencies with which the four variables of the prediction model were selected in the 1000 Bootstrap resamples were 90.3%, 61.5%, 60.7% and 54.1%, respectively, all greater than 50%, indicating that all four variables were reliable. Conclusion: 1. The three main prevalent HIV-1 subtypes in Shenyang are CRF01_AE, CRF07_BC and B, and multiple CRFs and recombinant strains are also present. 2. Sensitivity analysis was performed to derive the optimal genetic distance thresholds for CRF01_AE, CRF07_BC and subtype B and to
construct a molecular transmission network. 3. The risk factors included in the MSM transmission risk prediction model were: having male casual/commercial sex partners, CD4 count ≥350 cells/μl, ≥5 HIV tests before the confirmed positive diagnosis, and not knowing the infection status of sex partners. 4. The HIV molecular transmission network-guided MSM transmission risk model has good predictive power. |
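
The network indicators described in the abstract (Links, degree, molecular clusters and the cluster formation rate) can be illustrated with a minimal Python sketch. This is not the HIV-TRACE implementation: HIV-TRACE computes pairwise genetic distances from the aligned sequences itself, whereas here the distances are supplied directly as made-up toy values. The sequence IDs, the distance values and the helper names `build_network` and `find_clusters` are all hypothetical; only the 0.007 threshold and the 861/2032 figures come from the abstract.

```python
from collections import defaultdict


def build_network(distances, threshold):
    """Create a Link between two sequences when their distance is <= threshold."""
    adjacency = defaultdict(set)
    for (a, b), dist in distances.items():
        if dist <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)  # only sequences with at least one Link appear


def find_clusters(adjacency):
    """Return the connected components of the network (molecular clusters)."""
    seen, clusters = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical toy data: six sequences and a few pairwise distances
# (substitutions/site). Pairs not listed are assumed to be far apart.
sequence_ids = ["S1", "S2", "S3", "S4", "S5", "S6"]
toy_distances = {
    ("S1", "S2"): 0.004,
    ("S2", "S3"): 0.006,
    ("S1", "S3"): 0.010,  # above the threshold, so no direct Link
    ("S4", "S5"): 0.005,
    ("S5", "S6"): 0.020,  # above the threshold, so S6 stays unclustered
}

network = build_network(toy_distances, threshold=0.007)
clusters = find_clusters(network)

# Degree: number of Links each sequence has (0 if it is not in the network).
degree = {s: len(network.get(s, ())) for s in sequence_ids}

# Cluster formation rate: clustered sequences / all sequences.
clustered = sum(len(c) for c in clusters)
formation_rate = clustered / len(sequence_ids)

print(degree)
print(f"{len(clusters)} clusters, formation rate {formation_rate:.1%}")

# The same ratio applied to the figures reported in the abstract.
reported_rate = round(100 * 861 / 2032, 1)
print(f"reported cluster formation rate: {reported_rate}%")
```

In the toy data, S1 and S3 end up in the same cluster through S2 even though they are not directly linked, which is why cluster membership and degree carry different information about an individual.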