| Objective: To evaluate the multi-dimensional factors affecting physical activity-related injury (PARI) among college students by measuring psychological, environmental and molecular biological conditions; to analyze the correlation between SNPs, gut microbiota and multiple PARIs; to explore the molecular mechanisms of multiple PARIs; to identify the key gut microbiota and their associated pathways; to select potential intestinal flora biomarkers for multiple PARIs; and to establish multi-level prediction models for PARI, thereby providing scientific evidence for early screening of and intervention for college students at risk of PARI. Methods: The Delphi method and the analytic hierarchy process were used to construct a physical activity environment assessment system and to assess the physical activity environment in the universities. Using cluster random sampling, college students who enrolled from 2017 to 2019 in Shantou and Linfen were selected to identify the prevalence of PARI through a cross-sectional survey. A nested case-control study of the biological, psychological, environmental and social multi-dimensional influencing factors was conducted among the college students who enrolled in 2019 and participated in the cross-sectional study; a one-year follow-up investigation was used to screen participants for a 1:1 paired case-control study, in which the participants' blood and gut microbiota samples were tested. Candidate SNPs were detected in the biological samples using the MassARRAY SNP typing assay, and the composition of the intestinal flora in the faecal samples was determined using 16S rDNA amplicon sequencing. Quantitative and qualitative data were described using means (standard deviations) and frequencies (percentages), respectively, and differences between groups were tested with independent two-sample t-tests and chi-square tests, respectively. Hardy-Weinberg equilibrium for genotype and allele
frequency distribution was analysed using the chi-square test. The relationship between alleles, genotypes (including their dominant and recessive models) and multiple PARIs was explored with logistic regression models. In addition, the generalized multifactor dimensionality reduction (GMDR) method was used to analyze interactions between factors. The role of gut microbiota in PARIs was investigated by searching for differential flora and their associated pathways through α-diversity analysis, β-diversity analysis, taxonomic composition analysis, LEfSe intergroup difference analysis and PICRUSt prediction of microbial metabolic function, and the interaction between gut microbiota and other factors was analysed in conjunction with the epidemiological survey data. Combined with the earlier survey data, Lasso feature selection was used to screen variables, and logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost) and artificial neural network models were used to construct hierarchical prediction models of PARI for college students. EpiData Entry was used for double entry, consistency testing and logical error detection; SPSS 23.0, Yaanp 2.5, PLINK 1.07, Haploview 4.0, FLASH 1.2.7, Uparse 7.0, Qiime 1.9.1, GMDR 0.7, R 4.0.2 and Python 3.7 were used for statistical analysis. Two-tailed P < 0.05 was considered statistically significant. Results: 1. Among the main factors of the physical activity environment on college campuses, the material (physical) environment had the highest weight (0.466), followed by the campus cultural environment (0.233), the social-psychological environment (0.172) and the less important natural ecological environment (0.129). In the physical environment, the site facility safety index had the highest weight (0.190). In the natural ecological environment, climate conditions had the highest weight (0.040). In the campus cultural environment, the highest weight (educational method and content index)
was 0.043. In the social-psychological environment, campus psychology had the highest weight (0.035). 2. The cross-sectional study showed that the incidence of PARI among college students was 22.5% (937/4161), of whom 21.7% had multiple PARIs. The incidence of PARI among college students in Shantou (29.8%, 506/1698) was higher than in Linfen (17.5%, 431/2463). The incidence of PARI was higher in boys (26.1%, 423/1622) than in girls (20.3%, 514/2539). Students in the Shantou area (OR = 2.952), boys (OR = 1.304), freshmen (OR = 1.147), sports team members (OR = 3.143) and those with chronic diseases/symptoms (OR = 1.541) had a higher rate of PARI. Insufficient sleep (OR = 1.418-1.425) and prolonged use of electronic devices (OR = 1.325) also increased the risk of PARI. The nested case-control study found that a total of 388 students (male: 228/640; female: 160/704) had experienced PARI in the past twelve months, with a cumulative 528 occurrences (male: 302; female: 226). The risk of PARI was 0.39 episodes/person/year and was higher among boys than among girls (0.47 vs. 0.32 episodes/person/year; P < 0.001). Being male (OR = 1.346), sports team membership (OR = 2.455), antibiotic use (OR = 1.848), regular yogurt consumption (OR = 0.579) and irritable bowel status (OR = 1.485) were also associated with the occurrence of PARI, and those with negative PA behaviour (OR = 1.470) and stimulus seeking (OR = 1.462) were more likely to experience PARI. 3. rs925946 of the BDNF gene and rs4355801 of the OPG gene were associated with the occurrence of multiple PARIs. rs925946-GG, rs4355801-GA and rs4355801-GG reduced the risk of multiple PARIs (OR = 0.128, 0.420 and 0.142, respectively). Haplotype analysis of three loci of the COL5A1 gene (rs10858286, rs12722 and rs3128575) showed a higher risk of multiple PARIs for the T-T-C haplotype (OR = 2.064, P = 0.017). rs12722 within the COL5A1 gene interacted with rs4818 within the COMT gene. When the genotype of rs4818 was TC and the genotype of rs12722 was CC, the risk of multiple PARIs was
the highest (OR = 7.500, P = 0.049). Sports team membership and negative PA behaviors interacted with different gene loci and affected the occurrence of multiple PARIs. 4. There were no significant differences in species richness or diversity of gut microbiota between the PARIs and healthy control groups, whereas the differences in species structure and composition were greater between groups than within groups (R = 0.025, P = 0.024). At the phylum level, the phylum Proteus was significantly enriched in the PARIs group compared with healthy controls (35.0% vs 25.7%, P < 0.05), whereas Proteobacteria were enriched in the healthy control group (30.4% vs 20.4%, P < 0.05). At the genus level, the dominant genera Escherichia-Shigella, Pseudomonas, Megamonas, Citrobacter and Comamonas were enriched in the faeces of healthy controls, while Bacteroides, Prevotella_9, Bifidobacterium, Prevotella and Faecalibacterium were enriched in the faeces of the PARIs group. LEfSe analysis revealed that Bacteroidetes and Parabacteroides were significantly more abundant in the PARIs group, while Megamonas was significantly more abundant in the healthy controls. 5. In the construction of the primary prediction model for PARI in college students, the prediction accuracy of the five models differed little. The AUC of the LR model (AUC = 0.670) was higher than that of the SVM and DNN models and did not differ significantly from that of the RF and XGBoost models; the sensitivity of the LR model (0.628) was also comparatively the best. In the intermediate prediction model construction, considering the sensitivity and positive predictive values of the four models together, the LR model was the most balanced and was selected as the best model (AUC = 0.67). In the advanced prediction model construction, the DNN model had the highest accuracy (0.711), but the sensitivity and positive predictive values of this model
were the lowest. The XGBoost model had the highest AUC (0.679) and sensitivity (0.846) and was chosen as the best model. Among the personalized prediction models, the random forest model showed good predictive performance. Using different levels of prediction model according to the screening setting can improve the efficiency of predicting the risk of PARI among college students, thus identifying the high-risk group at an early stage and providing a scientific basis for the timely formulation of effective preventive measures. Conclusions: 1. A physical activity environment assessment system for colleges was constructed, containing 4 first-level indicators and 27 second-level indicators that cover four aspects: physical environment, natural ecological environment, campus culture and psychosocial environment. It can be used to evaluate the safety level of physical activity in colleges, calculate the overall score and the scores of indicators at each level, and quantitatively identify weak links in the physical activity environment of colleges. 2. The incidence of PARI in college students was relatively high and was affected by biological, psychological, social and environmental factors. Targeted, multi-faceted intervention strategies should be adopted to prevent and reduce the occurrence of PARI. 3. rs925946 of the BDNF gene and rs4355801 of the OPG gene could affect the risk of multiple PARIs, and gene-gene and gene-environment interactions could affect the occurrence of multiple PARIs. 4. The composition of the gut microbiota of college students in the PARIs group differed from that in the healthy control group. The increased abundance of Bacteroidetes and Parabacteroides and the decreased abundance of Megamonas may be hallmark microbiota characteristics of PARIs. 5. The PARI prediction models constructed with machine learning had
predictive ability. Among the primary and intermediate prediction models, the logistic regression model predicted the occurrence of PARIs among college students better; among the advanced prediction models, the XGBoost model predicted better; and among the personalized prediction models, the random forest model showed good predictive performance. |
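
The headline rates in the Results can be reproduced from the counts given in the abstract. The sketch below is illustrative only: the helper names `pct` and `rate` are hypothetical, and it assumes that the follow-up rate is the number of PARI occurrences divided by the number of students followed for one year (640 males and 704 females, as reported). It also checks that the four first-level AHP weights sum to 1.

```python
# Counts are copied from the Results; helper names are illustrative only.

def pct(cases, total):
    """Proportion of cases as a percentage, rounded to one decimal place."""
    return round(100 * cases / total, 1)


def rate(events, persons):
    """Events per person over the one-year follow-up, rounded to two decimals.

    Assumption: every student contributes exactly one year of follow-up.
    """
    return round(events / persons, 2)


# Cross-sectional survey: students reporting PARI / students surveyed.
cross_sectional = {
    "overall": pct(937, 4161),
    "Shantou": pct(506, 1698),
    "Linfen": pct(431, 2463),
}

# Nested case-control follow-up: cumulative PARI occurrences / students.
follow_up = {
    "overall": rate(302 + 226, 640 + 704),
    "male": rate(302, 640),
    "female": rate(226, 704),
}

# First-level AHP weights: physical, campus cultural,
# social-psychological and natural ecological environment.
ahp_weights = [0.466, 0.233, 0.172, 0.129]
weights_total = round(sum(ahp_weights), 3)

print(cross_sectional)
print(follow_up)
print(weights_total)
```

Under these assumptions the computed values agree with the reported 22.5%, 29.8% and 17.5% incidences and with the 0.39, 0.47 and 0.32 episodes/person/year rates.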