| Objective: Cardiovascular disease (CVD) is the leading cause of death worldwide and a major cause of disability. Although CVD treatment technology has developed rapidly in China, the incidence and mortality of CVD have continued to increase over the past 30 years, and the effect of current prevention and control efforts is not encouraging. It is therefore necessary to explore CVD prevention and control programs suited to Chinese national conditions. In 2010, the American Heart Association (AHA) developed a simple seven-metric tool to measure the ideal cardiovascular health (CVH) score, intended to shift attention from reducing CVD incidence at the national level to improving cardiovascular health in the population. It comprises four health behaviors (smoking status, body mass index (BMI), physical activity, and diet) and three health factors (blood pressure, fasting blood glucose, and total cholesterol). This primary prevention approach is equally applicable in China. Identifying at-risk individuals is also critical for the primary prevention of CVD. However, prediction models based on Western populations are not suitable for Chinese populations. A prediction model for atherosclerotic cardiovascular disease (ASCVD) in China, the “China-PAR model”, was recently developed, but more data are needed to validate it in rural northeast China. Meanwhile, with the advancement of computer technology, machine learning (ML) provides a new method for individualized risk assessment: by exploring the multidimensional nonlinear relationships between potential variables, it can bring variables that traditional regression algorithms cannot accommodate (such as ECG and cardiac ultrasound indicators) into the risk prediction model. In addition, an in-depth understanding of the pathogenesis and mechanisms of CVD is helpful for its prevention and control. A growing number of studies have shown that the risk of cardiovascular
disease is related to the combination of genetic factors and environmental exposure. With the development of high-throughput technology, metabolomics has been increasingly applied in the field of cardiovascular disease, and the relationship between environmental chemical exposure and CVD has also been studied to some extent. However, many studies focus on only a few or specific categories of exogenous chemical residues, and effective markers that accurately and comprehensively reflect the influence of exogenous chemical residues are lacking. If one or a group of sensitive and specific exposure and metabolic markers that predict CVD could be found in the population, it would have a substantial impact on the primary prevention of CVD. Therefore, this study first used prospective cohort data to explore how cardiovascular health changes over time and whether these changes are related to CVD. It then developed ML-based risk prediction models (integrating demographic, behavioral, psychological, ECG, and echocardiographic variables) from the data set to predict the CVD risk of the general population in Northeast China, and compared the performance of the ML algorithms with that of the traditional Cox regression model. Finally, with cardiovascular disease as the end event, a nested case-control study was conducted to analyze the relationship of serum exogenous chemical residues and metabolites with cardiovascular disease, and to identify the exogenous chemical residue exposures and metabolite predictors associated with cardiovascular disease. Methods: Part I: A prospective cohort study within the Northeast China Rural Cardiovascular Health Study (NCRCHS) was conducted, with cardiovascular health examinations at baseline (2012/2013) and follow-up (2015/2017). Cardiovascular events were followed up, and CVH was calculated using seven metrics (smoking, body mass index, total cholesterol, blood glucose, blood pressure, physical activity, and diet), each of which was classified as
poor, moderate, or ideal. Participants meeting the ideal criteria for 0 to 2, 3 to 4, and 5 to 7 metrics were classified as having low, moderate, and high cardiovascular health, respectively. A total of 7,466 participants aged 35 to 85 years were included in the study. Hazard ratios (HRs) and 95% confidence intervals (CIs) for the three cardiovascular health categories, and for changes in CVH status between baseline and follow-up, were calculated using Cox proportional hazards models. The relationship between changes in each of the seven CVH metrics and the incidence of CVD was assessed by binary logistic regression. Part Ⅱ: A mixed dataset of 551 features was used, including 98 demographic, behavioral, and psychological features, 444 electrocardiogram (ECG) features, and 9 echocardiography (Echo) features. Seven machine learning (ML) models were trained, validated, and tested after selecting the 30 most informative features. We compared the discrimination, calibration, and net reclassification improvement (NRI) of the ML models with those of traditional ASCVD risk calculators (PCE and China-PAR). Part Ⅲ: A total of 154 cases with incident CVD were randomly selected from a prospective cohort in rural northeast China; controls without CVD were randomly selected from the same cohort and matched 1:1 by propensity score, with similar baseline characteristics and follow-up time. Targeted and non-targeted liquid chromatography-mass spectrometry (LC-MS) were used, respectively, to quantify exogenous chemical residues and to profile metabolites in 304 serum samples. Non-parametric tests, correlation analysis, and Cox regression analysis were used to explore the relationships of metabolites and exogenous chemical residues with cardiovascular disease and related metabolic pathways, and a CVD-MP score for risk prediction was constructed from the metabolites/chemical exposures with an important impact on the occurrence of cardiovascular events. Then, the receiver operating characteristic (ROC) curve was used to
evaluate the model performance. Results: Part I: Compared with low cardiovascular health at baseline, moderate cardiovascular health (HR, 0.66 [95% CI, 0.52 to 0.83]) and high cardiovascular health (HR, 0.37 [95% CI, 0.20 to 0.68]) were associated with a lower risk of CVD. During the 4 years from baseline to follow-up, 24.4% of participants had improved cardiovascular health and 18.1% had deteriorated cardiovascular health, while persistently moderate (24.9%) and persistently low (27.6%) cardiovascular health were the more prevalent patterns. In multivariate analysis, compared with the persistently low group (reference), the risk of cardiovascular disease was lower in the low-to-moderate/high group (HR, 0.71 [95% CI, 0.53 to 0.96]), the moderate-to-low group (HR, 0.55 [95% CI, 0.37 to 0.80]), the persistently moderate group (HR, 0.58 [95% CI, 0.43 to 0.78]), the moderate-to-high group (HR, 0.33 [95% CI, 0.16 to 0.68]), the high-to-low group (HR, 0.41 [95% CI, 0.20 to 0.81]), and the persistently high group (HR, 0.12 [95% CI, 0.03 to 0.50]). Part Ⅱ: The study included 9,609 participants (mean age 53.4±10.4 years, 53.7% women), of whom 431 (4.5%) developed ASCVD during a median follow-up of 4.7 years. In the test set, the ML-based artificial neural network (ANN) model outperformed PCE, China-PAR, recalibrated PCE, and recalibrated China-PAR in predicting ASCVD. Its area under the curve (AUC) was 0.800, compared with 0.777, 0.780, 0.779, and 0.779 for the other models. Its Hosmer-Lemeshow χ2 was also lower, at 9.1, compared with 37.3, 67.6, 126.6, and 18.6 for the other models. At a threshold of 5%, the net benefit of the ANN model was 0.017, compared with 0.016, 0.013, 0.017, and 0.016 for the other models. In addition, the NRI of the ML-based ANN model was 0.089, while the NRIs of the PCE, China-PAR, and recalibrated PCE were 0.355, 0.098, and 0.088, respectively. Part Ⅲ: Metabolome analysis included 311 identified metabolites, with a total of 3,754 characteristic peaks detected. Multivariate analysis found 16 metabolites that contributed to the grouping. By
nonparametric test, 28 metabolites with significant differences between the two groups were found. These 43 important metabolites were related to α-linolenic acid metabolism, glycerophospholipid metabolism, and unsaturated fatty acid metabolism. Two metabolites, PC34:4 and PCO-36:4, which have important effects on the occurrence of cardiovascular events, were selected through Cox analysis, and the cardiovascular disease metabolite panel (CVD-MP) score was constructed from them for risk prediction. In the exposure analysis, 13 substances were detected at high frequency, mainly perfluorinated compounds. Cox analysis identified perfluorooctanesulfonate (PFOS) as a risk exposure for cardiovascular events, and a risk prediction model based on PFOS was constructed. This model performed better than both the CVD-MP model and the combined PFOS and CVD-MP model. The cardiovascular event curve of the high-risk group, as defined by the model score, was significantly higher than those of the medium-risk and low-risk groups. The 3-year and 5-year time-dependent AUCs of the PFOS model score were 0.966 and 0.879, respectively, better than those of the CVD-MP model and the combined PFOS and CVD-MP model. Conclusions: Our large-sample survey of rural Northeast China suggests that persistently ideal CVH status and initially ideal CVH are associated with a lower risk of CVD events and all-cause mortality. Improvements in CVH over time were also associated with a lower risk of CVD events and lower all-cause mortality. Compared with traditional regression-based ASCVD risk models such as PCE and China-PAR, ANN predictive models can help optimize the identification of individuals with elevated cardiovascular risk by flexibly incorporating a wider range of potential predictors. These findings may help guide clinical decision-making and ultimately aid in the prevention and management of ASCVD. A variety
of metabolites and metabolic pathways are involved in the pathogenesis of CVD. Exposure to the environmental chemical PFOS is an independent risk factor for CVD, and the CVD prediction model built on PFOS showed the best predictive performance of the models compared. |
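
The CVH grouping described in Part I of the Methods can be illustrated with a minimal Python sketch. It assumes each of the seven metrics has already been graded as poor, moderate, or ideal; the grading thresholds are not given in the abstract and are not reproduced here. The metric keys, function names, and change-pattern labels are hypothetical and chosen only for illustration; they are not the study's own code, and the abstract itself merges some patterns (for example low to moderate/high) that this sketch keeps separate.

```python
from typing import Dict

# The seven CVH metrics named in the Methods (key names are illustrative).
METRICS = (
    "smoking", "bmi", "total_cholesterol", "blood_glucose",
    "blood_pressure", "physical_activity", "diet",
)


def cvh_category(levels: Dict[str, str]) -> str:
    """Map per-metric grades ('poor'/'moderate'/'ideal') to a CVH category.

    0 to 2 ideal metrics -> 'low', 3 to 4 -> 'moderate', 5 to 7 -> 'high',
    following the cut-offs stated in the abstract.
    """
    missing = [m for m in METRICS if m not in levels]
    if missing:
        raise ValueError(f"missing metrics: {missing}")
    n_ideal = sum(levels[m] == "ideal" for m in METRICS)
    if n_ideal <= 2:
        return "low"
    if n_ideal <= 4:
        return "moderate"
    return "high"


def cvh_change(baseline: Dict[str, str], follow_up: Dict[str, str]) -> str:
    """Label the baseline-to-follow-up pattern (labels are hypothetical)."""
    before = cvh_category(baseline)
    after = cvh_category(follow_up)
    if before == after:
        return f"persistently {before}"
    return f"{before}-to-{after}"


# Example: two ideal metrics at baseline, four at follow-up.
example_baseline = dict.fromkeys(METRICS, "poor")
example_baseline.update(smoking="ideal", diet="ideal")
example_follow_up = dict(example_baseline, bmi="ideal", blood_pressure="ideal")
print(cvh_change(example_baseline, example_follow_up))  # low-to-moderate
```

In a cohort analysis, a label of this kind would typically enter the Cox model as a categorical covariate with the persistently low group as the reference, as in the Results.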