| Background: Forensic DNA analysis is scientific and objective, and it occupies an important position in forensic science. At present, the common method for forensic DNA analysis is the detection of STR genetic markers on the CE platform, which offers high sensitivity, high specificity and high resolution. However, STR typing cannot be used for personal characterization and remains limited in analyzing semen-containing mixture samples. DNA methylation is the most stable and widely studied epigenetic marker and is increasingly valued by forensic scientists. The methylation status of some CpG sites in the human genome is affected by various internal and external environmental factors, such as age, nutrition, smoking, drinking, early-life social environment, physical activity, air pollution, drugs and poisons. In addition, different cell types have cell type-specific DNA methylation patterns, mainly related to cell type-specific functions. Therefore, DNA methylation may provide a variety of information about forensic biological samples, such as age, tissue origin, smoking status, alcohol consumption, ancestry, BMI and DNA authenticity, and may thus become a new forensic genetic marker that compensates for the shortcomings of STR typing technology. Objective: To compensate for the shortcomings of STR typing technology in personal characterization, this study aims to use Ms-SNuPE technology to detect the methylation levels of CpG sites closely related to smoking in blood samples and to construct a smoking status prediction model for the Chinese population, which aids donor characterization of unknown blood samples. Meanwhile, to compensate for the shortcomings of STR typing technology in analyzing semen-containing mixture samples, this study aims to combine semen-specific CpG sites with nearby tightly linked microhaplotype sites to define a novel composite genetic marker (methylation-microhaplotype) and to detect it using MPS technology, which can 
help identify the presence of semen and obtain the genotypes of semen-specific DNA in semen-containing mixture samples. Methods: 1. Application of DNA methylation genetic markers in smoking status prediction: Blood samples were collected from 204 unrelated Han Chinese individuals, and CpG sites closely related to smoking were selected according to the corresponding screening criteria. The methylation levels of the candidate CpG sites in the blood samples were then detected using Ms-SNuPE technology. Finally, based on the detection results, the correlation between the methylation levels of the candidate CpG sites and smoking status was analyzed, and a variety of smoking status prediction models were constructed and compared to determine the most suitable model for the Chinese population. 2. Application of DNA methylation genetic markers in mixture analysis: A novel composite genetic marker (methylation-microhaplotype) was defined by combining semen-specific CpG sites that met the screening criteria with nearby tightly linked microhaplotype sites, and it was detected using MPS technology. The body fluid specificity of the candidate methylation-microhaplotype loci was then investigated in semen, vaginal fluid, blood and saliva samples, and their genetic parameters were investigated in 61 unrelated individuals. Finally, the candidate methylation-microhaplotype loci were analyzed in semen-containing mixture samples with different mixing ratios to confirm their ability to identify the presence of semen and obtain the genotypes of semen-specific DNA. Results: 1. Application of DNA methylation genetic markers in smoking status prediction: Nine CpG sites closely related to smoking were selected according to the corresponding screening criteria, and the methylation levels of the candidate CpG sites in blood samples were successfully detected by Ms-SNuPE technology. However, SBE 
primers at sites cg12803068 and cg21566642 contained other CpG sites that may introduce bias. Therefore, only the other 7 CpG sites were included in the subsequent statistical analysis. When an LR model was constructed for each of the seven CpG sites, the model based on site cg05575921 showed the best smoking status prediction ability among the seven. The combined MLR model and the stepwise MLR model, both built on combinations of CpG sites, performed well in predicting smoking status and outperformed each of the single-site LR models. However, the accuracy, specificity and AUC of the stepwise MLR model in the testing dataset were slightly higher than those of the combined MLR model, and the stepwise MLR model required information from fewer sites. Therefore, the stepwise MLR model based on two significant CpG sites was more suitable for predicting smoking status in the Chinese population (AUC up to 0.86). 2. Application of DNA methylation genetic markers in mixture analysis: Twenty methylation-microhaplotype loci were selected based on the screening criteria. Except for loci MMH01ZHA010, MMH01ZHA011, MMH02ZHA006, MMH03ZHA001, MMH04ZHA008, MMH10ZHA006, MMH11ZHA005 and MMH15ZHA001, the other 12 methylation-microhaplotype loci were successfully amplified and sequenced using MPS technology in 61 unrelated individual samples, 2 replicate samples and 13 semen-containing mixture samples. Of the 12 successfully detected loci, 2 were semen-specific hypermethylated loci and the other 10 were semen-specific hypomethylated loci; all showed good body fluid specificity and were helpful for body fluid identification. These 12 loci had an average effective number of alleles (Ae) of 2.96, indicating good polymorphism that is helpful for individual 
identification. Finally, analysis of semen-containing mixture samples with different mixing ratios showed that the 12 semen-specific hypermethylated or hypomethylated loci performed well both in identifying the presence of semen and in obtaining the genotypes of semen-specific DNA in semen-containing mixture samples. In addition, these 12 loci can be used to identify the presence of semen and simultaneously obtain the genotypes of semen-specific DNA when the two donors of a semen-containing mixture are unknown. However, when interpreting the results of semen-containing mixture samples, attention must be paid to factors such as allele dropout and the choice of a reasonable threshold. Conclusion: Based on smoking-related CpG sites and Ms-SNuPE technology, this study constructs a smoking status prediction method for the Chinese population that is simple, economical and applicable in conventional forensic science laboratories. The method can assess the smoking status associated with unknown blood samples, which aids donor characterization and thereby compensates for the shortcomings of STR typing technology in personal characterization. In addition, a novel composite genetic marker (methylation-microhaplotype) is defined based on semen-specific CpG sites and nearby tightly linked microhaplotype sites and is detected using MPS technology. It helps identify the presence of semen and obtain the genotypes of semen-specific DNA in semen-containing mixture samples that contain only a small amount of semen or lack intact sperm cells, thus compensating for the shortcomings of STR typing technology in analyzing semen-containing mixture samples. |
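
The abstract reports an average Ae of 2.96 but does not say how it was computed. As a minimal sketch, assuming the conventional definition of the effective number of alleles, Ae = 1 / Σ p_i², where p_i is the population frequency of the i-th haplotype at a locus, the calculation might look like the following. The function name and the haplotype strings are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter


def effective_number_of_alleles(haplotypes):
    """Return Ae = 1 / sum(p_i^2) for a list of observed haplotype calls.

    `haplotypes` holds one entry per observed chromosome at a single locus.
    This is an illustrative helper, not the study's own pipeline.
    """
    if not haplotypes:
        raise ValueError("need at least one observed haplotype")
    counts = Counter(haplotypes)
    total = sum(counts.values())
    # Sum of squared haplotype frequencies (expected homozygosity).
    homozygosity = sum((n / total) ** 2 for n in counts.values())
    return 1.0 / homozygosity


# Hypothetical haplotype calls at one locus (frequencies 0.5, 0.25, 0.25).
example_calls = ["AG", "AG", "AT", "CT"]
ae = effective_number_of_alleles(example_calls)
print(f"Ae = {ae:.3f}")
```

Under this definition, Ae equals the number of distinct haplotypes only when all of them are equally frequent, so a locus with many rare haplotypes can still have a low Ae.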