Forensic Evaluation Of Mitochondrial DNA Whole Genome Sequencing System And Research Of Complex Kinship Identification Combined With Intranuclear Molecular Markers

Posted on: 2024-06-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Chen
GTID: 1524306926479854
Subject: Forensic medicine
Abstract/Summary:
Background and aims: Mitochondrial DNA (mtDNA) has important scientific and applied value for forensic kinship identification, maternal lineage tracing, the study of population history and evolutionary inference, owing to its high copy number, circular structure that resists degradation, maternal inheritance and high mutation rate. mtDNA is mainly analyzed at four levels: mtDNA single nucleotide polymorphisms (mtDNA-SNP), the hypervariable regions (HVR), the control region and the whole mtDNA genome, of which the whole genome makes the fullest use of the maternal genetic information in mtDNA. The whole mtDNA genome sequence is mainly obtained by massively parallel sequencing. Researchers have achieved full coverage of the 16569 bp mtDNA genome with library preparation strategies based on short-fragment (<200 bp), medium-fragment (300-500 bp) and long-fragment (>4 kb) amplification, respectively. However, because sequences homologous to mtDNA exist in nuclear genomic DNA, the short- and medium-fragment strategies carry a higher probability of mismatching to nuclear DNA, which shows up as missing target mtDNA loci or false positive results.

The identification of complex kinship plays an important role in the investigation of missing persons, the identification of victims of accidents and disasters, combating the trafficking of women and children, and the tracing of relatives. At present, the main ways to improve forensic efficiency in complex kinship identification are to increase the number of genetic markers, to analyze multiple types of genetic markers in combination, and to apply genealogical analysis. However, new genetic markers are still far from practical application, and genealogical analysis requires a very large pre-existing database to search. In addition, X-chromosomal short tandem repeats (X-STR), Y-chromosomal STRs (Y-STR) and mtDNA are mainly used as qualitative tools for reaching conclusions of exclusion or support, and lack quantitative methods and algorithms for judgment. A more scientific and reasonable strategy might be to use the existing common nuclear DNA genetic markers (STR and SNP) and analyze them jointly with the extranuclear mtDNA sequence, which would make full use of the characteristics and advantages of intranuclear, extranuclear, maternal and paternal genetic markers and thereby improve the efficiency of complex kinship identification.

Methods and contents: (1) Using the long-fragment amplification strategy and the domestic BGI DNB sequencing platform, we constructed an mtDNA whole genome sequencing system and evaluated its sequencing performance and forensic application efficiency. Sequencing performance covered the integrated process of raw data quality control, mapping to the reference sequence and assembly, and calling of mtDNA variant sites. For forensic application efficiency, we evaluated the sensitivity of the detection system and its detection efficiency for mixed samples, original samples of different tissues and body fluids, degraded samples (natural and physical degradation) and family samples. In addition, we explored the maternal genetic structure and genetic background of the Han population using the mtDNA whole genome sequences obtained in this study. (2) Based on the sequence data of various types of genetic markers obtained from the domestic DNB sequencing platform, we analyzed the sequencing performance of each type of marker on the corresponding platform. We analyzed the genetic polymorphisms of 54 A-STRs, 27 X-STRs, 48 Y-STRs, 214 SNPs and 3 HVRs of mtDNA, and evaluated in depth the forensic parameters and paternity identification performance of these markers in terms of both length and sequence polymorphism. (3) Based on the sequence and length polymorphism information obtained for STR, SNP and mtDNA in the Han population, we simulated multi-marker genetic data for family pairs covering 32 different genetic relationships of the first, second and third degree. We then applied discriminant analysis, the likelihood ratio (LR) and a random forest model to explore how well different degrees of kinship, and different relationships within the same degree, could be identified when the number of genetic markers was increased and when different types of genetic markers were combined.

Results and conclusions: (1) For the positive control DNA sample and blood samples on FTA cards, the average base quality values Q20 and Q30 of the sequences obtained by the forensic mtDNA whole genome detection system were 97.35% and 92.41%, respectively. The average proportions aligned to the reference mtDNA sequence and to nuclear DNA sequence were 99.47% and 0.53%, respectively. Coverage of the whole mtDNA genome sequence and of all 37 functional regions was 100%. The average base-calling accuracy for each of the four bases A, T, C and G was greater than 99.78%. The system could identify heteroplasmy at levels above 10%. Reliable and accurate results were obtained with an input of 32.25 pg of genomic DNA (gDNA), and mixed samples, degraded samples (blood samples stored on FTA cards for 11 years), samples of different tissues and body fluids, and family samples could also be detected well. In short, the forensic mtDNA whole genome detection system showed very good forensic performance, with high data quality, high sensitivity and accurate results, and is suitable for degraded samples and maternal lineage tracing. (2) A total of 984 length polymorphism alleles and 1540 sequence polymorphism alleles were observed at 129 STR loci; autosomal STRs (A-STR) showed the highest sequence polymorphism, followed by X-STRs and Y-STRs. For the 54 A-STRs, the combined power of exclusion (CPE) of the first 9 loci based on sequence polymorphism could reach 0.0000000001, equivalent to the performance of the first 16 loci based on length polymorphism. Acquiring sequence information can therefore significantly improve the efficiency of individual discrimination and paternity identification. The joint analysis of multiple types of genetic markers based on sequence polymorphism might have strong potential for complex kinship identification. (3) Acquiring sequence polymorphism information does improve a genetic system's efficiency in complex kinship identification. The true positive rates of the 245 autosomal marker (A-marker) system for identifying first-, second- and third-degree kinship were 99.5%, 82.5% and 68.8% with length polymorphism, and 99.6%, 86.4% and 68.5% with sequence polymorphism, respectively. When the number of genetic markers increased from 20 to 245, the identification efficiency for first-, second- and third-degree kinship increased from 0.855 to 0.9948, from 0.8303 to 0.8833, and from 0.4582 to 0.7773, respectively. Combining multiple types of genetic markers can significantly improve the efficiency of testing different degrees of kinship and of distinguishing different relatives of the same degree. The 245 A-markers had identification accuracy rates of 100% (LR threshold: 10000), 86.33% (LR threshold: 10000) and 39.75% (LR threshold: 100) for first-, second- and third-degree kinship, respectively. With the combined analysis of mtDNA, X-STR and Y-STR, 91.45% of first-degree, 85.97% of second-degree and 60.40% of third-degree relatives could be distinguished from other relatives of the same degree.
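The LR thresholds quoted above (10000 and 100) can be illustrated with a minimal sketch. It assumes the standard product rule for independent loci, in which per-locus likelihood ratios are multiplied (here summed as logarithms) and the combined value is compared with a threshold. The function names and the per-locus values are hypothetical and are not taken from the dissertation, whose actual computation is not described in this abstract.

```python
import math


def combined_log10_lr(per_locus_lrs):
    """Return log10 of the combined LR, assuming independent loci.

    Summing log10 values is the product rule in log space and avoids
    overflow or underflow when hundreds of markers are combined.
    """
    if any(lr <= 0 for lr in per_locus_lrs):
        # An LR of zero (a strict exclusion at one locus) needs separate
        # handling, e.g. a mutation model; it is out of scope here.
        raise ValueError("per-locus LRs must be positive")
    return sum(math.log10(lr) for lr in per_locus_lrs)


def supports_kinship(per_locus_lrs, threshold=10000.0):
    """True if the combined LR reaches the chosen decision threshold."""
    return combined_log10_lr(per_locus_lrs) >= math.log10(threshold)


# Hypothetical per-locus LRs for one simulated pair (illustrative only).
example_lrs = [12.0, 8.5, 0.9, 15.0, 20.0]
decision = supports_kinship(example_lrs, threshold=10000.0)
```

A lower threshold, such as the value of 100 quoted for third-degree kinship, accepts weaker evidence: the expected LR for distant relatives is smaller, so a strict threshold would reject most true pairs.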
Keywords/Search Tags: mtDNA whole genome, Massively parallel sequencing, Heteroplasmy analysis, Degraded sample detection, Sequence-based polymorphism, Complex kinship testing