
A New Pipeline For Targeted Profiling Of Short Tandem Repeats In Massively Parallel Sequencing Data

Posted on: 2021-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
Full Text: PDF
GTID: 2480306503465644
Subject: Biology
Abstract/Summary:
Short tandem repeats (STRs) are important polymorphic markers for human identification and kinship analysis in forensic science. With the continuing development of massively parallel sequencing (MPS), more laboratories are adopting this technology for forensic applications. Existing STR genotyping tools, most of which were developed for whole-genome sequencing data, are not effective for targeted MPS data. More importantly, their backward compatibility with conventional capillary electrophoresis (CE) technology has been neither evaluated nor guaranteed. In this study, we developed an end-to-end pipeline called STRsearch for STR-MPS data analysis. STRsearch not only determines alleles by counting the repeat patterns and INDELs that actually lie within the STR region, but also translates MPS results into standard STR nomenclature (numbers and letters). We evaluated the performance of STRsearch on two forensic sequencing datasets. Its concordance with CE genotypes was 75.73% and 75.75%, which is 12.32% and 9.05% higher, respectively, than that of the existing tool STRScan. Additionally, we trained a base classifier on sequence properties and used it to predict the probability of correct genotyping at a given locus, achieving a highest accuracy of 96.13%. These results demonstrate that STRsearch is a better tool for preserving backward compatibility with CE in targeted STR profiling of MPS data. STRsearch is available as open-source software at https://github.com/AnJingwd/STRsearch.
Keywords/Search Tags: short tandem repeats, massively parallel sequencing, STR genotyping, validation studies, forensic sequencing
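To make the idea of translating a sequenced STR region into a CE-style allele number concrete, here is a minimal Python sketch. It is not taken from STRsearch and does not reproduce its algorithm; the function names `ce_allele_designation` and `count_motif_units` are hypothetical, and the sketch assumes the common forensic convention that an allele is written as the number of complete repeat units, followed by a decimal point and the number of leftover bases when the last unit is incomplete.

```python
def ce_allele_designation(region_length: int, motif_length: int) -> str:
    """Return a CE-style allele label from the STR region length in bases.

    Hypothetical helper for illustration only: complete repeat units form
    the integer part, leftover bases (if any) follow the decimal point.
    """
    if motif_length <= 0:
        raise ValueError("motif_length must be positive")
    if region_length < 0:
        raise ValueError("region_length must be non-negative")
    full_units, leftover_bases = divmod(region_length, motif_length)
    if leftover_bases == 0:
        return str(full_units)
    return f"{full_units}.{leftover_bases}"


def count_motif_units(sequence: str, motif: str) -> int:
    """Count non-overlapping exact copies of the motif in the sequence."""
    if not motif:
        raise ValueError("motif must be non-empty")
    return sequence.upper().count(motif.upper())


# Illustrative input: a TH01-like region with six AATG units, a partial
# ATG unit, and three more AATG units (39 bases in total).
th01_like = "AATG" * 6 + "ATG" + "AATG" * 3
allele = ce_allele_designation(len(th01_like), motif_length=4)
units = count_motif_units(th01_like, "AATG")
print(allele, units)  # prints "9.3 9"
```

A length-based label like this only matches CE when the region boundaries are chosen the same way CE fragment sizing implies, which is why the abstract stresses counting only the repeat patterns and INDELs that actually fall inside the STR region.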