Genomic Structural Variants Fragments Detection Algorithm Based On Sequence Alignment Skeleton

Posted on: 2020-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: J H Su
GTID: 2370330590974462
Subject: Computer science and technology

Abstract/Summary:
Genomic structural variation is a class of genetic variation in a genome that comprises multiple variant types. It affects traits of organisms such as phenotypic characteristics and disease development. Because of the limitations of genome sequencing technology and the large number of repetitive regions in the genome, detecting and analyzing genomic structural variation remains a difficult task. Third-generation sequencing technology can produce reads with an average length of 10 kbp. These reads can span structural variation regions, so they can be used to detect structural variation in the genome. Genome sequencing analysis is increasingly becoming a necessary technology for achieving precision medicine and promoting human health, and detecting genomic variation in sequencing data has become a popular topic in bioinformatics research.

To advance research on genomic structural variation and on the analysis of third-generation sequencing data, this thesis surveys the existing methods for analyzing third-generation sequencing reads and the structural variation detection algorithms used in the read-processing pipeline. We point out the problems in current algorithms and propose a structural variant fragment detection algorithm based on a sequence alignment skeleton. The main research results are as follows:

(1) By analyzing the read-processing pipeline, we find that current structural variation detection must first complete sequence alignment and then analyze the alignment breakpoints in the alignment result to detect variation. To remove this dependency, this thesis proposes a method that bypasses the complete sequence alignment process, analyzes breakpoints directly from the sequence data, and then analyzes structural variation.
(2) We establish a de Bruijn mapping index from the sequencing data for mapping subsequences. A directed acyclic graph is built for the seeding process. We apply a sparse dynamic programming algorithm on the graph to detect the linear relationships between seeds and thereby establish the initial skeleton of the sequence alignment.
(3) We design a sequence block extension algorithm to obtain the sequence alignment skeleton, and treat the gaps between skeleton blocks as sequence alignment breakpoints for detecting genomic structural variation.
(4) We test the algorithm on real human third-generation sequencing data. The experiments show that the proposed algorithm performs well on these data.

In this thesis, we propose a structural variant fragment detection algorithm based on a sequence alignment skeleton, which can detect genomic structural variation fragments without first obtaining sequence alignment information. The proposed algorithm has a speed advantage when performing genomic structural variation detection. These results can provide insight for other genomic structural variation detection methods and for other sequence analysis processes.
Keywords/Search Tags: Genomic structural variation, third-generation sequencing data, sequence alignment skeleton, de Bruijn graph index, sparse dynamic programming algorithm
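
The pipeline outlined in points (2) and (3) of the abstract (index, seeds, chaining on an acyclic graph, gaps between skeleton blocks as breakpoints) can be illustrated with a small Python sketch. This is not the thesis implementation. It rests on several assumptions: a plain hash-based k-mer index built over a reference string stands in for the de Bruijn mapping index, a quadratic-time dynamic program stands in for sparse dynamic programming, merging seeds that share a diagonal stands in for the block extension algorithm, and only insertion-like and deletion-like junctions are labeled. All function names and the `jump_penalty` parameter are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def find_seeds(read, index, k):
    """Return exact k-mer matches as sorted (read_pos, ref_pos) pairs."""
    seeds = []
    for q in range(len(read) - k + 1):
        for r in index.get(read[q:q + k], ()):
            seeds.append((q, r))
    return sorted(seeds)


def chain_seeds(seeds, k, jump_penalty=2):
    """Choose the best colinear chain of seeds.

    Seed i may precede seed j only if both coordinates strictly increase,
    so the implied graph is acyclic. Assumption: a simple O(n^2) DP with a
    flat penalty for changing diagonal, not sparse dynamic programming.
    """
    if not seeds:
        return []
    score = [k] * len(seeds)   # a chain consisting of one seed covers k bases
    prev = [-1] * len(seeds)
    for j, (qj, rj) in enumerate(seeds):
        # Scan nearest predecessors first so ties keep adjacent seeds.
        for i in range(j - 1, -1, -1):
            qi, ri = seeds[i]
            dq, dr = qj - qi, rj - ri
            if dq <= 0 or dr <= 0:
                continue
            gain = min(dq, dr, k)          # newly covered bases
            if dq != dr:                   # the seeds lie on different diagonals
                gain -= jump_penalty
            if score[i] + gain > score[j]:
                score[j] = score[i] + gain
                prev[j] = i
    j = max(range(len(seeds)), key=score.__getitem__)
    chain = []
    while j != -1:
        chain.append(seeds[j])
        j = prev[j]
    return chain[::-1]


def skeleton_blocks(chain, k):
    """Merge consecutive chain seeds that share a diagonal (ref_pos - read_pos)."""
    blocks = []
    for q, r in chain:
        if blocks and blocks[-1]["diag"] == r - q:
            blocks[-1]["read_end"] = q + k
        else:
            blocks.append({"diag": r - q, "read_start": q, "read_end": q + k})
    return blocks


def detect_breakpoints(reference, read, k):
    """Report each junction between skeleton blocks as a candidate breakpoint.

    Each result is (left_block_read_end, right_block_read_start, kind, size).
    """
    index = build_kmer_index(reference, k)
    blocks = skeleton_blocks(chain_seeds(find_seeds(read, index, k), k), k)
    found = []
    for left, right in zip(blocks, blocks[1:]):
        shift = right["diag"] - left["diag"]
        # A positive shift means the reference advanced further than the read.
        kind = "deletion" if shift > 0 else "insertion"
        found.append((left["read_end"], right["read_start"], kind, abs(shift)))
    return found
```

In this sketch no base-level alignment is ever computed: a breakpoint is inferred purely from a change of diagonal between neighbouring blocks, which mirrors the idea of detecting variation from the skeleton alone. Repetitive sequence, sequencing errors, and other variant types such as inversions would need handling that is not shown here.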