Study on Gene Sequence Alignment Based on Mixed Suffix Tree and Suffix Array

Posted on: 2016-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Jiao
Full Text: PDF
GTID: 2180330464963996
Subject: Software engineering
Abstract/Summary:
With the rapid development of gene sequencing technology, sequence alignment has become the most basic and most important method for analyzing gene data. This thesis studies the commonly used alignment algorithms and software and compares them by the types of sequences they handle, their features, and their functions. The key issue is the data structure used to organize gene data effectively; the useful structures are the hash table, the suffix tree, and the suffix array. Building on this study of existing sequence alignment algorithms, the thesis proposes a new sequence alignment algorithm named BWL. BWL has two main features:

(1) BWL uses a new data structure that combines suffix trees and suffix arrays. The new structure reduces the memory footprint, which addresses the problem that ever-larger gene data files are limited by available memory. It also reduces the frequency of accesses to external memory, which improves running speed.

(2) BWL introduces the spaced seed model to increase sensitivity. For inexactly matched sequences, inserting spaces into the sequence is faster than the standard dynamic programming method and than handling indels bit by bit, and the effectiveness of the algorithm improves noticeably.

Finally, the thesis designs simulation experiments on sequence alignment and analyzes the results in detail. The experimental results show that the algorithm based on mixed suffix trees and suffix arrays occupies a fixed amount of memory and builds its index faster. In addition, the proposed spaced seed model is valid, and the improvement in actual performance is significant.
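To make the space (spaced) seed model mentioned in the abstract more concrete, here is a minimal Python sketch of the general technique. It is not the BWL implementation, whose details the abstract does not give. The function names, the mask `1110111`, and the toy sequences are all assumptions chosen for illustration. A mask marks the positions that must match with `1` and the "don't care" positions with `0`, so a seed can still hit when a mismatch falls on a `0`.

```python
from collections import defaultdict


def seed_key(s, start, mask):
    """Return the characters of s under the '1' positions of mask, from start."""
    return "".join(s[start + i] for i, m in enumerate(mask) if m == "1")


def build_seed_index(reference, mask):
    """Map each seed key to the reference positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - len(mask) + 1):
        index[seed_key(reference, pos, mask)].append(pos)
    return index


def find_seed_hits(reference, query, mask):
    """Return (query_pos, reference_pos) pairs whose seed keys agree."""
    index = build_seed_index(reference, mask)
    hits = []
    for qpos in range(len(query) - len(mask) + 1):
        for rpos in index.get(seed_key(query, qpos, mask), []):
            hits.append((qpos, rpos))
    return hits


# Hypothetical toy data: the query differs from reference[0:8] at index 3 only.
reference = "ACGTACGTTGCA"
query = "ACGAACGT"

# Both masks require six matching positions (weight 6).
contiguous = find_seed_hits(reference, query, "111111")
spaced = find_seed_hits(reference, query, "1110111")

print(contiguous)  # [] : the mismatch breaks every contiguous 6-mer
print(spaced)      # [(0, 0)] : the mismatch falls on the don't-care position
```

With the same number of required matches, the contiguous seed misses the near-identical region while the spaced seed finds it, which is the sense in which such seeds increase sensitivity. In a full aligner each hit would then be extended and verified. That step is omitted here.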
Keywords/Search Tags: Sequence Alignment, Suffix Tree, Suffix Arrays, BWT, FM-index, Spaced Seed