
DNA Multiple Sequence Alignment And Similarity Analysis Based On Pattern Matching

Posted on: 2012-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2230330395984922
Subject: Computer technology
Abstract/Summary:
With the smooth progress of the Human Genome Project (HGP) and the rapid development of information technology, a vast amount of molecular sequence data has become available. Analyzing and processing these sequence data effectively, so that they can contribute to the diagnosis and treatment of human diseases, the prevention of major epidemics, and the development of new medicines, has become a topic of broad public interest and an important research subject in bioinformatics. Bioinformatics is a new interdisciplinary science. How to align gene sequences quickly and effectively, and how to conduct similarity analysis and evolutionary relationship analysis on the basis of the alignment, is one of the most active questions in the field.

The major tasks of this thesis are to propose a new multiple sequence alignment algorithm, a DNA multiple sequence alignment method based on pattern matching, and to conduct similarity analysis of gene sequences on the basis of this algorithm. The specific work is summarized as follows:

Multiple sequence alignment is a basic problem in bioinformatics. Building on the theory of pattern matching and the Aho-Corasick search algorithm, this thesis analyzes in depth the DNA multiple sequence alignment algorithm based on the keyword tree and proposes a new DNA multiple sequence alignment algorithm based on pattern matching. The algorithm is evaluated experimentally and compared with the center star alignment algorithm and the keyword-tree-based DNA multiple sequence alignment algorithm. When sequence similarity is relatively low, its alignment results are better than those of the keyword-tree-based algorithm, at the cost of slightly more running time.
When highly similar sequences are aligned, its time complexity is also better than that of the other two methods. The experimental results confirm the effectiveness of the algorithm.

Sequence similarity analysis is also one of the basic problems in bioinformatics. Its results are widely used in species classification, the prediction of structure and function, and the analysis of species evolution. This thesis applies the pattern matching method to sequence similarity analysis and uses the pattern-matching-based sequence alignment algorithm to align the sequences. The alignment results are then used to construct an evolutionary tree with the Kimura two-parameter model and the Neighbor-joining method. The experimental results show that the algorithm produces results close to the known facts.
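To illustrate the distance step that precedes tree construction, here is a minimal sketch of the Kimura two-parameter distance between two sequences that have already been aligned. It uses the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. The function name `k2p_distance` and the decision to skip columns containing gaps or ambiguous bases are assumptions made for this example; the abstract does not say how the thesis handles them.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: ignore columns with a gap or an ambiguous base.
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        # A transition stays within purines (A<->G) or pyrimidines (C<->T);
        # any other substitution is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites    # proportion of transition sites
    q = transversions / sites  # proportion of transversion sites
    a = 1 - 2 * p - q
    b = 1 - 2 * q
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(a) - 0.25 * math.log(b)
```

Computing this value for every pair of rows in the multiple alignment gives the distance matrix that a Neighbor-joining implementation takes as input.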
Keywords/Search Tags:Bioinformatics, sequence alignment, center star method, pattern matching, similarity analysis