
Application And Study Of Optimal Methods In Bio-Sequences Alignment

Posted on: 2011-11-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Chen
Full Text: PDF
GTID: 1100330332477636
Subject: Computer application technology
Abstract/Summary:
Biological sequence alignment is one of the most important and basic research fields in bioinformatics. It is also an important method for studying the homology between species. With the rapid increase of bio-sequence data, how to improve both efficiency and sensitivity has become an urgent problem in sequence alignment.

In this work, optimal methods are applied to improve the efficiency of sequence alignment. The main contents and contributions can be briefly summarized as follows:

1. A self-adaptive sequence alignment method, based on the Lagrange Constraint Neural Network (LCNN) and digital signal methods, is applied to sequence alignment. Based on a risk function and optimality criteria, a performance index is calculated to measure the homology of the sequences.

2. The theory of spaced seeds and the sensitivity model are studied. An optimal search method is proposed to find optimal spaced seeds with maximum probability under limited time resources. With this method, the efficiency of calculating spaced seeds can be dramatically improved.

3. The Overlap Digraph model, which depends on the structure of the spaced seed, is constructed, and a criterion for judging the quality of spaced seeds is proposed based on the overlap digraph weight function. The experimental results show that the overlap digraph model can obtain optimal or near-optimal spaced seeds in a very short time.

4. Building on the former achievements and a deeper study of in-del (insertion and deletion) seeds, mathematical definitions of in-del seeds are proposed, and a sensitivity calculation model is constructed. An overlap complexity method is proposed to calculate in-del seeds, and a "flip" function is utilized to select candidate seeds. Given the parameters of weight and similarity, this method can find the optimal seed in quite a short time. Experimental results show that in-del seeds are more sensitive than spaced seeds. Based on the method, the optimal in-del seeds with weights from 9 to 15 are calculated.

The main content of this work is the application of optimal theories and methods to bio-sequence alignment. Based on known algorithms, new alignment methods and models are proposed to provide new ideas for realizing high-speed and effective sequence alignment. Tests on biological data show that the results of the proposed algorithms are equal or very close to the optimal results, while calculation time and efficiency are improved. The work may offer some support for bioinformatics research.
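
The notion of spaced-seed sensitivity referred to in the abstract can be illustrated with a small brute-force sketch. It is not the dissertation's optimal search, overlap digraph, or overlap complexity method. It assumes the commonly used model in which a seed is a string of "1" (position must match) and "*" (don't care), a homologous region is a string of independent match/mismatch positions with match probability p, and sensitivity is the probability that the seed hits the region at least once. The function names `seed_weight`, `seed_hits`, and `sensitivity` are hypothetical, and the enumeration is exponential in the region length, so it is only usable for short regions.

```python
from itertools import product


def seed_weight(seed: str) -> int:
    """Weight of a seed = number of required-match ('1') positions."""
    return seed.count("1")


def seed_hits(seed: str, region: str) -> bool:
    """Return True if the seed hits the region at some offset.

    `region` is a string over {'0', '1'} where '1' marks a matching
    alignment column and '0' a mismatch (assumed encoding).
    """
    span = len(seed)
    required = [j for j, c in enumerate(seed) if c == "1"]
    for start in range(len(region) - span + 1):
        if all(region[start + j] == "1" for j in required):
            return True
    return False


def sensitivity(seed: str, length: int, p: float) -> float:
    """Exact hit probability of `seed` on a random region.

    Each of the `length` positions is a match with probability `p`,
    independently. All 2**length regions are enumerated, so this is
    an illustration for small lengths only.
    """
    total = 0.0
    for bits in product("01", repeat=length):
        region = "".join(bits)
        if seed_hits(seed, region):
            matches = region.count("1")
            total += p ** matches * (1 - p) ** (length - matches)
    return total
```

For example, `sensitivity("111", 12, 0.7)` and `sensitivity("11*1", 12, 0.7)` compare a contiguous seed and a spaced seed of the same weight. The exponential cost of this enumeration is the reason faster search strategies, such as those summarized in the abstract, are of interest.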
Keywords/Search Tags:bio-sequence alignment, adaptive Lagrange Constraint Neural Network, spaced seed, in-del seed