
Several Mathematical Models For Comparison Of Biological Sequences/Structures And Their Applications

Posted on: 2010-02-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Dai
GTID: 1100360308969786
Subject: Applied Mathematics
Abstract/Summary:
Recent advances in genomic technologies and in analytical techniques for determining physical structure have led to explosive growth in biological sequence and structure databases. This deluge of data in turn raises new questions, such as how to analyze, process, and store it, which pose serious challenges to computer science, mathematics, and related fields. Attracted by these challenges, many researchers from other sciences have become interested in the life sciences, and computational molecular biology has emerged as a new and developing interdisciplinary field. Its scope is very wide: it includes sequence comparison, gene recognition, molecular evolution, comparative genomics, RNA and protein secondary structure comparison, and so on. Most of these topics rest on sequence and structure comparison, so sequence and structure comparison is not only one of the most basic and important subjects but also has a lasting effect on the study of life science. This dissertation studies several mathematical models in this area. The main results can be summarized as follows:

1. In chapter 2, we proposed a W-geometrical representation model of DNA sequences and developed an efficient algorithm to search for the segments of a sequence that contain only particular bases. Taking dual bases into account, we proposed a PNN-geometrical representation model based on the concepts of cell and system, and extracted an efficient numerical descriptor. We pointed out the degradation phenomenon that distance matrices exhibit when characterizing curves and, to overcome this limitation, provided a new matrix, the curvature matrix.

2. In chapter 3, we constructed a Markov model of biological sequences and structures, compared how well k-step transition probabilities for different k serve as numerical characterizations of biological sequences, and applied the model to structural comparison, structural similarity search, and evolutionary analysis. With the help of the weighted relative entropy, we constructed a mixed model that combines the word statistical model and the Markov model. An experimental assessment, performed with a widely employed evaluation method, demonstrates that the mixed model improves the ability to extract information.

3. In chapter 4, based on the randomness of the distribution of elements in biological sequences and structures, we defined random distribution functions. Using a linear regression model, we could find the relationships between these functions and the holistic changes of the elements. The similarity between different structures can then be obtained by comparing their corresponding linear regression models, which reduces the complexity of structural comparison.

4. In chapter 5, we presented the definition of "protein space" and the k-word statistical model of "protein space", both based on the score matrices of amino acids. An edit distance between k-words was designed to compare different protein sequences. Through systematic comparative analysis, we proposed reasonable suggestions on how to construct an efficient "protein space" and how to choose reasonable measures.
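As a rough illustration of the kind of Markov-based numerical characterization summarized for chapter 3, the sketch below estimates a first-order transition matrix from a DNA sequence and raises it to the k-th power to obtain k-step transition probabilities. This is not the dissertation's actual construction: the function names (`transition_matrix`, `k_step`, `descriptor_distance`), the uniform fallback row for symbols with no observed successor, the assumption of a homogeneous first-order chain (so that the k-step matrix equals the k-th power of the one-step matrix), and the Euclidean distance between matrices are all assumptions made only for this example.

```python
ALPHABET = "ACGT"


def transition_matrix(seq, alphabet=ALPHABET):
    """Estimate first-order transition probabilities from adjacent symbol pairs."""
    index = {c: i for i, c in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(seq, seq[1:]):
        # Pairs containing symbols outside the alphabet are skipped.
        if a in index and b in index:
            counts[index[a]][index[b]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a symbol with no observed successor gets a uniform row.
        matrix.append([c / total for c in row] if total else [1.0 / n] * n)
    return matrix


def mat_mul(p, q):
    """Multiply two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][m] * q[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def k_step(matrix, k):
    """Return the k-step transition matrix P^k for k >= 1."""
    result = matrix
    for _ in range(k - 1):
        result = mat_mul(result, matrix)
    return result


def descriptor_distance(seq_a, seq_b, k=2):
    """Euclidean distance between the k-step matrices of two sequences."""
    pa = k_step(transition_matrix(seq_a), k)
    pb = k_step(transition_matrix(seq_b), k)
    return sum((x - y) ** 2
               for row_a, row_b in zip(pa, pb)
               for x, y in zip(row_a, row_b)) ** 0.5
```

Calling `descriptor_distance(seq_a, seq_b, k)` for several values of k gives one simple way to see how the choice of k changes the separation between two sequences, which is the kind of comparison the abstract describes.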
Keywords/Search Tags: Graphical representation model, Markov model, Linear regression model, Word statistical model, Biological sequence comparison, Biological structure comparison