Sequential Monte Carlo and Dirichlet mixtures for extracting protein alignment models

Posted on:2005-12-06

Degree:Ph.D

Type:Dissertation

University:Stanford University

Candidate:Logvinenko, Tanya

Full Text:PDF

GTID:1450390008992484

Subject:Statistics

Abstract/Summary:

In this dissertation we present several methods for aligning a pair of protein sequences and for finding similarities among multiple sequences. Commonly used non-Bayesian methods for aligning biological sequences produce alignments that maximize some scoring function. However, the choice of model parameters can strongly influence the resulting alignment. In addition, in the absence of a statistical model, the significance of the produced alignment cannot be assessed. To address these issues we introduce a formulation of the sequence comparison problem in Bayesian terms. Two Bayesian methods for aligning a pair of protein sequences are described and implemented, and a rule for assessing the significance of the resulting alignments is prescribed. For aligning multiple protein sequences, a novel Bayesian method is proposed. Using a Bayesian formulation of the problem and a sequential Monte Carlo framework, the method progressively incorporates all sequences into the alignment. The resulting alignment is then improved with Bayesian techniques such as the Gibbs sampler and simulated annealing. A comparison study of methods for biological sequence alignment, using sets of protein sequences for which the true biological alignments are known, is presented. The novel Bayesian methods for pairwise and multiple sequence alignment perform at least as well as, and often better than, the other sequence alignment methods.
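The abstract's remark that score-maximizing alignments depend on the chosen parameters can be illustrated with a minimal sketch. The code below is the classical Needleman-Wunsch global alignment, shown only as a generic example of a non-Bayesian, score-maximizing method; it is not the dissertation's method. The function name `global_align`, the toy sequences, and the match, mismatch, and gap values are all assumptions made for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Score-maximizing global alignment (Needleman-Wunsch).

    Illustrative only: integer scores and a linear gap penalty are assumed.
    Returns (best_score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the table: each cell is the best of substitution, or a gap in b or a.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] against a gap
                score[i][j - 1] + gap,      # b[j-1] against a gap
            )

    # Trace back one optimal path from the bottom-right corner.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Same pair of toy sequences, two different gap penalties.
    for gap in (-2, 0):
        print(gap, global_align("AB", "BA", gap=gap))
```

With these toy inputs, changing only the gap penalty changes both the optimal score and the alignment itself: a costly gap favors an ungapped alignment with two mismatches, while a free gap favors a gapped alignment with one match. Nothing in the scoring scheme says which answer to trust, which is the kind of question a statistical model is meant to address.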

Keywords/Search Tags:

Alignment, Protein, Methods, Aligning

Related items

1	The Alignment-free Methods Of Protein Sequences And Their Applications
2	Study On Methods For Predicting And Aligning Metabolic Pathways
3	Protein Structure Alignment Methods Based On AFPs
4	Research On Clustering And Aligning Methods For Gene Expression Time Series Data Analysis
5	Global Alignment Of Protein Structure In Cryo-EM
6	Algorithms Of Aligning The Third-Generation Sequencing Sequences And Picking The Operational Taxonomic Units
7	Research On A Profile Based Alignment Method For Protein Sequences
8	An exploration of protein structure: Prediction, alignment, and theoretical interactions
9	Research On Alignment-free Methods For Bioinformation Sequence Analysis
10	Pairwise Network Alignment Research Based On Protein-protein Interaction Network