The Research On The Discovery Of Transcription Factor Binding Sites Based On Genetic Algorithm

Posted on:2012-11-21

Degree:Master

Type:Thesis

Country:China

Candidate:L F Tian

GTID:2120330338991235

Subject:Computer application technology

Abstract/Summary:

Regulation of gene expression is key to understanding the genetic mechanisms of biology. Transcription is a crucial step in gene expression, and identifying and annotating transcription factor binding sites plays a key role in studying transcriptional regulation and constructing expression regulation networks. As biological research has deepened and computer technology has developed, computational discovery algorithms have become a powerful complement to traditional experimental annotation methods. Accurate identification algorithms help identify the target genes associated with different transcription factor binding sites, providing accurate data for biological experiments and helping to guide them. Existing algorithms fall broadly into two categories: those based on consensus sequences and those based on position weight matrices. However, these algorithms tend to become trapped in local optima and rarely reach the global optimum.

This paper proposes two algorithms for discovering transcription factor binding sites. The first improves on the traditional genetic algorithm, aiming to reach the global optimal solution. The second combines the genetic algorithm with the Gibbs sampling algorithm and uses the position weight matrix model; it is suitable for various kinds of biological data.

(1) The first method is designed for sequences that contain several transcription factor binding sites. We define a new fitness function that includes the variable 'appear number', so that sequences containing multiple transcription factor binding sites receive higher scores.

(2) The second method combines the genetic algorithm with the Gibbs sampling algorithm and uses the position weight matrix model, which has several advantages: a simple calculation process, few parameters, and resistance to background noise. We first generate a position weight matrix at random from the initial sequences, then obtain a converged position weight matrix through the genetic algorithm. This converged position weight matrix is then used to discover the transcription factor binding sites.

Finally, the proposed methods are verified and analyzed experimentally. Comparing the experimental results with existing methods and with the annotations in TRANSFAC and DBTSS demonstrates the correctness and effectiveness of the proposed methods.
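
As a rough illustration of the position weight matrix model mentioned in the abstract, the sketch below builds a log-odds matrix from a few aligned sites and scans a sequence for the best-scoring window. It is not the thesis's algorithm: the function names, the pseudocount, the uniform 0.25 background, and the toy sequences are all assumptions chosen for the example, and the genetic algorithm and Gibbs sampling steps are omitted.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    The pseudocount and the uniform background are illustrative assumptions.
    """
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        # Start each base at the pseudocount so unseen bases keep a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2((counts[base] / total) / background)
                    for base in BASES})
    return pwm

def score_window(pwm, window):
    """Sum the per-position log-odds scores of one window."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_site(pwm, sequence):
    """Return (start, site, score) for the highest-scoring window."""
    width = len(pwm)
    scored = [(score_window(pwm, sequence[i:i + width]), i)
              for i in range(len(sequence) - width + 1)]
    score, start = max(scored)
    return start, sequence[start:start + width], score

# Toy data, invented for this example only.
example_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
example_sequence = "GGCGCTATAATGCGC"
example_pwm = build_pwm(example_sites)
print(best_site(example_pwm, example_sequence))
```

In a search such as the one the abstract describes, a matrix like this would be the object being refined: candidate matrices are scored against the input sequences and updated until they converge, after which the converged matrix is used to scan for sites.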

Keywords/Search Tags:

Transcription factor binding sites, Gibbs sampling algorithm, Position weight matrix, Genetic algorithm, TRANSFAC, DBTSS

Related items

1	Prediction Of The Correlation Of Triplet Transcription Factor Binding Sites Based On PWMSA
2	The Dynamic Method Of Transcription Factor Binding Sites Recognition Based On Genetic Algorithm And Position Specific Scoring Matrix
3	Based On The Information Of Sequences To Predict The Transcription Factor Binding Sites And Promoter
4	The Research For Recognition Of Transcription Factor Binding Sites Based On Genetic-Neural Network
5	Algorithm Research On The Problem Of Transcription Factor Binding Sites Identification
6	Transcription Factor Binding Sites Prediction Algorithm Study And Application
7	Genome-wide Analysis Of Transcription Factor Binding Sites And Gene Mutation Of Genetic Disease
8	An Algorithm To Detect TFBSs Based On ChIP-seq Data
9	Sequence-Based Prediction Of Proteingdp/GDP Binding Sites
10	Gibbs Sampling Algorithm Under Optimal Subsample