
Research Of Cis-regulatory Module Discovery Method Based On HMM Model

Posted on: 2013-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: S R Zheng
Full Text: PDF
GTID: 2230330395455627
Subject: Computer software and theory
Abstract/Summary:
A transcription factor binding site (TFBS) is a short, specific DNA sequence in a regulatory region. Also called a motif, it is an important regulatory element. In eukaryotes, several transcription factor binding sites together constitute a cis-regulatory module (CRM), which provides more complex regulation of the expression of neighboring genes. The study of cis-regulatory modules is gradually replacing that of single TFBSs, and computational discovery of cis-regulatory modules has become an active topic in bioinformatics.

In this thesis, we focus on the application of the hidden Markov model (HMM) to the discovery of cis-regulatory modules. The main contributions are as follows.

Having studied the HMMs used by existing CRM recognition methods, we find that they do not describe the correlation between motifs within a module well. The thesis proposes a new HMM that describes this correlation. Based on this model, the thesis proposes a rule for setting part of the parameters and a parameter learning process based on the Viterbi algorithm. We then present how both this new learning algorithm and the Baum-Welch algorithm learn part of the model parameters, and how the Viterbi algorithm is used to recognize the motifs. Finally, experiments on synthetic sequences demonstrate that the model performs well at recognizing and analyzing correlated motifs in regulatory regions.

We propose a CRM recognition method, TSHAS, a two-stage method that uses an HMM and sequence statistics. In the HMM stage, the HMM describing the correlation of adjacent motifs is built on all the motif instances selected in the sequences by hypothesis testing, and it is designed to eliminate overlap between motif instances. We then decode the sequences using a modified Viterbi algorithm and obtain the optimized parameters using the Viterbi-based learning process. In the sequence statistics stage, we simulate the distribution of the unit-window scores output by the HMM. Then, by computing the p-values of the unit-window scores, we report the significantly scored windows of different lengths as the results. In the experimental section, we compare TSHAS with several existing methods and verify its advantage among CRM discovery methods.
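As a rough illustration of the Viterbi decoding step mentioned above, the following sketch labels each base of a DNA sequence as background or motif under a toy two-state HMM. It is not the thesis's model or its modified Viterbi algorithm: the state names, the probabilities in `START`, `TRANS` and `EMIT`, and the function name `viterbi` are all assumptions chosen only to make the example self-contained.

```python
import math

# Hypothetical toy model: a background state and a GC-rich "motif" state.
STATES = ("bg", "motif")
START = {"bg": 0.9, "motif": 0.1}
TRANS = {
    "bg": {"bg": 0.9, "motif": 0.1},
    "motif": {"bg": 0.2, "motif": 0.8},
}
EMIT = {
    "bg": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "motif": {"A": 0.05, "C": 0.45, "G": 0.45, "T": 0.05},
}


def viterbi(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Return (most probable state path, its log-probability) for obs."""
    # best[t][s]: log-probability of the best path that ends in state s at t.
    best = [{s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + math.log(emit[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


if __name__ == "__main__":
    labels, log_prob = viterbi("ATATGCGCGCGCATAT")
    print("".join("M" if s == "motif" else "-" for s in labels), log_prob)
```

Working in log space avoids numerical underflow on long sequences. A Viterbi-based learning process of the kind the abstract describes would presumably re-estimate parameters from decoded paths like this one, but the thesis's actual update rule is not given here.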
Keywords/Search Tags: motif, cis-regulatory module, hidden Markov model, Viterbi algorithm, p-value