Recognition Of The Functional Sites Based On The DNA Sequence

Posted on: 2011-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Wu
GTID: 2120360305976354
Subject: Signal and Information Processing
Abstract/Summary:
Functional sites in DNA sequences are widely studied because of their role in gene regulation and transcription. Recognizing these sites accurately from the DNA sequence alone has been a topic of long-standing interest in bioinformatics.

In this thesis, we first propose a detection algorithm for prokaryotic promoters that uses an improved position weight matrix (PWM) method based on an entropy measure. In this method, the conserved sites of the prokaryotic promoters are extracted according to an entropy measure, and two improved position weight matrices are then constructed from the training set. The test sequences are scored using the matrix elements in the columns that correspond to the extracted conserved sites, and are then classified. Experimental results on several datasets show that the proposed algorithm outperforms existing ones in sensitivity, specificity, correlation coefficient and precision.

Second, we develop a novel pattern-recognition-based approach to identifying nucleosome positions. The technique combines two methods, one for nucleosome pattern matching and one for ambiguity elimination. First, the matched mirror position filter is used to match patterns in the DNA sequence. Then probabilistic relaxation labeling, which is widely used in image processing, uses contextual information to eliminate noise in the DNA sequence. We applied this combined framework to the Saccharomyces cerevisiae (yeast) genome. The resulting nucleosome occupancy maps of yeast show that the proposed algorithm significantly improves accuracy. The results also suggest that the nucleosome occupancy maps of different species may share a common mechanism.
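The entropy-guided PWM scoring described for promoter detection can be illustrated with a minimal sketch. It is not the thesis's implementation: it builds a single standard log-odds PWM rather than the two improved matrices, and the function names, the pseudocount, the uniform background frequency, the choice of `k` lowest-entropy columns and the toy training sequences are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def column_entropy(seqs, j):
    """Shannon entropy (bits) of the bases observed at column j."""
    counts = Counter(s[j] for s in seqs)
    n = len(seqs)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conserved_columns(seqs, k):
    """Indices of the k lowest-entropy (most conserved) columns."""
    length = len(seqs[0])
    ranked = sorted(range(length), key=lambda j: column_entropy(seqs, j))
    return sorted(ranked[:k])


def build_pwm(seqs, pseudocount=1.0, background=0.25):
    """Log-odds PWM: one dict of base -> score per column (assumed form)."""
    n = len(seqs)
    pwm = []
    for j in range(len(seqs[0])):
        counts = Counter(s[j] for s in seqs)
        col = {}
        for b in ALPHABET:
            p = (counts[b] + pseudocount) / (n + len(ALPHABET) * pseudocount)
            col[b] = math.log2(p / background)
        pwm.append(col)
    return pwm


def score(seq, pwm, columns):
    """Sum PWM entries over the selected conserved columns only."""
    return sum(pwm[j][seq[j]] for j in columns)


# Toy aligned training set (hypothetical, not from the thesis).
train = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
cols = conserved_columns(train, k=3)
pwm = build_pwm(train)
```

A test sequence would then be classified by comparing `score(seq, pwm, cols)` against a threshold, or against the score from a second matrix trained on non-promoter sequences. Which of these the thesis uses is not stated in the abstract.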
Keywords/Search Tags: DNA sequence analysis, promoter, PWM, entropy, conservative sites, matched filter, probabilistic relaxation labeling, nucleosome