Dempster-Shafer Theory of Evidence-Based Combination of Gene Prediction

Posted on: 2007-09-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Weng
GTID: 1110360185494759
Subject: Probability theory and mathematical statistics
Abstract/Summary:
The advent of the high-throughput technologies characterizing the modern genomic disciplines has created an enormous amount of biologically relevant information, effectively turning biology into an information-rich science. At the time of writing, whole genome sequences for more than 1000 organisms are either complete or under way. Driven by this explosion of genome data, interpretation, i.e. annotation, has become a grand challenge. Manual annotation, as the most reliable method of producing a definitive view of the human genome, plays a critical role in this work. However, because manual methods are time-consuming and costly, an initial interpretation of the genomic sequence that relies mostly on predictions from computational methods is needed.

Although computational gene-finding programs have greatly improved in recent years, the ultimate goal is far from being met. Even the best of the programs cannot be used automatically to identify genes and other genomic elements. Fortunately, only a tiny number of exons are completely missed by all programs. Therefore, combining the predictions of the gene-finding programs is a convenient way to improve gene prediction. In this paper we motivate the use of the Dempster-Shafer Theory of Evidence as an appropriate theory for modelling the combination of gene predictions, and give the mathematical framework for combining the predictions of gene-finding programs by using the Dempster-Shafer combination rule. The most notable strength of our method is that it can combine the results of multiple gene-finding programs, provided that each of them supplies reliable exon scores. Compared with an individual program, our method provides a notable improvement in predictive accuracy at both the nucleotide level and the exon level, particularly the exon level.
Besides, in this thesis we propose to use dynamic programming to determine the open reading frame (ORF) after combining evidence from multiple sources by the Dempster-Shafer Theory of Evidence.

In the last part of this thesis, we show two new results in Linear Unbiased Minimum Variance (LUMV) estimation. In the first development of LUMV estimation, we find, for an invertible linear transformation of the observations, a necessary and sufficient condition (5.10) for the two LUMV estimates, with and without the transformation, to be identical to each other, which guarantees that the LUMV estimate of an unknown parameter remains invariant under some in-...
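
The Dempster-Shafer combination rule named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the two-element frame of discernment (`exon`, `non_exon`), the function name `combine`, and the mass values assigned to the two hypothetical programs are all assumptions made for illustration, and the abstract does not say how exon scores are converted into masses.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function is a dict mapping a frozenset (focal element)
    to its mass; the masses in each dict are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence accumulates on the intersection.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical frame of discernment for one candidate region.
EXON = frozenset({"exon"})
NON_EXON = frozenset({"non_exon"})
UNKNOWN = EXON | NON_EXON  # mass left uncommitted by a program

# Hypothetical masses from two gene-finding programs (illustrative values).
program_a = {EXON: 0.7, NON_EXON: 0.1, UNKNOWN: 0.2}
program_b = {EXON: 0.6, NON_EXON: 0.2, UNKNOWN: 0.2}

fused = combine(program_a, program_b)
for focal, mass in fused.items():
    print(sorted(focal), round(mass, 4))
```

With these illustrative inputs the conflict mass is 0.2, and the fused belief in `exon` is higher than either program assigns on its own. Because the rule is associative and commutative, more than two programs can be combined by applying `combine` repeatedly.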
Keywords/Search Tags:Genome, Gene prediction, Dempster-Shafer theory of evidence, Combination, Linear minimum variance estimation