Sequence classification learning using methods derived from entropy estimation

Posted on:2000-06-29

Degree:Ph.D

Type:Dissertation

University:Rutgers The State University of New Jersey - New Brunswick

Candidate:Loewenstern, David Matthew

Full Text:PDF

GTID:1468390014965818

Subject:Computer Science

Abstract/Summary:

Many problems require characterizing, clustering, or classifying sequences of characters drawn from a fixed alphabet, including the classification and information-entropy estimation of biological sequences such as DNA. This work describes a method for learning to classify sequences, primarily biological ones, that exploits the string-like nature of these problems. A model is constructed for each class of sequence, and the model's ability to predict each character of a test sequence serves as a measure of the similarity between that sequence and the class of sequences used to build the model. The model predicts each character by combining predictions made by many "experts," each of which bases its prediction on the characters in a training set that follow a context of preceding characters similar to the current one. Different experts use different similarity criteria and different context sizes. This method yields lower, more accurate entropy estimates for DNA sequences. These estimates are then shown to lead to successful classification of DNA sequences into their three-dimensional structural groups.
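
The general idea in the abstract (one predictive model per class, a mixture of context-based experts, and classification by how well each model predicts a test sequence) can be illustrated with a small sketch. This is not the dissertation's implementation: the names `ContextExpert`, `ClassModel`, and `classify` are hypothetical, the experts here match contexts exactly rather than under varied similarity criteria, and the uniform averaging of expert predictions and add-one smoothing are simplifying assumptions, since the abstract does not specify how predictions are combined.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACGT"  # assumed DNA alphabet


class ContextExpert:
    """Hypothetical expert: predicts the next character from counts of what
    followed the same exact k-character context in the training data."""

    def __init__(self, k):
        self.k = k
        self.counts = defaultdict(Counter)

    def train(self, seq):
        for i in range(self.k, len(seq)):
            self.counts[seq[i - self.k:i]][seq[i]] += 1

    def prob(self, seq, i, ch):
        # Add-one smoothing (an assumption) keeps every probability nonzero;
        # unseen or too-short contexts fall back to the uniform distribution.
        ctx = seq[max(0, i - self.k):i]
        seen = self.counts.get(ctx)
        total = sum(seen.values()) if seen else 0
        hits = seen[ch] if seen else 0
        return (hits + 1) / (total + len(ALPHABET))


class ClassModel:
    """One model per class, mixing experts with different context sizes."""

    def __init__(self, context_sizes=(0, 1, 2, 3)):
        self.experts = [ContextExpert(k) for k in context_sizes]

    def train(self, sequences):
        for seq in sequences:
            for expert in self.experts:
                expert.train(seq)

    def bits_per_char(self, seq):
        # Average code length in bits per character: an entropy estimate
        # for seq under this class's model (lower means better predicted).
        total = 0.0
        for i, ch in enumerate(seq):
            # Uniform average of expert predictions (an assumption).
            p = sum(e.prob(seq, i, ch) for e in self.experts) / len(self.experts)
            total -= math.log2(p)
        return total / len(seq)


def classify(seq, models):
    """Assign seq to the class whose model predicts it with the fewest bits."""
    return min(models, key=lambda label: models[label].bits_per_char(seq))


if __name__ == "__main__":
    # Toy training data, invented purely for illustration.
    models = {"ac": ClassModel(), "gt": ClassModel()}
    models["ac"].train(["AC" * 20])
    models["gt"].train(["GGTT" * 10])
    print(classify("ACACAC", models))
```

The per-character cost returned by `bits_per_char` plays both roles described in the abstract: on its own it is an entropy estimate for the sequence, and compared across class models it acts as a similarity measure for classification.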

Keywords/Search Tags:

Classification, Sequence, Entropy

Related items

1	Some Results On Injective Mappings Of Primitive Sequences Modulo Prime Powers
2	Research On Attention-based Model For Sequence Classification
3	Vulnerability Classification Based On Text Classification Technology
4	Design And Implementation Of Hotel Reviews Classification System Based On The Maximum Entropy
5	On Two Pseudorandom Sequences
6	The Classification Of DNA Sequence Based On Support Vector Machine
7	The Classification Of DNA Sequence Based On Support Vector Machine
8	Research On Generation And Performance Of Spread Random Sequence Based On Physical Entropy Source
9	The Research On The Classification Model Based On Rough Set And Entropy
10	An Improved Evidence Classification Synthesis Method Combined Information Entropy