Computing RNA coding spaces and efficient combinatorial library construction

Posted on:2002-11-05

Degree:Ph.D

Type:Thesis

University:State University of New York at Stony Brook

Candidate:Cohen, Barry

Full Text:PDF

GTID:2460390011993968

Subject:Computer Science

Abstract/Summary:

PDF Full Text Request

The main classes of biomolecules---nucleic acids and proteins---are linear chains, or sequences, whose elements are drawn from small sets of chemical building blocks. They are accurately modelled as strings over a finite alphabet. The imposition of constraints on the composition of these sequences, as frequently occurs both in natural evolution and laboratory manipulation, gives rise to combinatorial and optimization problems which can fruitfully be addressed by computational means.; This dissertation examines three such problems: (1) Chapter 1 examines a problem in combinatorial chemistry, which is a technique for the systematic synthesis and screening of large libraries of small organic molecules. In drug development, such libraries are designed to various standards, such as effectively sampling a given space of compounds. It is then desirable to be able to efficiently synthesize the library. We propose a new algorithmicly-intensive optimized split-synthesis technology for fabricating combinatorial chemistry libraries and evaluate its potential through extensive simulation. (2) Chapter 2 develops evidence that there is natural selection on a physical property, the stability, of messenger RNA (mRNA) sequences in microbes. We fix the target protein, which constrains the coding sequence, and the probability of use of each codon. Stability is defined as the strength of pairwise bonds (secondary structure) between sequence elements. The technique is a statistical comparison between computer simulations of natural and synthetic mRNAs. It shows that among a large sample of such sequences (more than 27,000 sequences from 36 genomic structures in 34 species), a disproportionate number of natural sequences have very high or very low stability. (3) Chapter 3 examines the algorithmic problem of designing very stable or very unstable mRNA sequences which code for a target protein. We derive a dynamic programming solution to the most-stable-sequence problem (MSSP) which is polynomial, and asymptotically no more complex than secondary structure prediction. We show that the least-stable-sequence problem (LSSP) in NP-complete, and develop and two heuristics for the construction of such sequences. In both cases, the solutions are implemented and tested on real-world data.

Keywords/Search Tags:

Sequences, Combinatorial

PDF Full Text Request

Related items

1	Computing RNA coding spaces and efficient combinatorial library construction
2	Congruences Involving Combinatorial Sequences And Truncated Hypergeometric Series
3	On The Combinatorial Identities And The Mean Values Of Some Interesting Sequences
4	Riordan Arrays Methodã€Combinatorial Sums And Its Asymptotic
5	Degree sequences, forcibly chordal graphs, and combinatorial proof systems
6	The Generating Function And The Combinatorial Sequences
7	Identities And Asymptotic About Some Combinatorial Sequences
8	Studies On Complementary Sheffer Sequences And Symmetric Sheffer Sequences
9	Combinatorial Sequences And The Related Recurrence Relations
10	Combinatorial Identities Involving The Binomial Coefficients And Several Integer Sequences