
Bioinformatics approaches in Drosophila P element gene disruption project and cDNA project

Posted on: 2002-06-15
Degree: Ph.D.
Type: Thesis
University: University of California, Berkeley
Candidate: Liao, Guochun
Full Text: PDF
GTID: 2460390011991530
Subject: Biology
Abstract/Summary:
This thesis describes my work on two high-throughput sequencing projects at the Berkeley Drosophila Genome Project (BDGP), together with a study of P element insertion site preference.

The first part describes the P transposable element gene disruption project. The purpose of this project is to use individual, genetically engineered P transposable elements to target open reading frames throughout the Drosophila genome. For each newly generated P insertion line, BDGP sequenced across the junction between the P element and the genomic DNA at the site of insertion. This allowed the insertion site to be mapped precisely in the genome, and insertions to be selected for retention based on the genomic features around the P element. My work, described in Part I, focused on the design and implementation of the bioinformatics system for the project.

The second part describes the bioinformatics approaches used in the BDGP cDNA project. The long-term goal of the BDGP cDNA project is to generate a transcript map that provides information on intron-exon structure, alternative splicing, and transcription start and stop sites, by sequencing cDNAs and comparing them to the genomic sequence. My work, described in Part II, focused on methods for grouping Drosophila ESTs and identifying splice variants of Drosophila genes from the genomically aligned ESTs. Using the EST clustering results, we selected an additional 5,042 clones for the Drosophila Gene Collection and identified 3,079 genes that show alternative splicing.

The third part describes my study of P insertion site preference. We found that the physical properties of the genomic DNA at P insertion sites differ significantly from those of average chromosomal DNA. We also identified a 14-bp palindromic hydrogen-bonding pattern centered on the 8-bp target site duplication that is generated by P element insertion. We further developed two new encoding methods for neural network approaches to recognizing P insertion sites. These encoding methods can also be applied to the DNA patterns of other protein binding sites.
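
As a rough illustration of the idea in the second part, grouping ESTs by where they align on the genome, the following Python sketch merges alignments that overlap on the same chromosome arm and strand. It is a minimal sketch under stated assumptions, not the method used in the thesis: the `Alignment` record, the `cluster_ests` function, the closed-interval coordinates, and the "any overlap" rule are all hypothetical choices made for this example.

```python
from collections import namedtuple

# Hypothetical record for one EST aligned to the genome (closed interval).
Alignment = namedtuple("Alignment", "est_id chrom strand start end")


def cluster_ests(alignments):
    """Group ESTs whose genomic alignments overlap on the same chrom and strand.

    Assumption for this sketch: two ESTs belong together if their aligned
    spans share at least one base. A real pipeline would likely also use
    splice junctions and clone-pair information.
    """
    clusters = []
    current = None
    # Sorting by chromosome, strand, then start lets one pass merge overlaps.
    ordered = sorted(alignments, key=lambda a: (a.chrom, a.strand, a.start, a.end))
    for aln in ordered:
        same_locus = (
            current is not None
            and current["chrom"] == aln.chrom
            and current["strand"] == aln.strand
            and aln.start <= current["end"]
        )
        if same_locus:
            # Extend the open cluster to cover the new alignment.
            current["ests"].append(aln.est_id)
            current["end"] = max(current["end"], aln.end)
        else:
            # Start a new cluster at this alignment.
            current = {
                "chrom": aln.chrom,
                "strand": aln.strand,
                "start": aln.start,
                "end": aln.end,
                "ests": [aln.est_id],
            }
            clusters.append(current)
    return clusters


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    example = [
        Alignment("est1", "2L", "+", 100, 500),
        Alignment("est2", "2L", "+", 400, 900),
        Alignment("est3", "2L", "+", 2000, 2500),
        Alignment("est4", "2L", "-", 450, 800),
    ]
    for c in cluster_ests(example):
        print(c["chrom"], c["strand"], c["start"], c["end"], c["ests"])
```

In this sketch, ESTs on opposite strands are never merged even if their coordinates overlap, since they would normally represent different transcripts.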
Keywords/Search Tags:DNA, Project, Drosophila, Insertion, BDGP, Element, Work, Bioinformatics