Computational analysis of human genomic sequence variation and Drosophila small RNA transcriptome

Posted on: 2010-03-19
Degree: Ph.D.
Type: Dissertation
University: Boston University
Candidate: Lee, Soohyun
Full Text: PDF
GTID: 1443390002976943
Subject: Biology
Abstract/Summary:
Bioinformatics is a computational approach to solving biological problems. I applied computational methods to several different problems.
(1) The genome sequences of human and other mammalian species allow us to study human evolution at the genomic sequence level. I studied the evolutionary conservation of promoters using human, mouse and dog genomic sequences, and found a significant connection between promoter evolution and gene function. Developmental genes and transcription factors tend to have higher conservation upstream of the gene body, whereas housekeeping genes have lower promoter conservation. This result suggests that genes requiring complex regulation have a higher degree of conservation due to an increased number of cis-elements in the promoter.
(2) Variations in genomic sequence among the human population provide useful information about individual differences. For instance, this information can be used to find disease-associated variants. When different types of variation are mixed, the observed experimental outcomes may deviate from those expected. I investigated the deviation from Hardy-Weinberg equilibrium (HWE) of single nucleotide polymorphisms (SNPs) that lie in a copy number variation (CNV), using Bayesian statistics. I address the question 'What is the probability of a SNP being in a CNV, given that it violates HWE?' My results suggest that, depending on the allele frequency, an underlying CNV can be a major factor causing deviation from HWE when the sample size is large and genotyping error is below 1%.
(3) The recently emerged next-generation sequencing technology gives us the opportunity to study entire transcriptomes under various conditions. I analyzed millions of sequence reads of small RNAs from fruit fly ovaries to elucidate the biogenesis mechanisms of Piwi-interacting RNAs (piRNAs) in this organism. piRNAs are 23-29 nt RNAs that suppress retrotransposon activity in germ cells. Three different proteins, Piwi, Aubergine (Aub) and Argonaute3 (Ago3), are suggested to generate piRNAs in fruit flies, but the mechanism is poorly understood. By analyzing total small RNAs and RNAs immunoprecipitated with the three proteins in wild-type and ago3 mutants, I obtained new insights into how these proteins may participate in the biogenesis of piRNAs.
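The question posed in part (2) can be illustrated with a minimal Python sketch. It is not the dissertation's actual model, which the abstract does not describe. The sketch assumes a standard chi-square statistic for deviation from Hardy-Weinberg proportions and a plain application of Bayes' rule. The function names and every numeric input (prior, conditional probabilities, genotype counts) are hypothetical values chosen only for illustration.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    Takes observed counts of the three genotypes (AA, AB, BB) at one SNP.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotype classes with zero expected count to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def posterior_cnv_given_violation(prior_cnv, p_viol_given_cnv, p_viol_given_no_cnv):
    """Bayes' rule for P(SNP lies in a CNV | SNP violates HWE).

    All three inputs are assumptions supplied by the caller:
      prior_cnv            prior probability that a SNP lies in a CNV
      p_viol_given_cnv     probability of an HWE violation if it does
      p_viol_given_no_cnv  probability of a violation otherwise
                           (for example from genotyping error or chance)
    """
    evidence = (prior_cnv * p_viol_given_cnv
                + (1.0 - prior_cnv) * p_viol_given_no_cnv)
    return prior_cnv * p_viol_given_cnv / evidence


if __name__ == "__main__":
    # Hypothetical genotype counts with an excess of apparent homozygotes,
    # the pattern a hemizygous deletion could produce.
    print(hwe_chi_square(40, 20, 40))
    # Hypothetical prior and conditional probabilities.
    print(posterior_cnv_given_violation(0.1, 0.5, 0.05))
```

With these made-up inputs the posterior is much larger than the prior, which mirrors the qualitative claim that an underlying CNV can be a major cause of HWE deviation when the probability of a violation from other sources, such as genotyping error, is small.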
Keywords/Search Tags: Sequence, Computational, Human, RNAs, Small, Genes