
Trichomonas vaginalis genome annotation validation and protein function prediction via proteomics and phylogenetics

Posted on: 2010-10-24
Degree: Ph.D.
Type: Dissertation
University: University of California, Los Angeles
Candidate: Hayes, Richard Donald
Full Text: PDF
GTID: 1444390002479627
Subject: Biology
Abstract/Summary:
Trichomonas vaginalis, a parasitic protist, causes one of the most prevalent non-viral sexually transmitted infections in the world, with over 200 million new cases every year. T. vaginalis colonizes the human urogenital tract, where it remains extracellular and causes lesions in vaginal epithelia, leading to outcomes in infected women that include inflammatory disease, infertility, pregnancy complications, and predisposition to HIV infection and cervical cancer. Genome sequencing of a laboratory strain of T. vaginalis was completed in April 2005 by The Institute for Genomic Research. The current version 1.0 of the genome annotation was generated by a completely automated process that yielded a set of nearly 60,000 putative gene models, most consisting of single exons. In a majority of cases, existing molecular biology and biochemistry evidence, sequence homology, or enzymatic domain homology matches were not correctly incorporated into these annotations, resulting in more than 80% of predicted genes receiving the annotation "conserved hypothetical protein." Based on comparison to the genomes of other divergent, unicellular parasites, a high percentage of hypothetical proteins with insignificant similarity to proteins present in other organisms is expected; however, 80% is unusually high. To begin the process of validating the current annotation, peptide data from several proteomics investigations of T. vaginalis were modeled against the current genome annotation as sources of gene expression evidence. The results presented in this dissertation include strong predictions for instances where sequencing or assembly errors have contributed to annotation errors: incorrectly predicted start codons leading to exon boundary errors, and frameshift errors resulting in the annotation of single genes as two truncated genes in close proximity.
All data will become public through incorporation into the current genome website, http://trichdb.org, to facilitate continued analysis by the full T. vaginalis research community.
Keywords/Search Tags: Vaginalis, Genome, Annotation, Current