
Spectral networks algorithms for de novo interpretation of tandem mass spectra

Posted on: 2008-11-24
Degree: Ph.D.
Type: Dissertation
University: University of California, San Diego
Candidate: Bandeira, Nuno Filipe Cabrita
Full Text: PDF
GTID: 1440390005972860
Subject: Computer Science
Abstract/Summary:
The ongoing success of the proteomics endeavor is the result of a prolific symbiosis between experimental ingenuity and efficient bioinformatics. Despite valuable contributions, however, progress toward a better understanding of protein behavior is still hindered by significant difficulties in the extensive identification of post-translational modifications and in the sequencing of novel proteins such as cancer fusion proteins or antibody chains.

Recently, tandem mass spectrometry (MS/MS) based approaches seemed to be reaching the limit on the amount of information that could be extracted from MS/MS spectra. A closer look, however, reveals a common limiting practice: each spectrum is analyzed in isolation, even though high-throughput mass spectrometry regularly generates many spectra from related peptides.

By capitalizing on this redundancy, we show that, much as protein sequences can be aligned, unidentified MS/MS spectra can also be aligned to identify modified and unmodified variants of the same peptide. Moreover, this alignment procedure can be iterated to accurately group multiple peptide variants. When applied to a set of spectra from cataractous lens proteins from a 93-year-old patient, spectral networks capitalized on the highly correlated peaks in spectra from variants of the same peptide to rediscover the modifications identified by database search methods, and additionally discovered several novel modification events.

Furthermore, combining shotgun proteomics with the alignment of spectra from overlapping peptides led to the development of Shotgun Protein Sequencing. Much as DNA reads are assembled into whole genomic sequences, we show that the assembly of MS/MS spectra enables the highest de novo sequencing accuracy to date while recovering large portions of the target protein sequences. Knowing that novel venom proteins have previously provided essential clues for the design of important drugs, we demonstrate our approach on a mixture of western diamondback rattlesnake venom proteins and recover over 85% of the known protein segments at over 90% sequencing accuracy, while additionally sequencing several putative novel peptides and single-nucleotide polymorphism variants.
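
The idea of aligning spectra from modified and unmodified variants of the same peptide can be illustrated with a small sketch. This is a simplified illustration and not the dissertation's algorithm: it only counts peaks that two spectra share either at the same mass or offset by the difference in parent mass, which is where peaks from a singly modified variant would be expected to fall. The function names, the 0.5 Da tolerance, and the toy peak masses are assumptions made for this example.

```python
from bisect import bisect_left


def has_peak(sorted_masses, target, tol):
    """Return True if some mass lies within tol of target (list must be sorted)."""
    i = bisect_left(sorted_masses, target - tol)
    return i < len(sorted_masses) and sorted_masses[i] <= target + tol


def shifted_match_count(spec_a, spec_b, delta, tol=0.5):
    """Count peaks of spec_a that appear in spec_b unshifted or shifted by delta.

    spec_a, spec_b: lists of peak masses (hypothetical toy input).
    delta: parent mass of spec_b minus parent mass of spec_a.
    """
    b = sorted(spec_b)
    return sum(
        1
        for mass in spec_a
        if has_peak(b, mass, tol) or has_peak(b, mass + delta, tol)
    )


# Toy spectra: the second behaves like the first with a +16 Da modification
# affecting every peak after the second one.
unmodified = [100.0, 200.0, 300.0, 400.0]
modified = [100.0, 200.0, 316.0, 416.0]

exact_only = shifted_match_count(unmodified, modified, delta=0.0)
with_shift = shifted_match_count(unmodified, modified, delta=16.0)
print(exact_only, with_shift)
```

In this toy case, allowing the shift recovers the peaks that an exact comparison misses, which is the kind of redundancy between related spectra that the abstract refers to. A real scoring scheme would also have to handle peak intensities, noise, and the order of matched peaks.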
Keywords/Search Tags:Spectra, Sequencing, Mass, Protein, Novel, Variants