A Comprehensive Survey Of Human Proteome By Analyzing Tandem Mass Spectrometry And RNA-sequencing Data

Posted on: 2014-05-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Li
GTID: 1220330482468221
Subject: Biomedicine
Abstract/Summary:
A central goal of genomics research is to characterize the genes in the human genome and the proteins they encode. Here we cumulatively identified 249,688 unique peptides within 156,970 protein isoforms, based on 180 million tandem mass spectrometry (MS/MS) spectra obtained from EBI-PRIDE, NCBI-Peptidome, NIST and BPRC. The spectra were searched using Firmiana, an Integrated Platform for Mass Spectrometry-Based Proteomics Studies Based on the Galaxy Framework, against an integrated protein library that includes 278,101 proteins and isoforms generated from the curated human proteome (Swiss-Prot and RefSeq) and the predicted human proteome (AceView and TrEMBL). These peptides and the associated 17,633,234 peptide spectra have been incorporated into a comprehensive curated database, SHuPPD (SEQC Human Peptide and Protein), as a resource for future mass spectrometry proteomics analyses, such as spectral library searching and the development of targeted proteomics assays. Integrating the MS/MS data with RNA-seq data generated by the Sequencing Quality Control consortium (SEQC), we confirmed that 89.3%, 59.5% and 86.6% of the protein-coding genes from RefSeq, AceView and ENCODE, respectively, were translated. Among these protein-coding genes, 15,745 are novel according to the AceView annotation. Transcripts of the novel proteins are widely expressed in human tissues. In addition, we discovered several uncommon events, such as translation initiated from rare Kozak sequences. The results have also been incorporated into the SHuPP database, which integrates evidence at the cDNA, RNA-Seq and MS/MS levels as a new landscape of the human proteome to further accelerate proteomics research and facilitate gene annotation.
Keywords/Search Tags: Human proteome, Liquid chromatography-mass spectrometry, Whole Transcriptome Shotgun Sequencing, Spectral Library, Novel human protein