Statistical methods for mass spectrometry proteomics

Posted on:2017-10-27

Degree:Ph.D

Type:Dissertation

University:The University of North Carolina at Chapel Hill

Candidate:O'Brien, Jonathon

GTID:1454390005984961

Subject:Biostatistics

Abstract/Summary:

DNA makes RNA makes proteins is the central dogma of molecular biology. While the measurement of RNA has dominated the landscape of scientific inquiry for many years, the true outcome of interest is often the final protein product. Microarray and RNAseq studies do not tell researchers anything about what happens during and after translation. For this reason, interest in directly measuring the proteome has flourished. Unfortunately, the direct analysis of proteins often creates a complicated inferential situation. When scientists want to see the whole proteome (or at least a large unknown sample of the proteome), mass spectrometry is often the most powerful technology available. Mass spectrometers allow researchers to separate proteins from complex samples and obtain information about the relative abundance of around 10,000 proteins in a given experiment. However, the analysis of mass spectrometry proteomics data involves a complicated statistical inference problem. Inference is made on relative protein abundance by examining protein fragments called peptides. This inference problem is complicated by the two intrinsic statistical difficulties of proteomics, matched pairs and non-ignorable missingness, which combine to create unexpected challenges for statisticians. Here I will discuss the complexities of modeling mass spectrometry proteomics and provide new methods to improve both the accuracy and depth of protein estimation. Beyond point estimation, great interest has developed in the proteomics community regarding the clustering of high-throughput data. Although the strange nature of proteomics data likely causes unique problems for clustering algorithms, we found that work needed to be done regarding the statistical interpretation of clustering before any special cases could be considered. For this reason, we have explored clustering from a statistical framework and used this foundation to establish new measures of clustering performance. These indices allow for the interpretation of a clustering problem in the commonly understood framework of sensitivity and specificity.

Keywords/Search Tags:

Mass spectrometry, Statistical, Proteomics, Clustering, Proteins

Related items

1	Statistical methods for the analysis of mass spectrometry-based proteomics data
2	Novel Isolation And Identification Approaches For N-termini Analysis Of Proteins Using Bio-Mass Spectrometry And Protein Chemistry
3	Functional proteomics analysis by nHPLC-muESI ion trap mass spectrometry
4	Understanding the role of chemical and physical processes as related to the quantification of proteins by ESI-FT-ICR mass spectrometry
5	Methods development in biological mass spectrometry: Application in glycoproteomics
6	Mass spectrometry-based proteomics: Qualitative and quantitative studies
7	Proteomics Of Endometriosis
8	Mass Spectrometry Analysis Of Differentially Expressed Proteins In HPV Associated Cell Lines And Explore The Mechanism Of E7 Induced Re-replication
9	Differentially Expressed Proteins Analysis Based On Mass Spectrometry In Colorectal Cancer
10	Mass Spectrometry Applications for Comparative Proteomics and Peptidomic Discovery