
Study and Development of New Methods for Mass Spectrometry Data Analysis in Proteomics

Posted on: 2012-11-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Mo
Full Text: PDF
GTID: 1110330371469225
Subject: Bioinformatics
Abstract/Summary:
Mass spectrometry (MS)-based proteomics is an automated, high-throughput, and accurate approach for identifying and quantifying proteins in highly complex samples, and it has been widely applied in protein research in recent years. Large-scale mass spectrometry data must be interpreted with bioinformatics methods before proteins can be identified and quantified, so the study and development of such methods plays a key role in MS-based proteomics. New methods for protein identification yield more comprehensive information, including post-translational modifications, protein products of mutations and variants, novel alternative splicing isoforms, and novel proteins. New algorithms also make the results more reliable. For quantitative proteomics datasets in particular, new methods give more accurate ratios of protein abundance between two or more physiological states of a biological system. Addressing these two main aspects of proteomics data analysis, this thesis proposes and develops two new methods that improve protein identification and quantification.

The first project: to identify novel alternative splicing events at the protein level, we designed and built a database of putative exon-exon junction translations from the human genome annotation in the Ensembl core database. The database represents all novel exon-exon junctions with compatible phases; each junction nucleotide sequence was translated into amino acids according to its reading frame and stored in a database file.
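The junction translation step can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the `flank` window, and the input format are hypothetical, and it assumes that an exon's end phase is the number of bases of an incomplete codon at its 3' end, and that two exons are compatible when the upstream end phase equals the downstream start phase.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(nucleotides):
    """Translate an in-frame DNA string; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(nucleotides[i:i + 3], "X")
                   for i in range(0, len(nucleotides) - 2, 3))


def junction_peptide(up_seq, up_end_phase, down_seq, down_phase, flank=30):
    """Translate the sequence spanning a putative exon-exon junction.

    Returns None when the phases are incompatible (assumed rule: the
    upstream end phase must equal the downstream start phase).
    """
    if up_end_phase != down_phase or up_end_phase not in (0, 1, 2):
        return None
    # Keep an upstream tail whose length is congruent to the end phase
    # modulo 3, so that translation starts on a codon boundary.
    n_up = min(flank, len(up_seq))
    n_up -= (n_up - up_end_phase) % 3
    if n_up < 0:
        return None
    # Keep a downstream head that completes the last codon.
    n_down = min(flank, len(down_seq))
    n_down -= (n_up + n_down) % 3
    if n_down < 0:
        return None
    return translate(up_seq[len(up_seq) - n_up:] + down_seq[:n_down])
```

For example, `junction_peptide("ATGGCCA", 1, "AGTTTGGG", 1)` joins the two exons in frame, while a downstream phase of 2 is rejected as incompatible. A real pipeline would additionally need to handle strand, stop codons, and the enumeration of exon pairs per gene, none of which is specified in the abstract.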
By using X!Tandem and SEQUEST to search a batch of liver tissue culture secretome proteomics MS/MS datasets against our database, we identified a total of 488 non-redundant exon-exon junction peptides that represent novel exon-exon combinations. These peptides reside in 395 genes and provide evidence of novel alternative splicing isoforms. Compared with other methods, this approach considers a more comprehensive set of possible exon-exon combinations and identifies alternative splicing isoforms more efficiently, because it searches against exon-exon junction sequences only.

The second project: for MS data analysis in stable-isotope-labeled quantitative proteomics experiments, we proposed a wavelet-based de-noising algorithm. For this algorithm, we developed a new threshold filter function and a spatially adaptive algorithm to distinguish noise and optimize the signal more precisely, and we integrated it into the Trans-Proteomic Pipeline (TPP) as a new quantification software tool, WaveletQuant. Using a batch of quantitative proteomics datasets from ICAT-labeled yeast extracts mixed at known ratios, we showed that WaveletQuant distinguishes noise effectively in peptide single-ion chromatograms, selects true signal peak regions accurately, and calculates peptide abundance precisely.

Overall, in these two projects we obtained new results by developing two new methods and applying them to the re-analysis of previous proteomics MS datasets. This strongly indicates that research on bioinformatics methods can effectively promote proteomics research.
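The general idea of wavelet de-noising in the second project can be sketched as follows. This is not WaveletQuant's algorithm: the abstract does not specify its threshold filter function or its spatially adaptive scheme, so the sketch substitutes the textbook combination of a Haar transform, a soft threshold, and the universal threshold with a median-based noise estimate. All function names are hypothetical, and the input length is assumed to be a power of two.

```python
import math
import statistics

_SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Full Haar decomposition; returns (approximation, details finest-first)."""
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two (>= 2)")
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / _SQRT2 for a, b in pairs])
        approx = [(a + b) / _SQRT2 for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding from the coarsest level outwards."""
    approx = list(approx)
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(approx, level):
            rebuilt.extend([(a + d) / _SQRT2, (a - d) / _SQRT2])
        approx = rebuilt
    return approx


def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def denoise(signal):
    """De-noise an intensity trace with the universal soft threshold."""
    approx, details = haar_forward(signal)
    # Noise level estimated from the median absolute finest-scale detail.
    sigma = statistics.median(abs(d) for d in details[0]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(signal)))
    shrunk = [[soft_threshold(d, t) for d in level] for level in details]
    return haar_inverse(approx, shrunk)
```

Calling `denoise` on a chromatogram-like list of intensities returns a list of the same length in which small, noise-scale fluctuations are suppressed. A single global threshold is used here for simplicity; a spatially adaptive method such as the one described in the abstract would vary the threshold across the trace.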
Keywords/Search Tags: proteomics, MS, alternative splicing, quantitative proteomics, wavelet de-noising