New Methods for In-Depth Analysis of Mass Spectrometry Data and Their Application

Posted on: 2013-02-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Niu
Full Text: PDF
GTID: 1111330374960958
Subject: Drug analysis
Abstract/Summary:
Biological mass spectrometry has become an important supporting technique in proteomics research. Owing to its high throughput, high accuracy, and high sensitivity, liquid chromatography tandem mass spectrometry (LC-MS/MS) plays an increasingly important role in large-scale protein identification. As mass spectrometer scan speeds improve, more and more spectra are generated, which makes identifying proteins in complex mixtures challenging. Automatic data processing, such as database searching, is therefore essential for interpreting spectra in high-throughput proteomics, yet even for high-resolution tandem mass spectrometry the successful identification rate of spectra is below 30%. Many factors contribute to this, including the complexity of the sample, uncertainty during sample preparation, and differences in data acquisition and processing.

To reduce these influences, standards are usually prepared and used as a reference for evaluating the generation and analysis of mass spectrometric data. Synthetic peptides, which have known sequences and simple structures and are less vulnerable to external contaminants, are well suited to constructing such standards. In this study, 30 unique peptides from 15 proteins at about five different abundances in Thermoanaerobacter tengcongensis (TTE) were selected as standards. The physicochemical properties of these peptides are close to those of actual samples, and because they have low homology to yeast proteins, they can also be combined with yeast to construct complex standard samples.
Chromatographic and mass spectrometric characterization showed that the chromatographic purity of each of the 30 synthetic peptides exceeds 99% and that their sequences are correct. Analysis of high-resolution mass spectrometry datasets generated from the mixture of the 30 standard peptides indicated that, although a high-resolution mass spectrometer greatly improves precursor ion mass accuracy, this alone is insufficient for good interpretation of the spectra. The main reasons are the characteristics of the mass spectrometer and its parameter settings. For example, with the combination of dynamic selection of precursor ions and the broad isolation width (usually 2 to 5 Da) applied to them, co-eluting precursor ions with similar mass-to-charge ratios (m/z) may be fragmented simultaneously and produce a chimera spectrum. The number of chimera spectra increases dramatically with sample complexity, and their low identification rate has become a major obstacle to spectral interpretation.

Because identifying chimera spectra is so important to spectral interpretation, this study focused on analyzing the characteristics of chimera spectra and on using those characteristics to identify them. Two main factors contribute to the low identification rate of chimera spectra: the undetermined monoisotopic masses of the co-fragmented precursors, and the influence of unidentified fragment ions that belong to unreported precursor ions.

For the former, we proposed a novel peak intensity ratio-based monoisotopic peak determination algorithm (PIRMD) that rapidly determines the monoisotopic peaks from the MS scans of chimera spectra. Monoisotopic peaks in non-overlapping clusters are detected from the edge features of the isotopic peak intensity ratios. When multiple overlapping clusters are grouped as one cluster, monoisotopic peaks are detected by an advanced estimation of the similarity between the estimated and experimental isotopic distributions, based on the isotopic peak intensity ratios. Results on standard datasets and actual samples demonstrated that PIRMD notably improves the successful identification rate by identifying more chimera spectra; approximately 25% of the identified spectra are chimera spectra.

For the latter, we introduced a chimera identification algorithm based on fragment ion pairs (CHIFP), which exploits accurate precursor masses and the complementarity of the fragment ions associated with each precursor. Results on reference datasets showed that, given accurate precursor masses, about 20% more chimera spectra with weak precursor intensities can be identified correctly after filtering by CHIFP. For the standard peptide samples and actual samples, the spectra filtered by CHIFP showed no significant improvement in identification over PIRMD; on the TTE dataset, however, CHIFP effectively distinguished more fragment ions in chimera spectra, which contributed to the identification of more peptides and proteins.
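Two ideas from the abstract above can be illustrated with a minimal Python sketch. This is not the author's PIRMD or CHIFP implementation, whose details are not given here. The first function checks which candidate precursors fall inside an isolation window and could therefore be co-fragmented. The second uses the standard relation that the m/z values of a singly charged b ion and its complementary y ion sum to the neutral precursor mass plus two proton masses. All function names, parameters, and tolerances are hypothetical, and fragments are assumed to be singly charged.

```python
PROTON = 1.007276  # proton mass in Da


def co_isolated(target_mz, precursor_mzs, isolation_width=2.0):
    """Return the candidate precursor m/z values inside the isolation window.

    Assumption: precursor_mzs holds one monoisotopic m/z per distinct
    co-eluting precursor, and the window is centred on target_mz.
    """
    half = isolation_width / 2.0
    return sorted(mz for mz in precursor_mzs if abs(mz - target_mz) <= half)


def neutral_mass(precursor_mz, charge):
    """Neutral monoisotopic mass of a precursor from its m/z and charge."""
    return (precursor_mz - PROTON) * charge


def complementary_pairs(fragment_mzs, precursor_mz, charge, tol=0.02):
    """Find fragment pairs whose m/z sum matches one precursor.

    Assumption: fragments are singly charged, so a complementary b/y pair
    sums to the neutral precursor mass plus two proton masses.
    """
    target = neutral_mass(precursor_mz, charge) + 2 * PROTON
    peaks = sorted(fragment_mzs)
    pairs = []
    lo, hi = 0, len(peaks) - 1
    # Two-pointer scan over the sorted peak list.
    while lo < hi:
        total = peaks[lo] + peaks[hi]
        if abs(total - target) <= tol:
            pairs.append((peaks[lo], peaks[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1
        else:
            hi -= 1
    return pairs


if __name__ == "__main__":
    candidates = [600.30, 601.10, 603.00, 598.90]
    # More than one precursor in the window suggests a chimera spectrum.
    print(co_isolated(600.30, candidates, isolation_width=2.0))
    fragments = [300.0, 450.5, 500.0, 551.514552, 702.014552]
    print(complementary_pairs(fragments, precursor_mz=501.007276, charge=2))
```

In a chimera spectrum, running `complementary_pairs` once per co-isolated precursor gives a rough way to attribute fragment ions to the precursor they are consistent with. A real tool would also need to handle multiply charged fragments and shared peaks.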
Keywords/Search Tags: Liquid chromatography tandem mass spectrometry, High-resolution mass spectrometry, Standard peptides, Database searching, Interpretation of mass spectra, Dynamic exclusion, Isolation width, Chimera spectra, Monoisotopic peaks, Proteomics