| Objective: To build an MS-based, spatially resolved proteome atlas of the gastric mucosa that covers the protein abundance range of seven distinct anatomical sites of the stomach in healthy people, to lay the foundation for studying the biological functions of different regions of the stomach, and to support the analysis of gastric mucosal disease and its rational treatment.
Methods: In situ digestion with sRP methods and a high-throughput MS platform were introduced and developed in our lab. In addition, Firmiana, a one-stop proteomics analysis platform for MS raw data, was built to increase data analysis capacity. With these technologies, we developed a systematic workflow consisting of clinical specimen collection and sorting, screening of enriched proteins, and bioinformatics analysis. According to the anatomical architecture of the stomach, the gastric mucosa was divided into seven regions: cardia (Ca), fundus (Fu), lesser curvature (LC) and greater curvature (GC) of the corpus, angular incisure (AI), antrum (An) and pylorus (Py). We measured 10 specimens per region, 70 specimens in total, for proteomics analysis. The iBAQ (intensity-based absolute quantification) approach was used for absolute protein quantification, and values were then normalized by the total iBAQ value to minimize error from differences in sample loading. The bootstrap method was used to evaluate the variation of protein expression levels in each anatomical region of the gastric mucosa given the small number of samples (n<10). The Shapiro-Wilk test was used to check whether the data came from a normally distributed population.
The protein abundance range of each region was calculated with a method appropriate to the distribution of the data. REPs (region-enriched proteins) were identified by statistical significance (P value < 0.05; Student's t-test for normally distributed data or the Wilcoxon-Mann-Whitney (WMW) test for non-normally distributed data) and fold change (> 2-fold) of the mean or median value of each region against the mean or median of all other regions. Pearson's correlation coefficient was used to estimate the relationship between the REPs of one region and those of each other region.
Results: By introducing and developing the high-throughput MS platform and in situ digestion with sRP methods, we improved tissue proteome coverage from 4000 to 6000 gene products, while the instrument running time per tissue sample was shortened from 3-4 days to 10 hours. With these technologies, we measured 10 specimens per region, 70 specimens in total, for proteomics analysis. A total of 6375 gene products were identified in all seven anatomical regions and 9875 gene products in at least one region, at a peptide and gene product false discovery rate (FDR) of 1%. With label-free quantification by the iBAQ method, the gastric mucosa proteomes spanned 10 orders of magnitude. Transcription factors (TFs) and kinases with low expression levels usually require complicated enrichment procedures to be identified, yet we observed 719 TFs and kinases in this study without any enrichment. For example, UBTF (upstream binding transcription factor, RNA polymerase I) and PIK3R4 (phosphoinositide-3-kinase, regulatory subunit 4) were identified at low abundance, about 1/200 and 1/1000 that of the housekeeping protein GAPDH (glyceraldehyde-3-phosphate dehydrogenase), respectively.
Thus, the gastric mucosa proteomic study with anatomical resolution achieved in-depth coverage and high-throughput proteomic analysis. Compared with the transcriptomics studies of stomach in the EBI-AE database, 65.3% of the gene products in the transcriptome were observed in this research. In addition, 692 novel gene products, which are important supplements to previous studies, were found with protein evidence. The ratio of expressed genes on each chromosome ranged from 30% to 40%, except for the Y chromosome, and showed little relationship to anatomical region. We used the bootstrap method to evaluate the variation of protein expression levels in each anatomical region. We calculated the mean and coefficient of variation (CV) for each identified gene product and looked for overall improvement in the CV and in the accuracy of the mean estimates as the sample size increased in each region. To see the spread of the means and CVs, we plotted the CVs of the mean and CV populations. It was clear that once 8-10 experiments had been performed, no further improvement was seen. In other words, we generated a stable baseline proteomic profile with 10 gastric mucosa biopsies per region, each measured individually. A simple, common task is to compare the expression level of a single protein across the seven regions. We found that GAPDH was highly expressed in all regions, while high levels of GAST (gastrin) were confined to the distal stomach: angular incisure, antrum and pylorus. ERBB2 (Erb-B2 receptor tyrosine kinase 2) showed much lower expression than GAPDH and GAST. We then used the Shapiro-Wilk test to check whether the data of each region came from a normally distributed population. Overall, the variation across the entire stomach may be larger than the variation within any single region.
According to the distribution of the data, we calculated the normal reference range of protein abundance for each region: mean and 95% confidence interval for normally distributed data, and median and extrema (excluding zero values) for non-normally distributed data. REPs were identified based on statistical significance (P value < 0.05), fold change (> 2-fold) and expression frequency (>60% or 70%). Although the stomach is anatomically divided into seven regions, we found that, according to the REPs, it could be divided into two parts at the protein level: the proximal and the distal stomach. These results were verified by GO annotation analysis of the REPs and KEGG pathway analysis of the TFs and kinases. We also found that the distal stomach plays an important role in the immune response.
Conclusion: Here we report an MS-based, spatially resolved proteome atlas of the gastric mucosa that covers the protein abundance range of seven distinct anatomical sites of the stomach in healthy people. The dataset lays the foundation for studying the biological functions of different regions of the stomach and will be of value in the analysis of gastric mucosal disease and its application in precision medicine. |
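
The bootstrap evaluation described in the Methods and Results can be illustrated with a minimal sketch. It is one plausible reading of the procedure, not the authors' implementation: for a single gene product in a single region, specimens are resampled with replacement at increasing sample sizes, and the CV of the resulting mean estimates is tracked to see where it levels off. The function name `bootstrap_cv_of_mean`, the number of resamples, and the abundance values are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_cv_of_mean(values, sample_size, n_resamples=1000, rng=None):
    """Return the CV of bootstrap mean estimates for a given sample size.

    `values` holds the abundances of one gene product across the specimens
    of one region. Each resample draws `sample_size` values with replacement.
    """
    if rng is None:
        rng = random.Random(0)
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(values) for _ in range(sample_size)]
        means.append(statistics.fmean(resample))
    # Spread of the mean estimates relative to their centre
    return statistics.stdev(means) / statistics.fmean(means)


# Hypothetical normalized abundances of one gene product in the
# 10 specimens of one region (illustrative numbers, not study data).
abundances = [1.9e-4, 2.4e-4, 2.1e-4, 3.0e-4, 1.6e-4,
              2.7e-4, 2.2e-4, 2.0e-4, 2.5e-4, 2.3e-4]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
cv_by_size = {
    n: bootstrap_cv_of_mean(abundances, n, rng=rng) for n in range(2, 11)
}

for n, cv in cv_by_size.items():
    print(f"n={n:2d}  CV of mean estimate = {cv:.3f}")
```

In this sketch the CV of the mean estimate shrinks as the sample size grows, and the gain from each added specimen gets smaller. This is the kind of plateau the abstract reports at 8-10 specimens per region.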