Development And Application Of A One-stop Platform For Analysis And Visualization Of Quantitative Proteomics Data

Posted on: 2020-04-09  Degree: Master  Type: Thesis
Country: China  Candidate: K K Xu  Full Text: PDF
GTID: 2370330599452391  Subject: Bioinformatics
Abstract/Summary:
Proteomics is one of the most active research fields of the post-genomic era. As instrument accuracy and identification algorithms have improved, the focus of proteomics research has gradually shifted from identification to quantification. Selecting differentially expressed proteins across conditions from quantitative data is a central task in quantitative proteomics and plays an essential role in understanding protein function and life activities. Numerous tools and packages for selecting differentially expressed proteins have been published in recent years, but they are often criticized for complicated installation and updating, poor compatibility with upstream tools, limited functionality, a high barrier to use, and poor display quality. These shortcomings have hindered the promotion and application of proteomic technology, and a comprehensive, easy-to-use data analysis tool is urgently needed. To address these problems, we developed and applied a one-stop proteomics analysis and visualization platform called MyOmics. Details are as follows.

(A) First, we investigated the mathematical principles and applicable conditions of missing value imputation, normalization, statistical analysis, and functional enrichment as used in the selection of differentially expressed proteins. We introduced additional machine learning methods to visualize high-dimensional data. We implemented and integrated the primary methods in the Python and R programming languages. The core data structure in our programs is a multi-indexed data frame. We optimized the missing value imputation algorithm for the extreme case in which a given protein was not quantified in any sample. In addition, we implemented automatic selection of statistical test methods based on the data distribution.

(B) Building on Galaxy, a web-based platform for data-intensive biomedical research, we published the analysis programs online by writing XML configuration files. We then used these tools to construct data analysis workflows for two mainstream omics experimental designs: single-factor designs with two levels and single-factor designs with multiple levels. To overcome inherent limitations of Galaxy, including the lack of user access statistics, the poor display of analysis results, and the lack of support for generating dynamic forms, we designed and built a separate user interface. It provides user registration, usage statistics, online display of multiple chart types, and many other features.

(C) Finally, we applied the analysis methods and data processing workflow implemented in MyOmics to the study “In-depth Serum Proteomics Reveals Biomarkers of Psoriasis Severity and Response to Traditional Chinese Medicine”. We comprehensively analyzed quantitative proteomics data measured by data-independent acquisition mass spectrometry and antibody arrays. After missing value imputation, normalization, statistical testing, correlation analysis, and other steps, we obtained hundreds of differentially expressed serum proteins spanning about ten orders of magnitude in total. We identified PI3 as a potential protein biomarker associated with the diagnosis of psoriasis. This conclusion can be verified by ELISA, which demonstrates the clinical utility of our novel proteomic assays. This application example confirms the practical usability of our platform.

The selection of differentially expressed proteins is a core issue in proteomics research and is critical for solving biological and clinical problems, so comprehensive and easy-to-use analysis tools are essential. Taking this selection as the core issue, we implemented and optimized each function in the analysis process. The research results are presented in the form of a web client. The platform and methods have been widely used in biomedical research.
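The abstract does not state which rule MyOmics applies when a protein is unquantified in every sample, so the following is only a minimal sketch of one plausible approach, not the thesis implementation. It assumes minimum-value imputation with a fallback to the global minimum of the table, followed by log2 median normalization. The function names `impute_min` and `median_normalize`, the dictionary layout, and the example intensities are all hypothetical, and plain Python containers stand in for the multi-indexed data frame.

```python
import math
from statistics import median


def impute_min(table):
    """Fill missing values (None), protein by protein.

    `table` maps a protein id to a list of intensities, one per sample.
    Assumption (not from the thesis): a missing value takes the smallest
    intensity observed for that protein; a protein missing in every sample
    falls back to the smallest intensity observed anywhere in the table.
    """
    observed = [v for row in table.values() for v in row if v is not None]
    if not observed:
        raise ValueError("no quantified values in the table")
    global_min = min(observed)

    filled = {}
    for protein, row in table.items():
        present = [v for v in row if v is not None]
        fill = min(present) if present else global_min
        filled[protein] = [fill if v is None else v for v in row]
    return filled


def median_normalize(table):
    """Log2-transform, then subtract each sample's median log2 intensity."""
    proteins = list(table)
    logged = {p: [math.log2(v) for v in table[p]] for p in proteins}
    n_samples = len(logged[proteins[0]])
    medians = [median(logged[p][j] for p in proteins) for j in range(n_samples)]
    return {p: [x - medians[j] for j, x in enumerate(logged[p])] for p in proteins}


# Hypothetical intensities for three proteins in three samples.
raw = {
    "P1": [1024.0, None, 4096.0],   # one missing value
    "P2": [None, None, None],       # extreme case: never quantified
    "P3": [256.0, 512.0, 128.0],
}
filled = impute_min(raw)
normalized = median_normalize(filled)
```

Here `P2` receives the global minimum (128.0) in every sample, which keeps the protein in the table for downstream testing instead of dropping it. Whether that is appropriate depends on why the values are missing.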
Keywords/Search Tags: Proteomics, Differentially expressed proteins, Statistical analysis, Functional analysis, Visualization