Research On Selecting Methods Of Multiple Biomarkers Based On Proteomics Data

Posted on: 2022-04-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Sheng
GTID: 2504306575463154
Subject: Biomedical engineering
Abstract/Summary:
Biomarkers are indicators that play an important role in judging the occurrence, development, and prognosis of diseases. More than half of all disease markers are proteins or peptides. They are used to screen for, diagnose, and monitor diseases before treatment, to guide targeted molecular therapy during treatment, and to evaluate treatment response afterwards. The literature generally reports that a combination of biomarkers predicts clinical and pathological outcomes better than a single biomarker, both for early diagnosis and for clinical treatment. In China, hepatocellular carcinoma is one of the cancers with the highest incidence. Although surgery can be effective, fewer than half of patients survive 5 years after the operation, which makes hepatocellular carcinoma one of the cancers with the poorest prognosis. Effective methods for analyzing and predicting its prognosis, such as those based on biomarker combinations, could therefore substantially improve the outlook for this disease. Strategies and methods for screening protein biomarker combinations have accordingly become an active research topic in the treatment of hepatocellular carcinoma. Research on protein biomarker screening methods began only recently, many technical problems remain to be overcome, and an evaluation and comparison of the numerous available screening methods is urgently needed.

This thesis brings together the existing protein biomarker screening methods, gives a comprehensive overview of the methods commonly used in the screening process, and discusses the statistical and feature screening methods applied to the training set from the Nature article and the independent validation set from the Cell article. Three high-quality, high-throughput hepatocellular carcinoma data sets, including data generated in our laboratory, are used for protein biomarker combination screening. The training set contains proteomic data for 101 patients with clinically early-stage hepatocellular carcinoma, covering 9254 protein features. It is divided into 2 proteomic subtypes according to prognostic characteristics such as the risk of postoperative recurrence or death. After converting data formats, optimizing and adjusting the data processing methods, and evaluating and validating different protein biomarker selection strategies, we found that the traditional random forest method performed best both for training the model and for predicting labels on the validation set.

In this study, differential protein screening, feature selection, and the construction of prediction models were used to select features and form a biomarker combination containing multiple proteins. Machine learning methods were used to predict subtype labels on an independent validation set containing 159 samples and 6478 features, and cluster analysis was used to obtain subtype labels from the biomarker combination. Survival analysis, enrichment analysis, and other statistical and biological validation of the subtyping results on the independent validation set showed that the biological functions enriched in the protein biomarker combination we found are interpretable. We hope that this work can provide further research ideas and methods for finding tumor protein biomarkers.
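The differential protein screening step mentioned above can be illustrated with a small sketch. The abstract does not say which statistical test was used, so Welch's t statistic is assumed here purely for illustration; the function names (`welch_t`, `screen_proteins`), the protein identifiers, and the toy abundance values are all hypothetical and do not come from the thesis data.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    if std_err == 0:
        return 0.0  # identical constant groups: no evidence of a difference
    return (statistics.mean(group_a) - statistics.mean(group_b)) / std_err


def screen_proteins(abundance, labels, top_k=2):
    """Rank proteins by |t| between subtype 1 and subtype 2.

    abundance: dict mapping protein id -> list of log2 intensities per sample
    labels:    list of subtype labels (1 or 2), in the same sample order
    Returns the top_k protein ids and the full dict of t statistics.
    """
    scores = {}
    for protein, values in abundance.items():
        subtype_1 = [v for v, lab in zip(values, labels) if lab == 1]
        subtype_2 = [v for v, lab in zip(values, labels) if lab == 2]
        scores[protein] = welch_t(subtype_1, subtype_2)
    ranked = sorted(scores, key=lambda p: abs(scores[p]), reverse=True)
    return ranked[:top_k], scores


# Hypothetical toy data: 6 samples, 3 proteins (not from the thesis).
labels = [1, 1, 1, 2, 2, 2]
abundance = {
    "P_A": [10.0, 10.2, 9.8, 12.0, 12.1, 11.9],  # higher in subtype 2
    "P_B": [8.0, 8.5, 7.5, 8.1, 7.9, 8.0],       # no clear difference
    "P_C": [5.0, 5.1, 4.9, 4.0, 4.2, 3.8],       # lower in subtype 2
}

top, scores = screen_proteins(abundance, labels, top_k=2)
print(top)
```

On this toy input the two proteins whose means differ between subtypes rank ahead of the unchanged one. With thousands of protein features, as in the data sets described above, a real analysis would also need a multiple-testing correction and would then pass the retained features on to feature selection and model training.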
Keywords/Search Tags: hepatocellular carcinoma, biomarkers, machine learning, survival analysis