
Construction Of Risk Prediction Model And Molecular Subtypes Based On Microarray And High-throughput Sequencing Expression Profile Data Of Acute Myeloid Leukemia

Posted on: 2020-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Yang
GTID: 2404330578979436
Subject: Internal Medicine
Abstract/Summary:
Background and purpose: Acute myeloid leukemia (AML) is a highly heterogeneous hematologic malignancy and the most common type of acute leukemia in adults. As the pathogenesis of AML continues to be explored at the cellular and molecular levels, new treatments and drugs are constantly being developed, which has improved the survival time and quality of life of AML patients. However, the long-term survival rate of AML patients remains low, and some patients relapse at an early stage. An optimal risk stratification method is therefore particularly important for the rational selection of treatment and for prognosis evaluation. With the rapid development of high-throughput sequencing technology in medicine, an increasing number of genetic maps and molecular features of tumors have been delineated, and molecular markers play an increasingly important role in disease screening, diagnosis, monitoring, treatment response and prognosis evaluation. This study therefore aims to construct a prognostic risk model and molecular subtypes based on microarray data and high-throughput expression profiling data of AML patients from multiple public databases.

Methods: Microarray data and high-throughput expression data of AML patients, together with the corresponding clinical information, were collected from the TCGA, GEO and TARGET databases. The batch effect between platforms was eliminated with the ComBat method so that all data were on the same quantitative scale. The GPL570-GSE6891, GPL570-GSE37642, GPL96-GSE37642 and GPL570-GSE124177 datasets in the GEO database were combined into a meta-dataset, which was randomly divided into a meta-training set and a meta-testing set in a ratio of 1:1. The GPL96-GSE12417, TCGA and TARGET datasets were used as independent validation datasets. In the meta-training set, an optimal gene signature was obtained by screening with the log-rank test, univariate Cox regression and LASSO-Cox regression. An AML risk score (AMLRS) model was established as a linear combination of the gene expression values weighted by the corresponding Cox regression coefficients. The optimal cutoff value calculated by the X-tile method was used to divide AML patients into a low-risk group and a high-risk group. Time-dependent ROC analysis and Kaplan-Meier survival analysis were used to evaluate the prognostic power of the model in the meta-training set and to validate it in the meta-testing set and the 3 independent validation sets. Subgroup analysis was performed to further evaluate predictive power, and a nomogram was constructed for clinical application. In addition, AML was divided into 7 molecular subtypes with the PAM (Partitioning Around Medoids) clustering algorithm, and the correlations among AMLRS, the 7 AML subtypes and clinical information were evaluated.

Results: A total of 1707 AML patient samples and 12272 genes from 3 public databases were included. After a series of screening steps, we obtained a signature of 10 survival-related genes: ALDH2, FAM124B, NYNRIN, DNMT3B, DDIT4, SOCS2, ADGRG1, CALCRL, NDST1 and FHL1. Through linear weighting by Cox regression, we constructed the AML risk score model (AMLRS) and used an optimal cutoff value (1.47) to classify AML patients into low-risk and high-risk groups. In the meta-training set, the meta-testing set and the 3 independent validation datasets, the overall survival (OS) of the low-risk group was significantly longer than that of the high-risk group (P<0.001). The area under the ROC curve (AUC) at 1, 3 and 5 years was 0.5854-0.7905, 0.6652-0.8066 and 0.6622-0.8034, respectively. Thus, in both the meta-training set and the validation sets, AMLRS distinguished the low-risk group from the high-risk group and predicted prognosis. We integrated AMLRS with related clinical parameters to construct an AML integrated risk score model (AMLIS) and divided AML patients into low-risk and high-risk groups using a cutoff value of 4.85. Under the AMLIS model, the OS of the low-risk group was still significantly longer than that of the high-risk group (P<0.001), suggesting that AMLIS also had good prognostic performance. We also compared the AUC of cytogenetic risk stratification, AMLRS and AMLIS. Although the differences among the 3 prognostic models were not statistically significant, the AMLIS model appeared to perform better than the AMLRS model and cytogenetic risk stratification in prognostic prediction. In the subgroup analysis, the M3 subtype of the FAB classification and the favorable-karyotype subgroup of cytogenetic risk stratification had longer OS and lower risk scores, while the M0 subtype and the unfavorable-karyotype subgroup had shorter OS and higher risk scores. Cox regression analysis of AMLRS and related clinical prognostic variables showed that AMLRS was an independent prognostic factor. AML patients were divided into 7 molecular subtypes by PAM clustering using 1890 prognostic genes. Subtype 3 had the worst prognosis, while subtype 7 had the best. Subtypes 2, 1 and 5 were closer to subtype 3, with a relatively poor prognosis, while subtypes 4 and 6 were closer to subtype 7, with a relatively good prognosis.

Conclusions: Based on microarray and high-throughput expression data of AML patients in public databases, we constructed a prognostic prediction model containing 10 survival-related genes and a molecular classification with 7 subtypes, which may provide new ideas for the prognosis, treatment and personalized management of AML patients.
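
The linear weighting behind AMLRS can be illustrated with a minimal Python sketch. The 10 gene names and the cutoff of 1.47 come from the abstract; the coefficient values (including their signs), the function names `amlrs` and `risk_group`, and the choice to treat a score strictly above the cutoff as high-risk are assumptions made only for illustration, since the abstract does not report the fitted Cox coefficients or the tie rule.

```python
# Minimal sketch of a linear risk score: score = sum(coefficient * expression).
# The coefficients below are HYPOTHETICAL placeholders (sign and magnitude);
# the abstract names the 10 genes but does not report their Cox weights.
HYPOTHETICAL_COEFS = {
    "ALDH2": -0.1,
    "FAM124B": 0.2,
    "NYNRIN": 0.1,
    "DNMT3B": 0.2,
    "DDIT4": 0.2,
    "SOCS2": 0.1,
    "ADGRG1": 0.1,
    "CALCRL": 0.2,
    "NDST1": -0.1,
    "FHL1": 0.1,
}

CUTOFF = 1.47  # AMLRS cutoff reported in the abstract


def amlrs(expression, coefs=HYPOTHETICAL_COEFS):
    """Return the weighted sum of expression values over the signature genes."""
    missing = sorted(set(coefs) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def risk_group(score, cutoff=CUTOFF):
    """Assign a risk group; 'strictly above the cutoff' is an assumed rule."""
    return "high-risk" if score > cutoff else "low-risk"


if __name__ == "__main__":
    # Toy patient with the same normalized expression value for every gene.
    patient = {gene: 2.0 for gene in HYPOTHETICAL_COEFS}
    score = amlrs(patient)
    print(round(score, 4), risk_group(score))
```

In practice the coefficients would be taken from the fitted LASSO-Cox model, and the expression values would be the batch-corrected, normalized values described in the Methods.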
Keywords/Search Tags:Acute myeloid leukemia, prognosis, risk model, molecular subtype