A Long Non-coding RNA Signature In Triple-negative Breast Cancer Predicts Survival

Posted on: 2017-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: C C Peng
Full Text: PDF
GTID: 2404330488483898
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Background and purpose: Breast cancer is a leading cause of cancer death among women worldwide. In China, the incidence of breast cancer is increasing at an annual rate of 3%. It is not only a threat to women’s health and life but also a serious problem for society. Over the past decades, despite advances in biotechnology, progress in understanding the mechanisms of breast cancer has remained slow. Breast cancer is a heterogeneous disease involving multiple molecular alterations. Because its biological features and prognosis differ from case to case, clinical outcomes are hard to predict and treatment is imperfectly adapted, even when clinical manifestations are similar. Histological classification and risk stratification of breast cancer have traditionally been based on clinicopathologic features. Based on gene expression analysis, breast cancer has been divided into four subtypes: luminal (luminal A and B), basal-like, ERBB2-overexpressing and normal breast-like. Guided by this classification, suitable treatments, including endocrine therapy and human epidermal growth factor receptor-2 (HER2) targeted therapy, have been developed and have improved survival to some extent. Triple-negative breast cancer (TNBC), which tests negative for estrogen receptor (ER), progesterone receptor (PR) and HER2, is one of the most lethal types of breast cancer. Compared with other types of breast cancer, TNBC is more aggressive and more prone to relapse, with a worse prognosis and higher mortality. Because their tumors lack ER, PR and HER2, TNBC patients cannot benefit from endocrine therapy or HER2-targeted therapy. No treatment guidelines specific to TNBC have been established yet, and patients still receive conventional breast cancer therapy. Although patients with TNBC are prone to side effects and poor prognosis after chemotherapy, chemotherapy remains the main systemic treatment. Owing to the high heterogeneity of TNBC, it is difficult to distinguish the
patients who respond to a given chemotherapy, and there is no reliable screening biomarker. Therefore, to improve patient outcomes, there is an urgent need to find potential molecular markers for the diagnosis of TNBC and therapeutic targets for its treatment. With the development of high-throughput technology, some genetic markers have been proposed as predictors of prognosis in breast cancer. These genetic markers are more sensitive and more specific than traditional clinicopathologic indexes. Nevertheless, they are not suitable for all patients, and only a few of them, such as MammaPrint and the genomic grade index (GGI), can predict outcomes in patients with TNBC. Even so, there are still limitations in the clinical application of these genetic markers. Although long non-coding RNAs (lncRNAs) do not encode proteins, they are functional RNA molecules. There are approximately 410,000 lncRNAs, accounting for about 80% to 90% of all ncRNAs. However, fewer than 1% of lncRNAs currently have a known function. Many lncRNAs have been confirmed to be closely associated with the occurrence and development of various diseases, especially cancer. lncRNAs can act as oncogenes or tumor suppressors and can also regulate gene expression at the epigenetic, transcriptional or post-transcriptional level. An increasing number of studies have found that lncRNAs are dysregulated in many cancers. In most cases, these aberrantly expressed lncRNAs are involved in a variety of malignant biological processes, including carcinogenesis, cell proliferation, apoptosis, migration, invasion and autophagy, which are correlated with cancer development. lncRNAs are therefore an important class of candidate biomarkers for diagnosis, treatment, pathological classification and risk assessment, and may be helpful in clinical practice. Meanwhile, with the extensive application of microarray technology, the volume of microarray expression data in online public databases is growing rapidly, creating the
conditions for mining and analyzing large amounts of data. With microarray data, we can not only detect lncRNA expression but also discover new prognosis-related lncRNAs by re-annotating the probes on the gene chip. In this study, we aimed to profile lncRNA expression signatures by analyzing two cohorts of previously published TNBC gene expression profiles from the Gene Expression Omnibus (GEO). We identified a six-lncRNA signature associated with survival and then established a risk score formula based on the expression of these six lncRNAs. The prognostic value of the signature was further confirmed in the testing cohort. Our findings suggest that lncRNA signatures can predict clinical outcome and may be useful as biomarkers.

Methods: The first part: Preprocessing of TNBC gene expression data. The TNBC gene expression data and corresponding clinical data used in this study were obtained from the publicly available GEO database under accession numbers GSE58812 and GSE12276. After samples without clinical survival information were filtered out, a total of 178 samples remained, 107 from GSE58812 and 71 from GSE12276. The raw CEL files were downloaded from the GEO database and preprocessed using the Robust Multichip Average (RMA) method. A customized R script was used to calculate microarray expression values according to the re-mapping data (file ncrnamapperhgu133plus2cdf3.0). The second part: Identification and validation of lncRNA genes for survival prediction. The 178 TNBC patients were randomly assigned in R to a training set (n=124) or a testing set (n=54). The training set was used to detect prognostic lncRNA genes. The association between lncRNA gene expression and patients’ overall survival was assessed by univariable Cox regression analysis along with a permutation test using PAM (Prediction Analysis of Microarrays) 2.23. First, a Cox survival score was calculated for each lncRNA, and then the optimal score threshold
was evaluated. All of the identified lncRNAs were verified online in the ncRNA Expression Database (nred.matticklab.com). Using the optimal score threshold, a supervised principal component predictor was built and applied to the testing set to verify the prognostic performance of the model. The Kaplan-Meier method was used to estimate survival for the training set and the testing set. Differences in survival between the low-risk and high-risk groups in each set were then compared using the two-sided log-rank test. The third part: Evaluation of the performance of the lncRNA risk score. To test whether the lncRNA risk score was independent of patient age, tumor size and histological grade, univariable and multivariable Cox regression analyses and data stratification analysis were performed on the available data. We used receiver operating characteristic (ROC) curves to compare the sensitivity and specificity of survival prediction by the lncRNA risk score. Area under the curve (AUC) values were calculated from the ROC curves. P values less than 0.05 were considered significant.

Results: The first part: TNBC data sets and corresponding clinical data were downloaded from the publicly available GEO database. After removal of the samples without survival status, a total of 178 patients were analyzed: 107 from GSE58812 and 71 from GSE12276. The second part: The 178 TNBC patients were assigned to a training set (n=124) or a testing set (n=54). The training set was used to detect prognostic lncRNAs. By subjecting the lncRNA expression data of the training set to univariable Cox proportional hazards regression analysis using PAM 2.23, we identified a set of six lncRNAs that were significantly correlated with patients’ overall survival (OS) (P<0.0001). Of these, one gene (AK091525) had a positive coefficient, indicating that its higher expression was associated with shorter survival. The negative coefficients for the remaining five
genes (AK126909, AF086008, BC013266, AK023400 and BC042889) indicated that their higher expression was associated with longer survival. We built a supervised principal component prediction model for OS based on the expression of these six lncRNAs. We then calculated the six-lncRNA signature risk score for each patient in the training set and the testing set separately and ranked the patients by risk score. On this basis, patients were divided into a high-risk group and a low-risk group. Patients in the high-risk group had a significantly lower OS rate than those in the low-risk group (log-rank test P<0.0001). The third part: We tested whether the prognostic value of the six-lncRNA signature was independent of age, tumor size and histological grade. To this end, we first performed univariable and multivariable Cox regression analyses that included the lncRNA risk score, age and other clinical characteristics such as tumor size and histological grade (when available) as covariables. The results showed that the six-lncRNA risk score remained significantly associated with OS after adjustment for age and other variables in every cohort. Data stratification analysis was then performed, in which the patients from the two GEO cohorts (GSE58812 and GSE12276; N=178) were stratified into a tumor size I stratum (>2.0 cm) or a tumor size Ⅱ stratum (<2.0 cm). This analysis showed that within each tumor size stratum, the six-lncRNA risk score could further subdivide the patients into those likely to have longer survival and those likely to have shorter survival. We also performed receiver operating characteristic (ROC) analysis to compare the sensitivity and specificity of survival prediction among the six-lncRNA risk score, age, tumor size and histological grade in these patients. The AUROC of the six-lncRNA risk score was 0.879, which was significantly larger than that of age (AUROC=0.562, P>0.05) and histological grade (AUROC=0.525, P>0.05); the AUROC of tumor size was likewise lower than that of the six-lncRNA risk score (0.721
versus 0.879, P=0.001). These results indicate that the six-lncRNA signature may have better predictive ability than age, tumor size and histological grade.

Conclusion: A set of six lncRNA genes (AK126909, AF086008, AK091525, BC013266, AK023400 and BC042889) was identified using bioinformatics tools such as R and PAM. Using a risk score based on the expression signature of these lncRNAs, we separated patients into low-risk and high-risk groups with significantly different survival times in the training set and validated this separation in the testing set. These findings indicate that lncRNAs may be implicated in TNBC pathogenesis. The six-lncRNA signature may have clinical implications for the selection of high-risk patients for adjuvant therapy.
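
The risk-score step described in the abstract can be illustrated with a minimal Python sketch. It is not the model used in the thesis, which was a supervised principal component predictor built with PAM in R. It only shows how a weighted sum of the six lncRNA expression values could be turned into high-risk and low-risk groups and an AUC. The coefficient magnitudes, the median cutoff, the toy expression values and all function names are assumptions made for illustration; only the gene names and the signs of the coefficients come from the abstract. The AUC here treats death as a binary label and ignores censoring, which is a simplification.

```python
from statistics import median

# Hypothetical coefficients: only the signs follow the abstract
# (AK091525 positive, the other five negative); magnitudes are invented.
COEFFICIENTS = {
    "AK091525": 0.5,
    "AK126909": -0.25,
    "AF086008": -0.25,
    "BC013266": -0.5,
    "AK023400": -0.25,
    "BC042889": -0.5,
}


def risk_score(expression):
    """Weighted sum of the six lncRNA expression values for one patient."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_cutoff(scores, cutoff=None):
    """Label each score 'high' or 'low'; the median cutoff is an assumption."""
    if cutoff is None:
        cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def auc(scores, events):
    """Probability that a patient with an event (1) has a higher score
    than a patient without one (0); ties count as 0.5."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("both event classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def toy_patient(ak091525, others):
    """Build an invented expression profile: one value for AK091525,
    one shared value for the other five lncRNAs."""
    profile = {gene: others for gene in COEFFICIENTS}
    profile["AK091525"] = ak091525
    return profile


# Invented toy cohort of four patients (not data from GSE58812/GSE12276).
toy_cohort = [toy_patient(4, 1), toy_patient(1, 2),
              toy_patient(3, 1), toy_patient(1, 3)]
toy_events = [1, 1, 0, 0]  # 1 = died, 0 = alive (censoring ignored)

toy_scores = [risk_score(p) for p in toy_cohort]
toy_groups = split_by_cutoff(toy_scores)
toy_auc = auc(toy_scores, toy_events)

if __name__ == "__main__":
    for score, group in zip(toy_scores, toy_groups):
        print(f"risk score {score:+.2f} -> {group}-risk")
    print(f"AUC on toy data: {toy_auc:.2f}")
```

In a real analysis the coefficients would be estimated on the training set only, and the same coefficients and cutoff would then be applied unchanged to the testing set, as the abstract describes.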
Keywords/Search Tags: Triple-negative breast cancer, lncRNA, Prognosis, Forecasting model, Biomarkers