| Globally, breast cancer is the most frequently diagnosed cancer in women, with an estimated 1.67 million new cases in 2012, accounting for 25.1% of all cancers in women. Breast cancer has also been reported to be the leading cause of cancer-related death in women worldwide. Epidemiological studies show that the incidence rate is about threefold higher among women in Europe and North America than among women in Eastern Asia. Although the current incidence of breast cancer is relatively low in China, it has increased at an average rate of 3-5% in the past decades. Clearly, breast cancer is a major public health concern that seriously threatens women’s lives. It is well accepted that breast cancer is a multifactorial disorder caused by both genetic and non-genetic factors. However, the effect of genetic variants is likely to vary with environmental exposures and other non-genetic risk factors. High-penetrance germline mutations in BRCA1 and BRCA2 predict a lifetime breast cancer risk of 60-80%, but they explain only 15-20% of familial breast cancer and 5% of breast cancer overall. Therefore, more effort is needed to identify low-penetrance variants, which are largely linked to sporadic cases. Genetic variation has been shown to be a critical factor in identifying cancer-susceptible individuals. With the completion of the Human Genome Project and advances in genotyping technology, recent genome-wide association studies (GWASs) have identified numerous single-nucleotide polymorphisms (SNPs) associated with breast cancer risk in diverse populations. Among these SNPs, rs2046210, located between coiled-coil domain containing 170 (CCDC170, also called C6orf97) and estrogen receptor 1 (ESR1) at 6q25.1, was first reported to be associated with breast cancer risk in Chinese populations. 
Subsequently, one intronic variant in CCDC170 (rs3757318) and another in ESR1 (rs9383951) were also found to be associated with breast cancer risk. To date, numerous studies have confirmed the associations with breast cancer in this region, especially for rs2046210. The ESR1 gene, which encodes estrogen receptor α, is a strong candidate susceptibility gene for breast cancer in the 6q25.1 region, and studies have implicated it in breast carcinogenesis. Nevertheless, the putative functions of this region are still undefined. Most of the SNPs mentioned above at 6q25.1 map to introns or intergenic regions. In a previous study, a 41-kb block of the 6q25.1 region was systematically analyzed, and significant associations with breast cancer risk were observed for rs1038304, rs6929137, rs2046210 and rs10484919. However, these variants were all located upstream of the ESR1 gene region. Hence, to evaluate the causal variants at 6q25.1 in the development of breast cancer, we screened the potentially functional variants at 6q25.1 within two genes (ESR1 and CCDC170) and assessed their associations with breast cancer risk in a case-control study in Chinese women. We further evaluated the potential biological functions of the SNPs that were found to be associated with breast cancer risk in the present study. In this study, we applied two approaches to select potentially functional SNPs at 6q25.1. First, we focused on those in linkage disequilibrium (LD) with the GWAS-identified SNP rs2046210 in this region and replicated the results in another independent sample. A total of 30 SNPs were in LD with rs2046210 (r² > 0.8); these were further functionally evaluated by SNPinfo [35] and expression quantitative trait loci (eQTL) analyses [36]. 
As a result, rs9383935 in CCDC170 was selected, because it is in strong LD with rs2046210 (r² = 0.86) and was predicted both to affect a potential binding site of miR-27a located in the 3’ untranslated region (3’ UTR) of CCDC170 and to regulate expression of CCDC170 in the eQTL analysis. In the second approach, considering the existence of multiple independent breast cancer susceptibility loci in the 6q25.1 region and the importance of ESR1 in breast cancer development, we also focused on potentially functional SNPs of ESR1 (chr6:152160379-152466099). Potentially functional SNPs located in the coding regions (synonymous, missense and nonsense SNPs) and regulatory regions (promoter, 5’ UTR and 3’ UTR) were selected. The SNPs were further filtered according to LD analysis (r² < 0.8) and a minor allele frequency (MAF) ≥ 0.05 in the Chinese Han population. Finally, we included one SNP of CCDC170 (rs9383935) and five SNPs of ESR1 (rs488133, rs3798577, rs3798758, rs3798757 and rs2228480) in this study. In addition, the well-known SNP at 6q25.1, rs2046210, was also selected. Genomic DNA was isolated from leukocyte pellets of venous blood by proteinase K digestion followed by phenol-chloroform extraction. All of the DNA samples were checked for quality and quantity before genotyping. SNPs were genotyped using the Illumina Infinium BeadChip (Illumina Inc.). The call rate ranged from 97.7% to 97.9% for the six SNPs tested in all subjects. The biological function of the risk variant was further evaluated by laboratory experiments, including luciferase plasmid construction, site-directed mutagenesis, transient transfection, luciferase assays, miRNA transfection and real-time quantitative reverse transcription PCR. Differences in demographic characteristics, selected variables and frequencies of alleles and genotypes between the cases and the controls were analyzed using the Student’s t-test (for continuous variables) and the χ² test (for categorical variables). 
Hardy-Weinberg equilibrium (HWE) for the genotype distribution of each SNP was evaluated using the goodness-of-fit χ² test by comparing the observed genotype frequencies with the expected ones among the controls. Logistic regression analyses were employed to evaluate the associations between SNPs and the risk of breast cancer by estimating the odds ratios (ORs) and their 95% confidence intervals (CIs) with adjustment for potential confounders such as age, age at menarche and menopausal status. The heterogeneity of associations between subgroups was assessed using the χ²-based Q-test. Differences in measurements of luciferase assays and miR-27a-3p transfection experiments between subgroups were examined using the t-test. Differences in the expression levels of CCDC170 and ESR1 among the GG, GA and AA genotypes of rs9383935 were assessed by a nonparametric trend test. All of the statistical analyses were two-sided, with P < 0.05 as the significance level, and were performed with Statistical Analysis System software (9.1.3; SAS Institute, Cary, NC, USA). Breast cancer risk was significantly associated with three SNPs located at 6q25.1: rs9383935 in CCDC170 and rs2228480 and rs3798758 in ESR1, with variant-allele ORs of 1.38 (95% CI: 1.20-1.57, P = 2.21 × 10⁻⁶), 0.84 (95% CI: 0.72-0.98, P = 0.025) and 1.19 (95% CI: 1.04-1.37, P = 0.013), respectively. The functional variant rs9383935 is in high LD with the GWAS-reported top-hit SNP rs2046210, but only rs9383935 showed a strong independent effect in conditional regression analysis. We further evaluated the associations of rs9383935, rs2046210, rs3798758 and rs2228480 with risk of breast cancer by subgroups of age, age at menarche, age at first live birth, menopausal status (premenopausal and natural menopausal) and subtype of breast cancer (ER and PR status). As shown in Table 2, the associations for rs9383935 and rs2046210 were significant in all the subgroups (all P < 0.05). 
Specifically, the association with rs3798758 was significant among younger women, women with an older age at menarche, and premenopausal women (P = 0.005, 0.022 and 0.037, respectively). For rs2228480, a significant association was also observed in women with an older age at menarche and an older age at first live birth (P = 0.005 and 0.048, respectively). Among subtypes of breast cancer, rs3798758 was significantly associated with risk of ER-positive breast cancer (per-allele OR = 1.21, 95% CI: 1.02-1.48, P = 0.030). Meanwhile, the rs2228480 A allele showed a protective effect regardless of ER/PR status. However, no heterogeneity was observed across the strata of any subgroup. The rs9383935 risk allele A showed decreased reporter gene activity in both MCF-7 and BT-474 breast cancer cell lines, which might be due to an altered binding capacity of miR-27a to the 3’ untranslated region (3’ UTR) sequence of CCDC170. Real-time quantitative reverse transcription PCR confirmed the correlation between rs9383935 genotypes and CCDC170 expression levels. Using TCGA datasets, we also found that rs9383935 G>A was associated with overall survival in breast cancer. However, CCDC170 expression did not differ significantly between tumor tissue and adjacent tissue in TCGA datasets. Furthermore, we found that CCDC170 expression differed markedly among subtypes of breast cancer, including Luminal A (ER+/PR+/HER2-), Luminal B, HER2-enriched (ER-/PR-/HER2+) and triple-negative breast cancer (ER-/PR-/HER2-). Overall, the present study evaluated six potentially functional SNPs within the 6q25.1 region and confirmed that rs9383935, rs3798758 and rs2228480 were associated with breast cancer in Chinese women, based on 1064 cases and 1073 controls; we also replicated the association for rs2046210, in accordance with previous reports. 
Specifically, CCDC170 rs9383935 showed a more prominent effect than any of the ESR1 variants or rs2046210, which provides new evidence for the role of the 6q25.1 region in breast cancer susceptibility. |
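
The Hardy-Weinberg equilibrium check described in the statistical analysis (a goodness-of-fit χ² test comparing observed and expected genotype frequencies among controls) can be illustrated with a minimal sketch. This is not the authors' SAS code. The function name `hwe_chi_square` and the genotype counts in the usage lines are hypothetical and chosen only for illustration, and the sketch assumes a biallelic SNP with one degree of freedom.

```python
import math


def hwe_chi_square(n_hom_ref, n_het, n_hom_alt):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Hypothetical helper for illustration: takes the observed genotype
    counts of a biallelic SNP (e.g. GG, GA, AA among controls) and
    returns (chi_square, p_value) with 1 degree of freedom.
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotyped subjects")

    # Allele frequencies estimated from the observed genotype counts.
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic SNP: the HWE test is undefined")

    # Expected genotype counts under HWE: n*p^2, 2*n*p*q, n*q^2.
    observed = (n_hom_ref, n_het, n_hom_alt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of a chi-square distribution with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Illustrative counts only, not data from the study.
chi_square, p_value = hwe_chi_square(30, 40, 30)
print(f"chi-square = {chi_square:.3f}, P = {p_value:.4f}")
```

With the illustrative counts 30/40/30 the expected counts are 25/50/25, giving a χ² of 4.0 and a P value of about 0.046, so such a SNP would deviate from HWE at the 0.05 level.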