Abstract
Here, comprehensive analysis was performed on the molecular and clinical features of colorectal carcinoma harboring chromosome 20q amplification. Tumor and normal DNA from patients with advanced colorectal carcinoma underwent next-generation sequencing via MSK-IMPACT, and a subset of case samples was subjected to high-resolution microarray (Oncoscan). Relationships between genomic copy number and transcript expression were assessed with The Cancer Genome Atlas (TCGA) colorectal carcinoma data. Of the colorectal carcinoma patients sequenced (n = 401) with MSK-IMPACT, 148 (37%) had 20q gain, and 30 (7%) had 20q amplification. In both the MSK-IMPACT and TCGA datasets, BCL2L1 was the most frequently amplified 20q oncogene. However, SRC was the only recognized 20q oncogene with a significant inverse relationship between mRNA upregulation and RAS/RAF mutation (log OR, −0.4 ± 0.2; P = 0.02). In comparison with 20q diploid colorectal carcinoma, 20q gain/amplification was associated with wild-type KRAS (P < 0.001) and BRAF (P = 0.01), microsatellite stability (P < 0.001), distal primary tumors (P < 0.001), and mutant TP53 (P < 0.001), but not stage. On multivariate analysis, longer overall survival from the date of metastasis was observed with chromosome 20q gain (P = 0.02) or amplification (P = 0.04) compared with diploid 20q.
Implications: 20q amplification defines a subset of colorectal cancer patients with better overall survival from the date of metastasis, and further studies are warranted to assess whether the inhibition of 20q oncogenes, such as SRC, may benefit this subset of patients. Mol Cancer Res; 15(6); 708–13. ©2017 AACR.
Introduction
Gains and amplifications of chromosome arm 20q, which harbors multiple potential oncogenes, including BCL2L1, AURKA, and SRC, have been implicated as an important oncogenic event in colorectal carcinoma (1–12). Despite multiple functional studies of various 20q genes, 20q gain in colorectal carcinoma remains underinvestigated in relation to other key molecular features of colorectal carcinoma (2) and clinical features including prognosis and response to treatment. In this study, we aimed to further define the molecular and clinical features of colorectal carcinoma harboring 20q gain.
Materials and Methods
Case selection
After approval by our Institutional Review Board, the molecular results and electronic medical records from all patients with colorectal carcinoma who underwent MSK-IMPACT molecular testing from January 1, 2014, to May 31, 2016, were reviewed. Patients with colorectal carcinoma whose tumors are submitted for MSK-IMPACT testing at our institution generally have advanced (distant metastatic) disease and are being considered for anti-EGFR therapies that require wild-type KRAS, NRAS, and BRAF tumor status. Only one sample per patient was used for analyses in this article. Minimum required tumor purity was 10% for MSK-IMPACT cases.
Molecular testing
MSK-IMPACT is a hybrid capture–based next-generation sequencing assay that interrogates the entire coding region and select noncoding regions of 410 genes to determine somatic mutations, copy number alterations, and structural variants from tumor and matched normal samples (13). We calculated the tumor/normal log2 ratio (lr) for all 20q genes in the panel and then stratified cases into three groups: amplification (lr ≥ 0.95), gain (lr, 0.45–0.95), and diploid (lr < 0.45). Hotspot mutations in KRAS (G12, G13, Q61, K117, A146, K147), NRAS (G12, G13, Q61), or BRAF (V600E) were recorded, along with any coding mutations in all other genes within the panel. Microsatellite instability (MSI) status was assessed via MSIsensor, a program that assesses variations in repeat areas throughout the genome comparing the tumor with the normal (14). Samples with MSIsensor scores greater than 10 are considered MSI-H. This program has been clinically validated for the evaluation of MSI-high (MSI-H) versus microsatellite stable in MSK-IMPACT data. A select group of cases with the highest level of chromosome 20q amplification was assessed by genome-wide SNP microarray, Oncoscan, to allow a higher resolution, allele-specific analysis of the amplified region.
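The log2 ratio and MSIsensor cutoffs described above can be written as a small classifier. This is a minimal sketch rather than the MSK-IMPACT pipeline itself: the function names are hypothetical, and the treatment of the exact boundary values (0.45 and 0.95) and the use of a single per-case log2 ratio are assumptions, because the text does not state how per-gene values are combined into one call.

```python
def classify_20q(log2_ratio: float) -> str:
    """Assign a 20q copy number group from a tumor/normal log2 ratio (lr).

    Cutoffs follow the text: amplification (lr >= 0.95),
    gain (0.45 to 0.95), diploid (lr < 0.45).
    Assumption: the lower bound of each range is inclusive.
    """
    if log2_ratio >= 0.95:
        return "amplification"
    if log2_ratio >= 0.45:
        return "gain"
    return "diploid"


def is_msi_high(msisensor_score: float) -> bool:
    """MSIsensor scores greater than 10 are considered MSI-H."""
    return msisensor_score > 10


# Illustrative (made-up) values, not patient data.
for lr in (0.10, 0.50, 1.20):
    print(lr, classify_20q(lr))
```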
TCGA data analysis
To further investigate the prevalence and potential drivers of chromosome 20q amplification in colon cancer, we obtained level 3 data available from The Cancer Genome Atlas (TCGA) colorectal adenocarcinoma (COADREAD) cohort (15). Somatic mutations and RNASeqV2-normalized expression data were obtained using the R/Bioconductor package TCGAbiolinks (16). GISTIC2.0 gene and arm level copy number data were obtained from The Broad GDAC Firehose (17, 18). Samples were stratified by chromosome arm 20q absolute copy number (acn) into the following groups: amplified (acn ≥ 4), gain (acn, 2.5–4), diploid (acn, 1.5–2.5), and loss (acn < 1.5). Gene level copy number calls were pulled directly from GISTIC2.0 output. As RNA sequencing (RNA-Seq) control samples are limited in TCGA data, the gene expression values (e) for each sample were normalized by calculating the mean (μ) and standard deviation (σ) of the expression values for samples in which the gene was copy number diploid. Sample level gene expression (z-score) was then calculated as (e − μ)/σ. Sample z-score values of >2 and <−2 were used as thresholds for gene upregulation and downregulation, respectively.
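The diploid-referenced normalization can be illustrated with a short sketch. It treats σ as the standard deviation of the diploid samples, which is what a z-score conventionally uses; the choice of the sample (n − 1) standard deviation, the function names, and the toy expression values are all assumptions for illustration and do not come from the TCGA analysis.

```python
from statistics import mean, stdev


def diploid_referenced_z_scores(expression, is_diploid):
    """Compute (e - mu) / sigma for every sample of one gene.

    mu and sigma are estimated only from samples in which the gene
    is copy number diploid, which stand in for normal controls.
    Assumption: sigma is the sample standard deviation.
    """
    reference = [e for e, d in zip(expression, is_diploid) if d]
    mu = mean(reference)
    sigma = stdev(reference)
    return [(e - mu) / sigma for e in expression]


def call_expression(z, threshold=2.0):
    """Apply the z > 2 / z < -2 cutoffs for up- and downregulation."""
    if z > threshold:
        return "up"
    if z < -threshold:
        return "down"
    return "unchanged"


# Toy data: four diploid samples and one amplified sample.
expression = [10.0, 11.0, 9.0, 10.0, 16.0]
is_diploid = [True, True, True, True, False]
z_scores = diploid_referenced_z_scores(expression, is_diploid)
calls = [call_expression(z) for z in z_scores]
print(list(zip(z_scores, calls)))
```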
Statistical analysis
Associations with clinical and molecular data were assessed by Fisher exact test with multiple hypothesis testing correction (Benjamini–Hochberg, α = 0.05). To assess survival, a Cox proportional hazards model was fitted to the data. Here, the covariates of chromosome arm 20q copy number status, RAS/BRAF mutation, MSI status, age at diagnosis, and pathologic stage were each assessed through both univariate and multivariate Cox regressions. This analysis was repeated using the molecular and clinical data available from TCGA. For all chromosome arm 20q genes, Pearson correlations were calculated between log2 copy number and log2-transformed normalized expected counts (RSEM) from RNA-Seq experiments, also available through the TCGA.
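For readers unfamiliar with the Benjamini–Hochberg correction named above, the step-up procedure can be sketched in a few lines. This is a generic textbook implementation, not the authors' code; the function name is hypothetical and the P values in the example are invented for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return BH-adjusted P values and rejection flags.

    Each sorted P value p_(k) is scaled by m / k, and a running
    minimum from the largest rank downward enforces monotonicity.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


# Invented raw P values for four hypothetical association tests.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted, rejected = benjamini_hochberg(raw, alpha=0.05)
print(adjusted, rejected)
```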
Results
Incidence and clinical characteristics of chromosome 20q gain/amplification
We screened for chromosome arm 20q copy number alterations in 413 prospectively sequenced patient samples (401 patients). Of 401 consecutive patients with advanced colorectal carcinoma who underwent MSK-IMPACT testing, 148 (37%) had 20q gain, and 30 (7%) had 20q arm level amplification. Ten patients had MSK-IMPACT testing on both primary and metastasis, and the concordance for 20q copy number status between paired samples was 100% (2 patients with gain of chromosome 20q and 8 patients with diploid chromosome 20q).
Furthermore, we selected 14 cases with high levels of 20q amplification identified with MSK-IMPACT and analyzed them by Oncoscan to confirm MSK-IMPACT results and to assess whether DNA levels on 20q were preferentially higher at specific cytogenetic loci or genes. Oncoscan analysis revealed that 20q gains and amplifications are broad, without focal changes or discontinuity. In addition, the “major clones” rather than subclonal populations harbored 20q amplification in the cases studied, suggesting that 20q amplification may take place relatively early in carcinogenesis. The gains/amplifications started between 20p11.2 and 20q11.2 and included the entire long arm of chromosome 20 through qter (Supplementary Fig. S1; Supplementary Table S1).
The distribution of right- to left-sided (right-sided:left-sided) colorectal carcinoma was 1:9 (n = 30) in colorectal carcinoma with chromosome 20q amplification, 1:4.5 (n = 148) in colorectal carcinoma with 20q gain, and 1:1.1 (n = 218) in patients with diploid chromosome 20q (Fisher P value < 0.001; Table 1). There was no significant difference in stage distribution at diagnosis of colorectal carcinoma in patients with 20q amplified versus 20q diploid tumors in either the MSK-IMPACT or TCGA data (Supplementary Table S2).
Molecular signature according to chromosome 20q status in MSK-IMPACT dataset
Molecular signature of colorectal carcinoma with 20q gain/amplification
As chromosome 20q amplifications are often implicated as an oncogenic event, we looked for patterns of comutation and mutual exclusivity of 20q alterations with other known driver events. KRAS and BRAF mutations were significantly associated with diploid chromosome 20q (P < 0.0001 and P = 0.01 for KRAS and BRAF, respectively) in the MSK-IMPACT data. Furthermore, the relationships between RAS/RAF mutation and 20q in MSK-IMPACT showed a higher mutual exclusivity with increased copy number of 20q, indicating a dose dependency (5% of colorectal carcinoma with 20q amplification had RAS/RAF mutations, 18% of colorectal carcinoma with 20q gain had RAS/RAF mutations, and 34% of colorectal carcinoma with diploid 20q had RAS/RAF mutation) (Table 1; Fig. 1). Hotspot mutations in KRAS (G12, G13, Q61, K117, A146, K147), NRAS (G12, G13, Q61), or BRAF (V600E) were present in 10% of colorectal carcinoma with 20q amplification, 45% of colorectal carcinoma with 20q gain, and 72% of colorectal carcinoma with diploid 20q (Table 1).
Molecular correlates of 20q amplified or gained MSK-IMPACT colorectal carcinoma. Increased copy number (CN) of chromosome 20q showed a mutually exclusive relationship with KRAS, NRAS, and BRAF mutations, MSI-H status, and a higher incidence of TP53 and APC mutations.
Further analysis of the MSK-IMPACT data revealed relationships with MSI status, APC mutations, and TP53 mutations. MSI-H status was significantly associated with diploid chromosome 20q (P < 0.0001). MSI-H status was present in none of the 30 cases with 20q amplification, 1 (<1%) of 148 cases with 20q gain, and 30 (13%) of the 223 cases with diploid 20q. APC mutations were associated with gain or amplification of chromosome 20q (P = 0.04). APC mutations were present in 152 (68%) of colorectal carcinoma with diploid 20q, 121 (82%) of colorectal carcinoma with gain of 20q, and 23 (77%) of colorectal carcinoma with amplification of 20q. TP53 mutations were significantly associated with 20q gain or amplification (P < 0.0001). TP53 mutations were present in 136 (61%) of colorectal carcinoma with diploid 20q, 124 (84%) of colorectal carcinoma with gain of 20q, and 28 (93%) of colorectal carcinoma with amplification of 20q. These findings are summarized in Table 1.
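As a worked check of the MSI-H association, the counts reported above can be placed in a 2 × 2 table and tested with a two-sided Fisher exact test. This is a sketch under stated assumptions: the text does not specify how the table was laid out, so collapsing gain and amplification into one group (1 MSI-H of 178) against diploid (30 MSI-H of 223) is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, p_value)


# Rows: 20q gain/amplification, 20q diploid. Columns: MSI-H, not MSI-H.
p_msi = fisher_exact_two_sided(1, 177, 30, 193)
print(f"P = {p_msi:.2e}")
```

Under this layout the P value falls well below 0.0001, in line with the significance level reported in the text.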
Overall survival of patients with advanced colorectal carcinoma by 20q copy number status
There were 354 colorectal carcinoma patients with MSK-IMPACT testing with distant metastases and available clinical follow-up for survival analysis. This included 187 patients with diploid 20q, 139 patients with 20q gain, and 28 patients with amplified 20q. Chromosome 20q copy number status, RAS/RAF mutation, MSI status, pathologic stage, and age at diagnosis were included as covariates in a Cox proportional hazards model. On multivariate analysis, chromosome 20q gain [P = 0.015; confidence interval (CI), 0.36–0.90] or amplification (P = 0.039; CI, 0.11–0.94) was associated with longer overall survival (Fig. 2), and RAS/RAF mutation showed a trend toward shorter overall survival that did not reach statistical significance (P = 0.064; CI, 0.97–2.46).
Overall survival from the date of diagnosis with metastasis stratified by chromosome 20q status. Chromosome 20q gain or amplification was associated with longer overall survival (P = 0.004; CI, 0.34–0.82).
Of the 366 colorectal carcinoma patients analyzed by the TCGA, only stage IV disease at diagnosis was associated with worse survival (HR, 4.8; 95% CI, 1.73–12.80) on multivariate analysis. RAS/RAF mutation, MSI status, and 20q status were not significantly associated with survival in TCGA data. However, only 47 patients in the TCGA data had metastatic disease at diagnosis (stage IV), whereas 240 patients with MSK-IMPACT data had metastatic disease at diagnosis (stage IV), reflecting the very different criteria for patient selection in these two datasets.
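The intervals reported for the MSK-IMPACT multivariate model can be related to their P values with a short calculation. The sketch below assumes that each interval is a 95% Wald confidence interval for a hazard ratio, symmetric on the log scale; the text does not state this explicitly, so the recovered hazard ratios are back-calculated estimates, not values reported by the authors. The function name is hypothetical.

```python
from math import erfc, exp, log, sqrt

Z_975 = 1.959964  # 97.5th percentile of the standard normal


def wald_from_ci(lower, upper):
    """Recover HR, SE of log HR, and two-sided P from a 95% CI.

    Assumption: the CI is exp(log_hr +/- Z_975 * se).
    """
    log_hr = (log(lower) + log(upper)) / 2
    se = (log(upper) - log(lower)) / (2 * Z_975)
    z = log_hr / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail area
    return exp(log_hr), se, p_value


reported = {
    "20q gain": (0.36, 0.90),
    "20q amplification": (0.11, 0.94),
    "RAS/RAF mutation": (0.97, 2.46),
}
results = {name: wald_from_ci(lo, hi) for name, (lo, hi) in reported.items()}
for name, (hr, se, p) in results.items():
    print(f"{name}: HR ~ {hr:.2f}, P ~ {p:.3f}")
```

Under these assumptions the back-calculated P values are approximately 0.016, 0.038, and 0.067, close to the reported 0.015, 0.039, and 0.064, which suggests that the intervals are indeed 95% intervals for hazard ratios (roughly 0.57, 0.32, and 1.54, respectively).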
Response to anti-EGFR therapy of patients with 20q gain/amplification
Nineteen patients with metastatic colorectal carcinoma and 20q gain/amplification on MSK-IMPACT were treated with EGFR inhibitors (cetuximab/panitumumab). Of these patients, 11 had adequate follow-up and received EGFR inhibitors either alone or in combinations where response to the EGFR inhibitor could be assessed (e.g., combination with chemotherapy after progression on that chemotherapy; Supplementary Table S3). Review of radiology records suggested that of these 11 patients, 3 patients had decreased disease, 1 patient had unchanged disease, 1 patient had mixed response, and 6 patients had increased disease after cetuximab or panitumumab therapy.
Correlation between TCGA chromosome 20q oncogene DNA copies and mRNA levels
In an effort to identify a potential driver gene within chromosome arm 20q, we integrated the somatic copy number and whole transcriptome RNA-Seq data from TCGA. MSK-IMPACT genes on 20q that showed amplification (in order of frequency) included BCL2L1, ASXL1, SRC, DNMT3B, GNAS, TOP1, AURKA, PTPRT, and NCOA3. A similar pattern of amplification frequency was observed in TCGA data (Supplementary Table S4), with slightly higher percentages for colorectal carcinoma showing amplification due to higher tumor purity in TCGA data (Supplementary Fig. S2).
The correlation coefficient (r) for TCGA data between DNA copy number and mRNA levels of potential oncogenes on chromosome 20q was 0.51 for TPX2, 0.65 for BCL2L1, 0.59 for SRC, 0.53 for AURKA, and 0.42 for GNAS. Our analysis showed a highly statistically significant mutually exclusive relationship between 20q amplification and RAS/RAF mutations, suggesting that this alteration may play a similar driver role in colorectal carcinoma. We therefore examined which 20q amplified genes were most highly expressed in the absence of these well-known driver mutations. This analysis showed that SRC mRNA upregulation had the strongest mutually exclusive relationship with KRAS/NRAS/BRAF hotspot mutations (log OR, −0.4 ± 0.2; P = 0.02), whereas the relationships of mRNA upregulation versus KRAS/NRAS/BRAF for other potential 20q oncogenes did not reach statistical significance. These findings are illustrated in Fig. 3.
Correlation of copy number and mRNA expression of potential chromosome arm 20q oncogenes in the TCGA COADREAD cohort, and log OR between mRNA upregulation and KRAS/NRAS/BRAF mutation. SRC mRNA upregulation was significantly associated with wild-type RAS/RAF (P = 0.02; CI, −0.36 to ±0.24).
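A log odds ratio with its standard error, the form in which the SRC result is reported, can be computed from a 2 × 2 table of upregulation versus mutation status. The sketch below uses the Woolf (logit) standard error and the natural logarithm; both are assumptions, because the text does not say which estimator or log base was used. The counts are invented for illustration and the function name is hypothetical.

```python
from math import exp, log, sqrt


def log_odds_ratio(a, b, c, d):
    """Log OR and Woolf standard error for the table [[a, b], [c, d]].

    a: upregulated and RAS/RAF mutant     b: upregulated and wild-type
    c: not upregulated and mutant         d: not upregulated and wild-type
    A negative log OR means upregulation and mutation tend not to co-occur.
    """
    estimate = log((a * d) / (b * c))
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, standard_error


# Invented counts, not TCGA data.
estimate, standard_error = log_odds_ratio(10, 20, 30, 40)
print(f"log OR = {estimate:.2f} +/- {standard_error:.2f}")
print(f"OR = {exp(estimate):.2f}")
```

If the reported value of −0.4 is a natural log, it corresponds to an odds ratio of about 0.67.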
Discussion
In this study, we show that gain or amplification of chromosome 20q11-13.3 occurs in approximately 37% and 7% of advanced colorectal carcinoma, respectively, and that chromosome 20q gain/amplification is associated with distinct molecular and clinical findings: a 20q “dose-dependent” RAS/RAF wild-type status, left-sided primary tumors, microsatellite stability (MSS), a higher incidence of mutations in TP53 and APC, and longer overall survival in patients with metastatic disease. These findings support the role of chromosome 20q amplification as a driver in colorectal carcinoma.
Previous studies have observed that chromosome 20q is recurrently amplified in colorectal carcinoma (1–12). One study of 133 colorectal carcinoma patients identified a statistically significant inverse correlation between “PI3K” pathway mutations, which included PIK3CA as well as KRAS and BRAF, versus 20q amplification (2). Some studies have reported that 20q amplification is consistent between primary and metastatic samples from the same patient (4, 10) and occurs as an early event in invasive colorectal carcinoma (3). Other studies have found that 20q amplification is more common in colorectal carcinoma patients with liver metastases (2, 11), lung metastases (4), or within metastatic foci than in the primary tumor (12). Chromosome 20q gain/amplification is highly concordant between primary and metastasis in both our data and the studies cited above (4, 10), and its positive selection in primary and metastatic samples argues for its role in the development of metastases to different organs.
We found complete concordance for 20q DNA copy number in cases where both the primary and metastasis were analyzed. In addition, allele-specific copy number analysis using the Oncoscan SNP array platform showed that 20q amplification occurs as a clonal rather than a subclonal event, much like hotspot driver mutations in KRAS, NRAS, or BRAF, and the proportion of cases with 20q amplification did not differ across stages at diagnosis. Furthermore, a higher degree of mutual exclusivity between RAS/RAF mutation and 20q level was identified (i.e., the higher the level of 20q amplification, the less likely a tumor was to have a KRAS, NRAS, or BRAF mutation). These data suggest that 20q amplification occurs early rather than late in colorectal carcinoma as a “dose-dependent” driving event.
A “driver” mutation has been defined as a mutation that gives a growth advantage to the cell(s) harboring it and is selected for as a cancer evolves (19). In the broad sense of the term, an individual cancer may have 5 to 20 “driver” events, representing functionally significant mutations. These include both tumor suppressor gene mutations, such as TP53 and APC, as seen in the majority of colorectal carcinoma, as well as oncogene events that cooccur with tumor suppressor alterations. Strong mitogenic drivers (usually RTK/RAS/MAPK pathway) are usually mutually exclusive, and 20q amplification follows this pattern. The facts that 20q is amplified with high concordance between primary and metastasis and that it shows a dose-dependent inverse relationship with RAS/RAF mutations suggest that amplification of certain gene(s) on 20q is important for the process of metastasis and may have growth advantages that overlap with those provided by RAS/RAF mutations.
Although chromosome 20q gain/amplification is an important recurrent alteration in colorectal carcinoma, attempts to identify the “driving” oncogene(s) within the amplified segment by standard cancer genomic approaches, such as definition of a minimal common region of amplification and correlations of copy number and expression, have been unsuccessful (1). Both in our data and published studies, the amplified segment is consistently broad, and amplification does not correlate well with RNA upregulation for many genes on 20q (20, 21), with r values ranging from 0.42 to 0.65 for DNA amplification versus RNA upregulation in our analysis of TCGA data for common oncogenes (Fig. 3). We therefore used a different analytic approach, reasoning that, if 20q amplification is functioning as a colorectal carcinoma driver similar to RAS/RAF mutations and other kinase alterations recently described in colorectal carcinoma (22), then the 20q amplified genes with the strongest inverse expression correlation with RAS/RAF status may be the most likely drivers. We examined how 20q amplified genes were expressed in relation to the presence or absence of RAS/RAF mutation and found that SRC had the strongest, and only statistically significant, mutually exclusive relationship between mRNA upregulation and RAS/RAF mutation in the TCGA data. This suggests that SRC amplification may serve as a mechanism for SRC upregulation and that SRC may be an important driving oncogene in the subset of colorectal carcinoma with 20q amplification. SRC encodes a nonreceptor protein kinase that has been linked to cancer progression and metastatic disease. It is downstream of EGFR, yet upstream of both the PI3K and MAPK pathways.
Several drugs targeting SRC are available, including dasatinib (23, 24), and functional studies of colorectal carcinoma cell lines or patient-derived xenografts with 20q amplification are needed to assess whether the group of patients in this study may benefit from anti-SRC therapy.
Other recognized oncogenes on chromosome 20q include AURKA and TPX2. Overexpression of AURKA, located at 20q13.13, has been observed in some colorectal carcinoma (25). It has also been shown that TPX2, located at 20q11.21 near BCL2L1, interacts with AURKA (19). It has been hypothesized that these genes act on the same pathway of tumor progression and may have a gene dosage effect in which amplification of both genes has a larger effect than amplification of just one or the other. AURKA and TPX2 have been shown to induce anchorage-independent growth (an in vitro measure of metastatic potential) and to coregulate MYC; specifically, AURKA and TPX2 overexpression stabilizes and induces MYC (26).
The limitations of this study include the fact that the effect of 20q amplification on the response to specific chemotherapies could not be assessed because of pretreatment and the multiple concomitant lines of chemotherapy given. Only 11 patients who received anti-EGFR therapy had clinical courses that could be analyzed. Separately, we did not have expression data for MSK-IMPACT cases, as material is often limited for biopsy specimens, so DNA copy number versus RNA transcription levels could not be compared. In addition, tumor purity requirements were lower for MSK-IMPACT (≥10%) than for TCGA (≥50%) cases. This difference in tumor purity would result in an underestimate of 20q gain/amplification in the MSK-IMPACT data due to dilution with the surrounding nontumor DNA.
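The dilution effect can be made concrete with the standard mixture model for an observed tumor/normal log2 ratio. This is a simplified sketch: it assumes that admixed nontumor cells carry two copies of 20q and that ratios are normalized to a diploid baseline, ignoring overall tumor ploidy. The function name is hypothetical, and the cutoffs of 0.45 and 0.95 are the MSK-IMPACT thresholds given in the Methods.

```python
from math import log2


def expected_log2_ratio(tumor_copies, purity, normal_copies=2.0):
    """Expected log2 ratio for a segment in a tumor/normal admixture.

    The measured signal is a purity-weighted average of the tumor copy
    number and the copy number of contaminating nontumor cells.
    """
    mixed = purity * tumor_copies + (1 - purity) * normal_copies
    return log2(mixed / normal_copies)


# A tumor carrying four copies of 20q at different purities.
for purity in (1.0, 0.5, 0.1):
    lr = expected_log2_ratio(4, purity)
    print(f"purity {purity:.0%}: log2 ratio {lr:.2f}")
```

Under these assumptions, four copies of 20q give a log2 ratio of about 0.58 at 50% purity, which clears the 0.45 cutoff for gain but not the 0.95 cutoff for amplification, and only about 0.14 at 10% purity, which would be called diploid.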
In summary, we show that chromosome 20q gain/amplification drives and defines a subset of colorectal carcinoma with distinct clinical and molecular findings in a dosage-dependent fashion. Oncogenes on chromosome 20q, such as SRC and AURKA, may serve as potential targetable alterations with available selective inhibitors.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: C. Pagan, D.S. Klimstra, L. Saltz, M. Ladanyi, A. Zehir, J.F. Hechtman
Development of methodology: R.N. Ptashkin, C. Pagan, L. Wang, A. Zehir, J.F. Hechtman
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C. Pagan, R. Yaeger, J. Shia, L. Wang, R. Cimera, J. Wang, L. Saltz, A. Zehir, J.F. Hechtman
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): R.N. Ptashkin, C. Pagan, R. Yaeger, S. Middha, M.F. Berger, L. Wang, R. Cimera, A. Zehir, J.F. Hechtman
Writing, review, and/or revision of the manuscript: R.N. Ptashkin, C. Pagan, R. Yaeger, S. Middha, J. Shia, K.P. O'Rourke, L. Wang, D.S. Klimstra, L. Saltz, M. Ladanyi, A. Zehir, J.F. Hechtman
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): R.N. Ptashkin, C. Pagan, J. Shia, R. Cimera, J. Wang
Grant Support
This study was funded by the NCI under the MSK Cancer Center Support Grant/Core Grant (P30 CA008748).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Footnotes
Note: Supplementary data for this article are available at Molecular Cancer Research Online (http://mcr.aacrjournals.org/).
- Received October 13, 2016.
- Revision received October 13, 2016.
- Accepted January 26, 2017.
- ©2017 American Association for Cancer Research.