An Integrative Approach Uncovers Biomarkers that Associate with Clinically Relevant Disease Outcomes in Vulvar Carcinoma

Vulvar squamous cell carcinoma (VSCC) is a rare disease that has a high mortality rate (∼40%). However, little is known about its molecular signature. Therefore, an integrated genomics approach, based on comparative genome hybridization (aCGH) and genome-wide expression (GWE) array, was performed to identify driver genes in VSCC. To achieve that, DNA and RNA were extracted from frozen VSCC clinical specimens and examined by aCGH and GWE array, respectively. On the basis of the integration of data using the CONEXIC algorithm, PLXDC2 and GNB3 were validated by RT-qPCR. The expression of these genes was then analyzed by IHC in a large set of formalin-fixed paraffin-embedded specimens. These analyses identified 47 putative drivers, 46 of which were characterized by copy number gains that were concomitant with overexpression and one with a copy number loss and downregulation. Two of these genes, PLXDC2 and GNB3, were selected for further validation: PLXDC2 was downregulated and GNB3 was overexpressed compared with non-neoplastic tissue. By IHC, both proteins were ubiquitously expressed throughout vulvar tissue. High expression of GNB3 and low PLXDC2 immunostaining in the same sample was significantly associated with less lymph node metastasis and greater disease-free survival. On the basis of a robust methodology never used before for VSCC evaluation, two novel prognostic markers in vulvar cancer are identified: one with favorable prognosis (GNB3) and the other with unfavorable prognosis (PLXDC2). Implications: This genomics study reveals markers that associate with prognosis and may provide guidance for better treatment in vulvar cancer. Mol Cancer Res; 14(8); 720–9. ©2016 AACR.


Introduction
Vulvar squamous cell carcinoma (VSCC) is an uncommon disease, constituting 3% to 5% of all malignancies in the female genital tract (1,2). VSCC develops primarily in elderly women, after their 70s; however, the number of cases in younger patients has been climbing, likely due to human papillomavirus (HPV) infection (2).
Notably, certain genomic alterations do not affect RNA or protein expression (17). Thus, another technique with wide coverage, such as genome-wide expression (GWE) array, is used to identify driver alterations.
Only one study (18) has used a multimodal approach to compare copy number alterations (CNA) and GWE in VSCC. The study was based on the frequency of CNAs to further compare with GWE. This method has several limitations: the size of the region might contain a large number of genes, and this approach alone fails to determine the actual influence of CNA on changes in gene expression (19). Micci and colleagues selected 2 genes, based on recurrent losses on 3p and 9p by aCGH (FHIT and PTPRD), and 5 genes (MAL, KRT4, OLFM4, SPRR2G, and S100A7A), based on expression array alterations. However, none of the gene abnormalities was associated with clinical or pathologic variables.
In this study, we examined a carefully selected subset of VSCC by integrating aCGH and GWE data using validated statistical methods to improve our understanding of vulvar cancer. Our integrative analysis identified 47 candidate drivers for VSCC. Two genes (PLXDC2 and GNB3) were validated and demonstrated a high correlation with the prognosis.

Case selection
Seventeen DNA samples of VSCC fresh frozen tissues, with paired RNA (tumoral and adjacent normal tissues), from the Tumor Biobank and 150 formalin-fixed paraffin-embedded (FFPE) samples from the Anatomic Pathology Department, AC Camargo Cancer Center (São Paulo, Brazil), were acquired. DNA and RNA of the frozen tissues were used for aCGH, gene expression, and as well for validation. The amount of quality samples did not allow an independent validation. All samples were obtained from patients who underwent surgery at this institution between 1980 and 2008, and none received neoadjuvant therapy. Furthermore, an experienced pathologist in vulvar diseases carefully reviewed all specimens before the molecular and immunohistochemical analysis.
Clinical data were obtained from the medical records of this institution, and this study was approved by the institutional ethics committee (Protocol # 1623/11). HPV was analyzed using the Linear Array HPV Genotyping Test Kit (Roche Molecular Diagnostics), and the entire technique has been detailed by our group (6).

DNA and RNA extraction
Total RNA was extracted from macrodissected frozen tissue. Briefly, tissue samples were dissected and lysed, and total RNA was extracted using the RNeasy Mini Kit (QIAGEN) and a Precellys per the manufacturer's instructions to increase the yield of the reaction.
The DNA pellets from the tissue lysis step were incubated with Cell Lysis Solution (Gentra Puregene Blood Kit, QIAGEN) for 18 hours in a thermomixer at 55 C. Then, the reactions were centrifuged, and phenol-chloroform-isoamyl alcohol was added. The reactions were centrifuged sequentially with glycogen at 20 mg/mL and 100% and 70% EtOH. The EtOH was removed, and samples were dried and resuspended in distilled water.
After the quality and quantity of the DNA and RNA were measured, samples with high integrity and quality were used for further analysis.

aCGH
A total of 400 ng of tumor DNA and normal commercially available DNA (Promega) was differentially labeled using the Genomic DNA Enzymatic Labeling Kit (Agilent Technologies). The hybridizations were performed using the SurePrint G3 Human CGH Microarray Kit, 8 Â 60K (Agilent Technologies) as per the manufacturer's recommendations. Acquisition of array images and data evaluation were performed as described by Cirilo and colleagues (2013;ref. 20). CNAs were evaluated prior to data integration by NEXUS copy number software version 5.0 (Bio-Discovery). For this analysis, the threshold used was 1 Â 10 À5 with a minimum of 5 consecutive probes altered. Gains were considered when greater than 0.3; high gain (> 0.8); losses ( 0.3), and homozygous loss those lower than À1.2.

GWE
Gene expression profiles were evaluated using the SurePrint G3 Human GE 8 Â 60K Microarray Platform (Agilent Technologies) as per the manufacturer's protocol. Tumor samples were compared with a pool of 11 adjacent normal tissue samples. The slides were scanned on a DNA microarray scanner with SureScan High-Resolution Technology (Agilent Technologies), and the data were extracted using Feature Extraction v 11.0.1.1 (Agilent Technologies), build 37. TMeV (version 4.8) and R (version 2.15) were used to analyze the gene expression data. Raw data from the array scans were normalized by median-centering of the genes for each array and log2-transformed. Also, probes with low reproducibility were removed using a filter.

Integrative analysis
The genomic and transcriptomic data were integrated to identify driver genes and the processes that they influence. These genes were selected using the CONEXIC algorithm, which combines CNAs and gene expression data and constructs regulatory networks, based on the driver genes that appear (19). Also, the algorithm uses a Bayesian function to detect modulator candidates (drivers) among the regions with amplifications and deletions. The ranked score reflects how well a driver candidate predicts the behavior of a module-higher scores increase the likelihood of a gene having an adaptive advantage on the tumor phenotype. The genes that were selected through the integrative analysis were analyzed with regard to functional enrichment using Ingenuity Pathway Analysis (IPA, www.qiagen.com/ingenuity).
Quantitative analysis of DNA copy number (qPCR) and gene expression (RT-qPCR) DNA copy number was measured using two primer pairs that flanked two regions that were covered by aCGH probes for PLXDC2 and GNB3. The primers were designed using OligoTech, version 1.00 (Oligos Etc. Inc.; Therapeutics Inc). Two endogenous genes were used for normalization: HPRT and GAPDH. The PLXDC2 and GNB3 primer sequences and conditions are described in Supplementary File S1.
The reactions were performed in a MicroAmp Optical 96-Well Reaction Plate (Life Technologies) on an Applied Biosystems 7900HT Fast Real-Time PCR System (Applied Biosystems). Each reaction, performed in triplicate, contained 10 mL Power SYBR Green PCR Master Mix (Life Technologies), 9 mL distilled water, and 0.6 mL each of the forward and reverse primers. Each DNA sequence was quantified and normalized to both endogenous genes. The thresholds for defining a fragment as unaltered (0.5-1.50) and involved in losses (<0.50) and gains (>1.50) were established from the relative copy number values that were obtained in the same control sample for the aCGH.
Gene expression was analyzed by RT-qPCR per the MIQE guidelines (21). cDNA was obtained after reverse transcription of total RNA from tumor and adjacent nontumor samples using the High-Capacity cDNA Reverse-Transcription Kit (Applied Biosystems). Assays that covered the probes included in SurePrint G3 Human GE 8 Â 60K Microarray Platform (Agilent Technologies) for PLXDC2 and GNB3 were used for gene expression validation among 2 endogenous genes (GAPDH and HPRT; Hs00929702_m1, Hs01564088_m1, Hs99999905_m1, Hs02800695_m1, respectively). We used the TaqMan (Applied Biosystems) method as per the manufacturer's protocol on a 7900HT Fast Real-Time PCR System (Applied Biosystems). The relative quantification was calculated using the model that was proposed by Pfaffl (2001;ref. 22).
Antigen retrieval was performed in a pressure cooker for 15 minutes in Tris-EDTA pH 9.0. The reactions were visualized with DAB (3,3'-diaminobenzidine) for 5 minutes and counterstained with hematoxylin for 1 minute. All reactions included positive (tonsil and or skin cancer for PLXDC2 and colorectal cancer for GNB3) and negative (omission of primary antibody) controls. Tumoral adjacent normal tissue was submitted to same protocol of staining regarding PLXDC2 and GNB3. The slides were scanned on a Pannoramic 250 Flash II (3DHISTECH Kft), and the stains were quantified using Pannoramic Viewer (3DHISTECH Kft) with the DensitoQuant module, which provides an H-Score from 0 to 300. With regard to survival, scores were categorized as having low (1-99), moderate (100-199), and high expression (200-300). Inverse expression patterns of GNB3 and PLXDC2 were compared (i.e., high GNB3/low PLXDC2 vs. low GNB3/high PLXDC2) and low expression was considered when H-Score < 150 for both proteins.

Statistical analysis
Clinical and pathologic variables and qPCR, RT-qPCR, and IHC results were compared by Mann-Whitney test after Kolmogorov-Smirnov test demonstrate that they required a nonparametric approach. c 2 test was used to compare high GNB3/low PLXDC2 with low GNB3/high PLXDC2 expression with clinical and pathologic data. The Kaplan-Meier method was used to determine survival rates, and log-rank test was used to calculate RR. Statistics analyses were performed using SPSS version 21.0.0.0 (IBM), considering P 0.05 as significant.

Sample characterization
Eighteen VSCC samples were examined by aCGH analysis, 66.7% of which were classified as VSCC grade 1 or 2 and 72.2% of which had FIGO stage I or II. Furthermore, 72.2% of patients had no perineural or vascular invasion, and 55.6% had no nodal metastasis. Most subjects were classified as high-risk HPV-positive (66.7%).

CNAs
Two out of 18 samples used for aCGH analysis did not show significant CNAs. The remaining 16 showed a mean of 234 alterations per sample, ranging from 18 to 573. Events of copy number gains were more frequent than losses; however, all the 16 samples had both events concomitantly. Eleven of them exhibited events of high copy number gains and 4 presented homozygous copy loss (Table 1).
Twenty-eight copy numbers were significantly altered: 22 gains, harboring 506 genes, and 6 losses, encompassing 27 genes. The average length of the alterations was 760 kbp.
The frequency of gains and losses varied from 22% to 55% of samples. Gains in 3q27.
Integrative analysis: Driver candidate selection Forty-seven driver candidates were selected by the CONEXIC algorithm as the top-ranked genes ( Table 2). Only PLXDC2 (mapping to 10p12.31) had a deletion that was concomitant with downregulation. The other genes had copy number gains and increased expression.
The expression data of the 47 possible drivers was subjected association between them and the clinical and pathologic variables evaluated (Supplementary files S3 and S4).
All 47 genes were subjected to functional enrichment analysis using IPA, revealing the top three networks, associated with carbohydrate metabolism, cellular maintenance, RNA posttranscriptional modification, and organ development (Fig. 1). The pathways and functions that correlated with these genes were examined and confirmed using KOBAS 2.0, as summarized in Supplementary File S5.
The evaluation of gene expression data revealed that only 19 of them were somehow associated with at least one of the clinical and pathologic variables, being UNC93B1 the gene with the greater number of associations. Therefore, this gene did not appeared in any of our in silico analysis, and it was discharged for further validation.
The following strategy was to evaluate those genes with greater number of clinical associations and along with GNB3 others appeared showing possible relevance in VSCC (Supplementary files S3 and S4). However, of those with higher number of associations with the variables only GNB3 had association with lymph node metastasis, the best prognostic marker so far stablished. In addition, it was present in the in silico evaluation demonstrating its potential in tumoral process. Thus, it was selected for further validation along with PLXDC2 that presented association with recurrence and more importantly was the only gene that presented loss of copy number associated with gene downregulation.

Data validation by qPCR and RT-qPCR
On the basis of our results, PLXDC2 had one of the primer pair that corroborated the aCGH findings, with copy number loss (0.197) and the second was normal (0.616). GNB3 was quantified, with mean copy number ranging from 0.619 to 0.693 (Suplementary File S5). The only significant association between the qPCR results and clinicopathologic features was the correlation of a higher number of copies of the first sequence evaluated of GNB3 with the presence of lymph node metastasis (P ¼ 0.03; Table 3).

IHC
GNB3 and PLXDC2 were homogeneously expressed throughout the cytoplasm of cells from all vulvar tissue samples (Fig. 2). PLXDC2 and GNB3 evaluation in tumor adjacent normal tissue  GNB3 immunostaining was significantly higher in VSCC grade 1 and 2 (P ¼ 0.01) and trended toward greater expression in cases without vascular invasion (P ¼ 0.068; Table 4). PLXDC2 expression was higher in cases without vascular and perineural invasion (P ¼ 0.011 and P ¼ 0.032, respectively; Table 4).
Patients who expressed more GNB3 experienced higher rates of disease-free survival (DFS). Conversely, high levels of PLXDC2 trended toward lower DFS and cancer-specific survival (CSS; Fig. 3).  High expression of GNB3, concomitant with low PLXDC2 levels, was detected in cases with little lymph node (0 or 1) metastasis (P ¼ 0.016), characterized as FIGO I or II (P ¼ 0.037; Table 5), and higher DFS rates (P ¼ 0.005; Supplementary  Fig. S4).

Discussion
Compared with normal DNA, we observed a few number of copy number losses in vulvar carcinoma samples, which has not been reported. However, most studies are not comparable with our findings, because they extracted DNA from FFPE (12,13) or cell lines (14,15). Furthermore, some studies controlled for HPV, used disparate methods or lower-resolution platforms (12)(13)(14)(15).
Gains in the long arm of chromosome 3 were frequent in our casuistic, which is a common finding in studies on VSCC (12)(13)(14)(15) and other solid tumors (24). However, the normally concomitant findings with this imbalance (e.g., gains in chr8q and losses in chr3p and chr8p) were not observed. These alterations are generally found in solid tumors (24) and might mediate their development and progression (24). Yet, such abnormalities have not been shown to be important in vulvar cancer.
Thus, we proposed a novel approach, performing integrative analyses to identify driver genes that are involved in the initiation and/or progression of VSCC and determine the function of certain regions and genes that have been implicated. Possible driver selection for VSCC used CONEXIC as starting point for further selection. This method is not a usual method of aCGA and GWE integration as previously used by Micci and colleagues (2013;ref. 16). The CONEXIC methodology uncovers the "genomic footprint" by identifying genes located in regions amplified or deleted in some, but not all tumors and further associates their expression to a gene module, which is assumed to be altered by the driver (19). All this comparisons are made on the basis of validated algorithms which select the best candidate based not only in the assumption that altered copy number is leading to gene misregulation (19). Therefore, although Micci and colleagues (2013) has higher resolution and used similar quantity and quality of samples, this study has major discrepancies in terms of gene selection (16). In addition, this might explain the different pattern of alterations found in both studies as our study follows a different pathway to uncover the main actors in VSCC.
As expected, based on JISTIC analysis, loss of DNA copy number, concomitant with gene downregulation, was uncommon, which we detected in 1 of 47 potential drivers of VSCC. Thus, plexin domain-containing 2 (PLXDC2), a 473-kb gene in crh10p12.31, was selected for validation, despite scarce information on its involvement in cancer. Moreover, the lack of reports on its interactions and function might explain its seemingly minor contribution to biologic and molecular disorders, as evaluated in silico by IPA.
The enrichment analysis of 47 genes demonstrated that they interact in three principal networks that regulate several functions of cancer development and progression, corroborating their selection as potential drivers. One such molecular influences was the involvement of guanine nucleotide-binding protein (G protein) beta polypeptide 3 (GNB3) in carbohydrate metabolism, which is essential for cell growth and maintenance. Thus, GNB3 was also selected for further validation and evaluation in a larger subset of samples.
PLXDC2, also known as TEM7R (tumor endothelial marker 7-related), has a related gene, PLXDC1, which was initially termed tumor endothelial marker 7 (TEM7; ref. 25). Although PLXDC2 is usually upregulated with PLXDC1 in stromal endothelial cells of several tumors, such as colorectal cancer (25), it does not appear to govern endothelial cell morphogenesis in these tumors like PLXDC1 (25). Instead, PLXDC2 participates in apoptosis and cellcycle arrest (26) and is also associated with poor outcomes, a worse prognosis, and lymph node metastasis in breast cancer (27). Regardless of the increasing association of this gene with tumor behavior, its function and interactions have not been examined.
There is significant evidence that supports our assumption that PLXDC2 and its protein mediate the development and progression of VSCC, primarily in association with GNB3 expression. As observed in this study, low levels of PLXDC2, concomitant with high expression of GNB3, mitigate lymph node metastasis, which is the major prognostic factor of VSCC (28). This association correlates with a better prognosis, because it is related to earlystage tumors (FIGO I and II) and higher DFS rates.
That PLXDC2 does not act as a single modulator is corroborated by the lack of an association of GNB3 alone with lymph node metastasis. GNB3 is 7 kb and lies at chr12p13. Our study is the first to report the overexpression of GNB3 in cancer, concomitant with copy number gains. The few studies that have analyzed GNB3 examined the C825T SNP, which is associated with enhanced G protein function, increasing signal transduction through an alternative splice variant, referred to as Gb3s (29,30). The behavior of this variant depends on the tumor site. It is a good prognostic factor, reducing the risk of bone metastasis in patients with breast cancer (31) and increasing the survival of patients with glioblastomas (30) but is associated with worse outcomes for patients who carry this SNP, as in head and neck tumors (32).
We could not assume the status of allele 825 of GNB3; consequently, further studies might determine whether the upregulation of GNB3, concomitant with DNA gains, enhances its G protein activity and whether other alterations are present. Moreover, multimodal approaches (e.g., miRNAs and methylation analysis), integrated with our results, might increase our understanding of VSCC.
Our findings reveal novel aspects of VSCC and tumor behavior. In our previous study, we suggested that p16 is not completely linked with HPV infection, as has been proposed for years, based on the studies of cervix carcinomas, which are almost exclusively related to this infection (5). We hypothesize that there are genes less frequent and studied that mediate VSCC. Thus, our study describes a new model that does not involve the preconceived etiogenic pathways that are linked exclusively to HPV infection or p53 disruption (2,33), because no gene was associated with either pathway in our analysis. In addition, low expression of both GNB3 and PLXDC2 in normal tissues might reflect their relevance in vulvar carcinogenesis and in tumor progression as they could reflect a suppression mechanism feedback, imposed by PLXDC2 and a tumoral enhancer promoted by GNB3 expression. This study proposes two novel genes, based on a robust integrative method that has never been used before to determine the prognosis of vulvar carcinoma. When inversely expressed, PLXDC2 and GNB3 predict lymph node metastasis and DFS, guiding the use of a more aggressive treatment modality and increasing the chances of curing VSCC patients.

Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.