The Cancer Target Discovery and Development (CTD2) Network was established to accelerate the transformation of “Big Data” into novel pharmacologic targets, lead compounds, and biomarkers for rapid translation into improved patient outcomes. It rapidly became clear in this collaborative network that a key central issue was to define what constitutes sufficient computational or experimental evidence to support a biologically or clinically relevant finding. This article represents a first attempt to delineate the challenges of supporting and confirming discoveries arising from the systematic analysis of large-scale data resources in a collaborative work environment and to provide a framework that would begin a community discussion to resolve these challenges. The Network implemented a multi-tier framework designed to substantiate the biological and biomedical relevance as well as the reproducibility of data and insights resulting from its collaborative activities. The same approach can be used by the broad scientific community to drive development of novel therapeutic and biomarker strategies for cancer. Mol Cancer Res; 14(8); 675–82. ©2016 AACR.
Large-scale molecular characterization projects are generating comprehensive data for pediatric and adult malignancies, from hundreds to thousands of patient-derived samples (1–3), transgenic mouse models (4), patient-derived xenografts, and cancer cell lines (5–7). These data allow systematic evaluation of key biologically and clinically relevant hypotheses, such as the association between drug sensitivity and specific genetic alterations, or between specific biological features and patient outcome. As a result, biomedical discovery is being increasingly driven by the integrative analyses of large amounts of data followed by experimental evaluation both in vitro and in vivo. The challenge is to capitalize on these different data sources in a systematic way that makes the process of target discovery and translation more efficient, transparent, and reproducible.
Such transition from a strictly hypothesis-driven to an increasingly hypothesis-generating paradigm presents new types of challenges. For instance, to what extent can knowledge be extracted from computational analysis of large-scale data repositories without conventional follow-up experimental validation? If experimental validation is needed, what can be considered an appropriate level of validation to justify follow-up preclinical and clinical studies? These questions are especially relevant in view of critical challenges to the very foundation of the biomedical research enterprise, from result reproducibility (8, 9) to biomedical impact (10).
The Cancer Target Discovery and Development (CTD2) Network (http://ctd2.nci.nih.gov/) was established with the specific intent to accelerate the transformation of “Big Data” into novel pharmacologic targets, lead compounds, and biomarkers for rapid translation into improved patient outcomes. This process includes the development of novel methods that enable the identification and validation of actionable therapeutic targets. Specifically, in the “Big Science” classification system of Sean Eddy, the CTD2 Network aims to be a “leading wedge”—democratizing breakthrough technology for validating cancer therapeutic targets to all laboratories, an urgent medical need requiring radically improved methods (11). With 13 centers collaborating in the context of CTD2 Network, the issue of what constitutes sufficient computational or experimental evidence to support a biologically relevant finding becomes central. Indeed, to a large extent, the multi-Center Network represents a microcosm of the complex interactions that drive biomedical translation forward in the broader context of academic and industrial research.
The questions discussed in this article represent the specific challenges faced by this group of researchers as they began to develop successful, multi-center collaborations leading to numerous publications and clinical translation efforts. Specifically, CTD2 investigators quickly realized that, while each Center was expert in the methodologies related to a specific aspect of biological discovery (from Big Data analysis to large-scale chemical-biology screens to pooled functional assays), the ability to operate at the intersection of these methodologies, especially in terms of quality control and data, was a major challenge. For instance, the quality control infrastructure and mechanisms necessary to ensure reproducibility of Big Data analyses are quite different from those needed for in vivo functional assays. Thus, collaborations that leverage more than one data modality require potentially orthogonal communities to develop a cross-disciplinary understanding of their individual competencies.
This is a vast and complex undertaking, and as such, this perspective cannot be interpreted as a fully finished and comprehensive framework to support interdisciplinary collaborations. Rather, it represents the first essential step in motivating the community to address several critical issues on a systematic and comprehensive basis. Indeed, we envision this effort as the first of a series of manuscripts representing both a dialog and a resource for the community, which may be especially useful to young investigators and trainees as they face the complexity and challenges of large-scale collaborative research efforts that are emerging as necessary to address an equally increasing complexity of biological discovery.
In this vein, this Perspective begins to delineate the challenges of supporting and confirming discoveries arising from the systematic analysis of large-scale data resources in a collaborative environment. We provide a framework to start addressing the pivotal question: “What level of experimental evidence is necessary to complement insights derived from Big Data analysis in order to reach its potential to impact human health positively?” Although our focus in this Perspective is based entirely on a system that we have adopted within the collaborative CTD2 Network to disseminate the hypotheses and insights resulting from this Network's research, the approach and methodology are generalizable and are thus not limited to CTD2 Network activities. Furthermore, we hope that the research community will use this initially sparse framework to provide increasingly in-depth insight and mechanisms to address quality control and reproducibility at the boundary of the multiple and highly complementary subdisciplines of biological investigation.
The Network implemented a system to ensure that data and insights resulting from its activities would be reproducible (12) and could thus be used by CTD2 or the broader scientific community to drive the development of novel therapeutic and biomarker strategies for cancer. We also hope to provide the scientific community with a framework for effectively reporting data generated by these and other methods when knowledge derived from Big Data is applied to gain biological insight. This Perspective should be seen as a first step in elucidating the critical challenges and deriving a framework to begin a community-wide discussion. We expect that discussion to lead to a more detailed description of the metrics needed for specific research technologies, such as proteomics and drug screening analysis, that have not already been systematically evaluated in the literature and through consensus white papers.
Data Sharing and Clarity
To enable clear description of novel therapeutic targets and pathways identified by the CTD2 Network, the members have devised a set of classification criteria to stratify targets, pathways, associated biomarkers, and their small-molecule or biological modulators into “Evidence Tiers,” based on available supporting evidence. These criteria should also enable the scientific community to understand more easily the evidence on which CTD2 and other findings are based. A key priority of this lexicon is to minimize the misinterpretation of reported results, thereby helping the scientific community to understand the likelihood that the interpretations can be successfully progressed to human investigations (9).
Network Centers share their methods and results via publications, and all raw and analyzed data are made publicly available through the CTD2 Data Portal (https://ctd2.nci.nih.gov/dataPortal/). This resource is regularly updated as new data are generated and additional findings are validated. All data posted in the Data Portal have undergone quality-control evaluation but have not been independently confirmed (Fig. 1).
We consider that Data Portals are necessary, but not sufficient, for ensuring clarity and reproducibility of published results. Therefore, classification of supporting evidence resulting from analysis of CTD2 data into Evidence Tiers (see below) informs a separate web-based “Dashboard” (http://ctd2-dashboard.nci.nih.gov/)—a platform to share CTD2 Network findings with the research community. The Dashboard is intended to house the results that connect targets, biomarkers, and modulators with evidence supporting their validation. In each Evidence Tier, we enumerate information related to three entities that are critical to the development of cancer therapeutics. These include: (1) molecular targets, (2) small-molecule or biological modulators of the targets, and (3) associated predictive or prognostic biomarkers for patient selection.
Evidence Tier Definitions
Tier 1: Preliminary positive observations
These data represent the initial results of high-throughput experiments, typically using a single experimental or computational platform; examples are given for illustration purposes only.
These include positive results (“hits”) from primary high-throughput screens (HTS) with small or large molecules such as antibodies (the term small-molecule screen is used throughout to include the potential for large-molecule screens). Other means of high-throughput data generation (e.g., cheminformatic analyses, correlations of genomic and cheminformatics data, patents, published literature, etc.) may also inform a small-molecule approach and are thus acceptable.
Genetic perturbation assays.
These include results from high-throughput screening experiments [such as whole-genome or targeted RNA interference (siRNA or shRNA) loss of function, CRISPR, or open-reading frame cDNA (ORF) gain of function screens], as well as computational and statistical analyses that either support hit selection or filter out potential artifacts.
Prognostic or predictive biomarkers.
Biomarker discovery involves the use of primary data from any number of sources (e.g., cell lines isogenic for a mutant vs. wild-type candidate target, sequencing, biologics screening, etc.). Where possible, a biomarker is linked to specific molecular targets and small-molecule modulators. Tier 1 biomarker data could enable (1) identification of the patient population likely to benefit from a given therapeutic strategy, (2) quantification of efficacy in vivo for preclinical or clinical measurements, (3) nomination of a pharmacodynamics readout for drug activity in a patient, or (4) development of an indicator for potential undesirable off-target effects or adverse events. Tier 1 biomarkers derive directly from the analysis of the primary data and are not yet independently validated.
Discovery of molecular interactions from protein–protein interaction (PPI) assays, protein–nucleic acid interaction assays, or computational analyses, among others, can be used to generate hypotheses for specific molecular targets. Examples of relevant assays include genome-wide chromatin immunoprecipitation (ChIP-seq) or cross-linking immunoprecipitation (CLIP-seq), reporter-gene activity, protein fusions (e.g., luciferase or fluorescent protein) in protein-complementation assays, yeast two-hybrid assays, Förster-(fluorescence-) resonance energy transfer (FRET), protein complex identification by mass spectrometry, and reverse engineering computational algorithms.
Any computational approaches and strategies including the analysis of, for example, alternatively spliced transcripts and cell surface markers and other factors that might stimulate the immune system can identify candidate cancer targets or pathways. The primary data can be from the public domain or project-specific high-throughput assays that are released into the public domain and are clearly referenced.
NOTE: In Tier 1, the data can support many concepts and not just targets, biomarkers, or small-molecule perturbagens.
Tier 2: Confirmation of primary results in vitro
These results meet the Tier 1 requirements and have been confirmed by at least one of the following:
More detailed version of the original assay, such as concentration–response versus single-point, high-replicate (e.g., N > 4) versus low-replicate (e.g., singleton), target silencing in additional patient-relevant cell lines or models, or results from high-content microscopy experiments, etc.
Orthogonal secondary assay or counter-screen.
Independent confirmatory experiment with the original assay, performed by a different Center in the Network or from the literature.
Extension or corroboration of experiments in other in vitro cancer models.
Experimental investigations confirming the presence on the cell surface of bioinformatically predicted cell-surface peptides and epitopes.
Generation of an antibody, chimeric antigen receptor (CAR), T-cell receptor, or other targeting molecule for cancer-specific epitope expressed on tumor surfaces.
Examples of evidence for Tier 2 are given below. Sufficient detail must be captured in the Dashboard (data, methods, reagents, etc.; http://ctd2-dashboard.nci.nih.gov/) to enable qualified investigators to reproduce the results.
The data include a detailed characterization of each candidate using the primary or suitable orthogonal assays. These data define the desired biological profile and may include testing in additional patient-relevant cell models. Some measure of potency and selectivity is established (e.g., selective toxicity for cancer cells over an appropriate normal cell model).
Genetic perturbation assays.
A more detailed characterization of gene candidates is required to meet the standard criteria of demonstrating that at least two independently designed genetic perturbation reagents produce the same effect. These experiments could use RNAi, CRISPR, Transcription Activator-like Effector Nucleases, or ORF reagents. In addition, the possibility of an effect being due to miRNA seed sequences should be addressed. Validation of a reagent can occur through direct measurements (mRNA depletion, protein levels), computational approaches (measurements of reagents made in multiple cell lines or assays), or a combination thereof.
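The two-reagent criterion above can be sketched as a simple filter. This is a minimal illustration, not the Network's procedure: the `ReagentResult` record, the signed `effect` score, and the threshold of -1.0 are all assumptions introduced here, and the sketch does not address miRNA seed effects or reagent validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReagentResult:
    gene: str          # targeted gene (hypothetical names in the example)
    reagent_id: str    # identifier of one independently designed reagent
    effect: float      # assumed signed score; more negative = stronger loss of viability


def genes_with_concordant_reagents(results, threshold=-1.0, min_reagents=2):
    """Return genes for which at least `min_reagents` distinct reagents
    each produce an effect at or below `threshold` (same direction)."""
    hits = {}
    for r in results:
        if r.effect <= threshold:
            # A set ensures the same reagent is never counted twice.
            hits.setdefault(r.gene, set()).add(r.reagent_id)
    return sorted(g for g, ids in hits.items() if len(ids) >= min_reagents)
```

With made-up scores, a gene hit by two distinct reagents passes, whereas a gene supported by only one reagent, or by the same reagent measured twice, does not.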
The specifics of a more detailed characterization of biomarkers depend on the utility of the biomarker being developed. For biomarkers aimed at identifying the patient population likely to benefit from a given therapeutic strategy, confirmation in an independent, appropriately statistically powered population for which relevant molecular profile data can be accessed is required. Biomarkers that quantify efficacy for preclinical or clinical measurements require validation in independent model systems. Biomarkers providing a pharmacodynamics profile for drug activity in a patient require confirmation in a distinct, appropriately statistically powered preclinical model-organism cohort. Biomarkers serving as an indicator for potential undesirable off-target effects or adverse events need confirmation in an independent large sample set of cell lines or other biological samples.
Interactions require confirmation in orthogonal screening experiments in a biologically relevant context using a different readout than used in Tier 1. Computational data from the literature or public databases could also support molecular interaction–based targets in this Tier.
Hypotheses inferred by computational analysis must be confirmed experimentally to reach Tier 2, either by interventions or through data in the literature. In addition to the experimental approaches highlighted above, immunotherapeutic hypotheses require evidence for the novel cell-surface epitopes, from either cell lines or mass spectrometry measurements, together with targeting molecules such as antibodies, CARs, or T-cell receptors that are tested in cell lines.
NOTE: In Tier 2, an in vitro–validated target is progressing toward a substantiated hypothesis; nonetheless, the absolute connection is not yet complete.
Tier 3: Validation of results in a cancer-relevant in vivo model
These results meet the Tier 2 requirements and have been validated by in vivo assays, including at least one of the following:
Experiments in model organisms (e.g., Danio rerio, Mus musculus, Drosophila melanogaster, etc.).
A second, separate orthogonal secondary assay or counter-screen on a single gene in an in vivo system.
Independent validation assays in vivo (secondary assay or counter-screen) by the same or a different Network Center, or from the literature.
For biomarkers only, extension or corroboration of validation experiments in large independent human cancer sample cohorts with appropriate clinical data.
In Tier 3, the functional hypothesis is effectively tested and, as needed, either modified or removed based on orthogonal experimental evidence. It is expected that assays are performed in carefully controlled experiments in vivo (e.g., on a molecule-by-molecule basis in relevant models such as xenografts, genetically engineered mouse models, syngeneic tumor models, organoids, patient-derived xenografts, or other biological systems), at least in quadruplicate, that allow definitive conclusions. Cell-surface epitopes can be targeted either by a small molecule, an antibody, an antibody-derivative protein, or a T-cell receptor. The following types of evidence are offered at this Tier.
The data presented include orthogonal assays that further support the profile of selected compounds as being consistent with the therapeutic hypothesis. The difference from Tier 2 is twofold: (1) at Tier 3, the experiments are carried out in vivo, and (2) proof of mode of action is necessary (e.g., demonstration of mitotic arrest using an image-based assay, or identification of the gene or molecular alteration that leads to the cancer dependency underlying the small-molecule activity).
Genetic perturbation assays.
Relationships that are observed by multiple groups, or a more detailed characterization of gene (cancer dependency) candidates, are required, using in vivo model systems for validation. Lower-throughput experiments that further support the specificity of the loss of gene function (or gain) and the importance of that loss (or gain) to the proposed hypothesis are needed.
Biomarkers require a more thorough demonstration of their reliability than at Tier 2 including, for example, statistically significant evidence from an appropriate clinically annotated patient cohort independent from those used for Tier 2. For all types of biomarkers (see definitions in Tier 1), the assay used is either performed independently by another Center, or with a different technology platform to measure and detect the biomarker(s). As an alternative to a different technology platform, the same platform could be validated in terms of robustness and reproducibility meeting the requirement of a CLIA-like (Clinical Laboratory Improvement Amendments) assay.
Demonstration of direct endogenous molecular interactions, under physiologically relevant conditions, is required. Experimental evidence demonstrating that a molecular interaction–associated target is mechanistically, or at least functionally, relevant to cancer is required. This relevance is demonstrated by measuring the effect on tumor initiation, progression, or maintenance resulting from disrupting or stabilizing the interaction using mutagenesis, peptides, or small molecules. Efficacy of an antagonist peptide or small molecule in a panel of clinically relevant cell lines or in vivo models is required. Aberrant or neomorph-related interactions are identified as distinct from those occurring in a cell's physiologic regime.
NOTE: The evidence in Tier 3 is a robust definition of connection of small molecule (chemical or biologic) to target, genotype to phenotype, and direct molecular interaction in vivo, such as mouse tumor models, patient-derived tumor avatars (e.g., organoids, conditionally reprogrammed cells, xenografts), or patient cohorts.
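Because each Tier requires that the criteria of the Tier below be met, Tier assignment can be pictured as a cumulative check. The sketch below is one hypothetical encoding: the `Evidence` fields and the `assign_tier` function are illustrative names, not part of the CTD2 Dashboard, and each Boolean collapses the "at least one of the following" lists above into a single flag.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    NONE = 0         # no qualifying primary observation
    PRELIMINARY = 1  # Tier 1: preliminary positive observation
    IN_VITRO = 2     # Tier 2: confirmation of primary results in vitro
    IN_VIVO = 3      # Tier 3: validation in a cancer-relevant in vivo model


@dataclass(frozen=True)
class Evidence:
    primary_hit: bool = False         # assumed flag: any Tier 1 observation
    in_vitro_confirmed: bool = False  # assumed flag: any Tier 2 confirmation
    in_vivo_validated: bool = False   # assumed flag: any Tier 3 validation


def assign_tier(e: Evidence) -> Tier:
    """Tiers are cumulative: a higher Tier is reached only when
    every lower Tier requirement is also satisfied."""
    if not e.primary_hit:
        return Tier.NONE
    if not e.in_vitro_confirmed:
        return Tier.PRELIMINARY
    if not e.in_vivo_validated:
        return Tier.IN_VITRO
    return Tier.IN_VIVO
```

Under this encoding, an observation with in vivo data but no in vitro confirmation remains at Tier 1, which mirrors the requirement that Tier 3 results first meet the Tier 2 criteria.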
Substantiated Hypotheses toward Human Investigation
CTD2 aims to undertake and report research that can be validated by qualified investigators inside and outside the Network and that can lead to clinical applications, including the generation of therapeutic agents whose activity is predicted by specific molecular alterations in a patient's tumor. We anticipate that substantiating hypotheses from the CTD2 Network will be based on a combination of evidence types from different Tiers, providing strong rationale for a candidate target with an agent that modulates the cancer phenotype, together with an associated biomarker for patient selection. Importantly, we expect such results to be replicated independently of the Center that generated the initial data.
Substantiated hypotheses should include all the relevant information necessary for their translation. Additional perspectives on what qualifies as a substantiated hypothesis are provided below:
A candidate therapeutic target should be accompanied by biomarkers for the stratification of patients most likely to derive benefit from its use, small-molecule or peptide modulators that are most likely to modify target activity, and proof of mechanism of action. Substantiated hypotheses include compelling evidence that supports translation to clinical trials.
Small molecules should be accompanied by data that indicate appropriate properties for testing in vivo (e.g., suitable metabolic half-life, minimal toxicity, appropriate in vivo exposure, etc.) and that substantiate statistically significant differential efficacy. This substantiation may require synthesis or purchase of analogs that address compounds with shortcomings in one of these areas. A detailed optimization strategy (chemistry, pharmacokinetics, pharmacodynamics, etc.) is important. Alternative acceptable surrogates include, but are not limited to, other targeted therapeutics such as monoclonal antibodies or soluble receptors.
Genetic perturbation assays.
Selective effects of targets perturbed by multiple genetic reagents (shRNA/siRNA/Clustered Regularly Interspaced Short Palindromic Repeats) are explored with in-depth biological experimentation that includes in vivo interventions. This selectivity may be demonstrated in a panel of clinically relevant patient-derived xenografts, or in transgenic or xenograft mouse models using shRNA or genetic ablation of the target. Confirmation of suppression in vivo is necessary, and multiple endpoints (tumor burden, survival, etc.) should be shown with appropriate statistical significance.
Biomarkers as listed in Tier 3 require additional development and implementation of an analytical test system with well-established performance characteristics and cut-offs. Any associated algorithms are “locked” in terms of coefficients and other parameters. The analytic test system is used to evaluate the performance of the type-specific potential biomarkers on an independent validation patient-sample cohort. In addition, a credible scientific framework that explains the physiologic or clinical significance of the test results is required.
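One way to picture a "locked" algorithm is as an immutable object whose coefficients and cut-off are fixed before the validation cohort is examined. The sketch below assumes a simple linear score; the class name, the feature names, and the linear form are illustrative assumptions, not a description of any CTD2 assay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockedBiomarkerModel:
    # (feature_name, weight) pairs, fixed before validation (assumed linear model)
    coefficients: tuple
    intercept: float
    cutoff: float  # pre-specified decision threshold

    def score(self, sample: dict) -> float:
        """Linear score computed from the locked coefficients."""
        return self.intercept + sum(w * sample[name] for name, w in self.coefficients)

    def is_positive(self, sample: dict) -> bool:
        """Apply the locked cut-off to the score."""
        return self.score(sample) >= self.cutoff
```

Declaring the dataclass as frozen means that an attempt to reassign `cutoff` or `intercept` after construction raises an error, so the parameters applied to the independent validation cohort are those fixed beforehand.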
Mutational analysis of cancer-specific variants could provide supporting data for molecular interactions. Aberrant or neomorph-related interactions should be identified as distinct from those occurring in a cell's physiologic regime. Biomarkers that reflect the status of the molecular interaction target should be provided for translational studies.
Examples of Transition Through the Evidence Tiers to Clinical Trials
To provide examples of how this framework facilitates the discussion and comparison of different types of targets, we describe three targets that were identified by CTD2 members and for which further validation experiments provided substantiated hypotheses that are now being tested in clinical trials (see also Fig. 2).
Vignette 1: WEE1 inhibitor MK1775 (AZD1775) in head and neck squamous cell carcinoma
The cell-cycle checkpoint kinase WEE1 is an illustrative example of advancing a target through Tiers of evidence from discovery to preclinical validation, leading all the way to a clinical trial.
The Fred Hutchinson Cancer Research Center CTD2 Center performed an unbiased siRNA kinome screen, using both mouse and human squamous cell carcinoma cells, that provided Tier 1 level of evidence for WEE1 as a cancer target (13). Specifically, siRNAs to WEE1 were among the most effective at inhibiting growth of p53-mutant head and neck squamous cell carcinoma (HNSCC) cells. Retesting with different siRNAs to WEE1 in additional cell lines and using a small-molecule inhibitor of the WEE1 kinase, MK1775 (now AZD1775), provided Tier 2 evidence for WEE1 as a cancer target. Disruption of G2–M regulation by inhibition of WEE1, particularly in the context of p53 mutation and DNA damage, leads to apoptotic cell death.
Tier 3 evidence was obtained by inhibition of growth of p53-mutant HNSCC xenografts in mice treated twice weekly with AZD1775, as well as confirmation of target engagement by inhibition of WEE1 kinase activity in tumor extracts. These studies led to an investigator-initiated clinical trial of AZD1775, in combination with neoadjuvant weekly docetaxel and cisplatin, prior to surgery in HNSCC (NCT02508246; E Mendez, Principal Investigator). Of note, the gene encoding WEE1 is not mutated in tumors and may represent yet another example of a therapy targeting cancer-specific vulnerabilities (13).
Vignette 2: JAK2 inhibitor ruxolitinib in trastuzumab-relapsed ERBB2-amplified breast cancer
Breast cancers that present aberrant activity of the ERBB2 receptor tyrosine kinase are treated with trastuzumab, a monoclonal antibody that acts as a specific ERBB2 inhibitor. However, a substantial number of patients who initially respond to trastuzumab (up to 70%) will eventually relapse with tumors that are drug-resistant and have poor prognosis.
The Columbia University CTD2 Center combined results from pooled RNAi screens in MCF10A cells, followed by ectopic ERBB2 expression, with network-based analysis of master regulator proteins in ERBB2-amplified breast cancer patients from The Cancer Genome Atlas. This analysis revealed STAT3 as a critical master regulator of ERBB2-amplified tumors in ER-/ErbB2Amp patients, as well as a critical dependency of the transformed MCF10A cells (Tier 1). This approach highlights how independent evidence from large-scale computational and experimental assays can provide complementary clues that lead to identification of biological mechanisms with high potential for successful experimental validation, both in vitro and in vivo. Specifically, unless they are performed in a very large number of phenotypically relevant cellular contexts, pooled RNAi screens are often not sufficiently selective to pinpoint generalizable tumor dependency mechanisms. One concern, for instance, is that these screens may highlight idiosyncratic dependencies induced by the nonphysiologic nature of the cell line context used in these assays. In contrast, network-based analysis of Big Data from human samples to identify master regulators of tumor cell state has shown remarkable ability to pinpoint functional drivers, with validation rates in the 70% to 80% range. Yet, the latter must still be experimentally validated to separate truly biological dependencies from potential computational artifacts (4, 14, 15). Combining pooled RNAi screens with computational, network-based approaches addresses both issues: the two are clearly complementary and allow efficient and systematic elucidation of functional drivers, leading to extremely high success and validation rates in follow-up in vivo and clinical studies.
For instance, in this case, hundreds of potential candidates from pooled RNAi screens and tens of candidates from master regulator analysis resulted in a core of three drivers that were validated in follow-up assays, including one (STAT3) providing critical insight for the development of combination therapy in relapsed HER2+ breast cancer.
Indeed, further investigation in cell lines showed that aberrant STAT3 activity was regulated by an autocrine loop involving expression and secretion of IL6 and induced expression and secretion of the S100A and S100B hetero-dimerizing isoforms linked to aggressive breast cancer malignancy. Specifically, IL6-mediated activation of the IL6 receptor, upstream of the JAK/STAT cascade, closed the resulting signaling loop, resulting in STAT3 autoregulation. Depending on kinetics of the signaling loop, the latter could become ERBB2 independent, thus inducing trastuzumab resistance. Conversely, when combined with trastuzumab, abrogation of JAK2 activity, either genetically or via the small-molecule inhibitor ruxolitinib, abrogated STAT3 activity, reduced S100A/B expression, and induced profound synergistic loss of viability in trastuzumab-resistant cell lines (Tier 2 evidence) and xenografts (Tier 3), which was reversed by ectopic S100A/B expression. These multi-Tier findings led to development of a clinical trial to test the trastuzumab–ruxolitinib combination in ERBB2-amplified breast cancer patients who had relapsed following trastuzumab therapy and no longer responded to the drug (NCT02066532; ref. 16).
Vignette 3: TBK1 inhibitor momelotinib in lung and pancreatic cancer
Oncogenic mutations in KRAS occur in nearly all pancreatic cancers as well as a significant number of lung and colon cancers. Although KRAS is a well-validated oncogene involved in both tumor initiation and maintenance, targeting KRAS pharmacologically has proven challenging. The Dana-Farber Cancer Institute CTD2 Center performed an arrayed shRNA screen to identify genes that were required in cancer cell lines that are dependent on the expression of mutant KRAS, and identified the serine-threonine kinase TBK1 as a codependency in such cells (17), making it a Tier 1 target. Subsequent studies by this Center and others (18, 19) identified TBK1-dependent induction of autocrine signals as the reason for this dependency and demonstrated that small-molecule inhibition of TBK1 reproduced the genetic findings (ref. 20; Tier 2). Indeed, these findings help explain why pooled shRNA screens fail to identify TBK1: in such massively parallel screens, signaling loops involving secreted molecules are not interrupted. In addition, genetic and pharmacologic perturbation of TBK1 induced tumor regression in genetically engineered mouse models of lung cancer driven by K-ras (ref. 20; Tier 3). Three clinical trials testing this TBK1 inhibitor, momelotinib, have been started in patients with lung and pancreatic cancer (NCT02258607, NCT02101021, and NCT02244489).
We introduce a multi-Tier framework designed to provide an approach to substantiate the biological and biomedical relevance, as well as the reproducibility, of novel biomedical insights arising from analysis of Big Data. Such an approach allows the systematic identification of relevant insights derived from large-scale data analyses, through a series of increasingly strict filters, rather than through a single monolithic filter, whose failure may compromise the entire validation process. The approach is not meant to be prescriptive but rather to represent the minimal data elements ensuring biological and clinical relevance. Indeed, we expect that there will be insufficient evidence to credential many targets or small molecules initially classified as Tiers 1 to 3 as substantiated hypotheses. Nevertheless, this framework permits one to classify potential targets, biomarkers, or small molecules based on the available information.
We define Evidence Tiers to clarify the levels of validation for pharmacologically accessible therapeutic targets, associated biomarkers, and biochemical modulators. In each Tier, we enumerate information related to three entities: (1) molecular targets, (2) associated predictive or prognostic biomarkers for patient selection, and (3) small-molecule or biological modulators of the targets that are critical to the development of cancer therapeutics. Effective representation of specific hypotheses will often involve multi-modal evidence from different Tiers (e.g., a target with associated biomarkers for stratification and pharmacodynamics and a set of small-molecule modulators). We expect that systematic availability of these evidence Tiers, with additional substantiation by other CTD2 investigators and extra-Network investigators, will motivate the use of Network-generated insight and knowledge for clinical investigation.
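As an illustration only, the bookkeeping implied by this scheme can be sketched as a small data structure. The sketch below is not the CTD2 Dashboard's data model; the names `EntityKind`, `Evidence`, and `highest_tier` are hypothetical, and the only data used are the three TBK1 observations and reference labels quoted in the KRAS example above.

```python
from dataclasses import dataclass
from enum import Enum


class EntityKind(Enum):
    """The three kinds of entity enumerated in each Tier (hypothetical names)."""
    TARGET = "molecular target"
    BIOMARKER = "predictive or prognostic biomarker"
    MODULATOR = "small-molecule or biological modulator"


@dataclass(frozen=True)
class Evidence:
    """One piece of evidence about one entity (illustrative, not the Dashboard schema)."""
    entity: str       # e.g., a gene symbol or compound name
    kind: EntityKind
    tier: int         # 1, 2, or 3, as the Tiers are numbered in the text
    summary: str
    reference: str    # citation label as it appears in the article

    def __post_init__(self):
        # Reject anything outside the three Tiers referred to in the text.
        if self.tier not in (1, 2, 3):
            raise ValueError(f"tier must be 1, 2, or 3, got {self.tier}")


def highest_tier(records, entity):
    """Return the highest Tier recorded for an entity, or None if it has no records."""
    tiers = [r.tier for r in records if r.entity == entity]
    return max(tiers) if tiers else None


# Records transcribed from the TBK1 example in the text.
tbk1_records = [
    Evidence("TBK1", EntityKind.TARGET, 1,
             "arrayed shRNA screen in mutant-KRAS-dependent cell lines", "17"),
    Evidence("TBK1", EntityKind.TARGET, 2,
             "small-molecule inhibition reproduced the genetic findings", "20"),
    Evidence("TBK1", EntityKind.TARGET, 3,
             "tumor regression in genetically engineered mouse models", "20"),
]

print(highest_tier(tbk1_records, "TBK1"))
```

Keeping each observation as a separate record, rather than storing a single Tier per entity, preserves the reference behind every step, so that an entity's current Tier can always be traced back to the evidence that supports it.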
Experiments supporting the substantiation of any hypothesis are essential before any target is prioritized for development, whether by Network Centers, other investigators, or biotechnology or pharmaceutical companies. The Evidence Tiers defined in this document help delineate and communicate the complex process of data analysis, starting from large genomic or functional data sets, and ending with the generation of preclinical leads for characterizing targets, small molecules, and biomarkers. We expect that new information generated by the CTD2 Network and others will inform an improved definition of the concepts we present here. We hope that the principles of Tiers of evidence as applied in the CTD2 Dashboard will be useful in other contexts and thereby provide confidence in the quality, clarity, and reproducibility of research performed in the public sector.
Disclosure of Potential Conflicts of Interest
P.A. Clemons is a consultant/advisory board member for Forma Therapeutics. A. Califano is a cofounder and chief scientific advisor at Darwin Health; reports receiving a commercial research grant from Merrimack; and is a consultant/advisory board member for Thermo Fisher and CGi. E. Mendez is a consultant/advisory board member for Sengine. V.K. Gadi is a consulting director at Sengine Precision Medicine and reports receiving a commercial research grant from Genentech. C.J. Kemp has ownership interest (including patents) and is a consultant/advisory board member for Sengine Precision Medicine. S.R. Riddell reports receiving a commercial research grant from and has ownership interest (including patents) in Juno Therapeutics and is a consultant/advisory board member for Juno Therapeutics and Cell Medica. J.S. Weissman is a founder of and consultant at KSQ Therapeutics and is a consultant/advisory board member for Driver Group and KSQ Therapeutics. T.G. Bivona reports receiving a commercial research grant from Ignyta and Servier and is a consultant/advisory board member for Array Biopharma, Novartis, Driver Group, Astellas, Ariad, and Natera. F. McCormick is a consultant at Leidos and reports receiving a commercial research grant from Daiichi Sankyo. G.B. Mills reports receiving a commercial research grant from Adelson Medical Research Foundation, AstraZeneca, Clinical Outcome Technologies, Komen Research Foundation, Nanostring, Breast Cancer Research Foundation, Karus, Illumina, and Takeda/Millennium Pharmaceuticals; has received speakers bureau honoraria from Symphogen, MedImmune, AstraZeneca, ISIS Pharmaceuticals, Lilly, Novartis, ImmunoMet, and Allostery; has ownership interest (including patents) in Catena Pharmaceuticals, PTV Ventures, Spindletop Ventures, Myriad Genetics, and ImmunoMet; and is a consultant/advisory board member for Adventist Health, AstraZeneca, Provista Diagnostics, Signalchem Lifesciences, Symphogen, Lilly, Novartis, Tarveda, Tau Therapeutics, Allostery, Catena Pharmaceuticals, Critical Outcome Technologies, ISIS Pharmaceuticals, ImmunoMet, Takeda/Millennium Pharmaceuticals, MedImmune, and Precision Medicine. M.G. Roth reports receiving a commercial research grant from AstraZeneca and has ownership interest (including patents) in SynAlpha Therapeutics. No potential conflicts of interest were disclosed by the other authors.
S.L. Schreiber was supported by NIH CA176152, S. Powers was supported by NIH CA168409, A. Califano was supported by NIH CA168426, W.C. Hahn was supported by NIH CA176058, H. Fu was supported by NIH CA168449, C.J. Kemp was supported by NIH CA176303, M. McIntosh was supported by NIH CA176270, C.J. Kuo was supported by NIH CA176299, M.E. Berens was supported by NIH CA168397, M.T. McManus was supported by NIH CA168370, W.A. Weiss was supported by NIH CA176287, G.B. Mills was supported by NIH CA168394, and M.G. Roth was supported by NIH CA176284.
Contributors to The Cancer Target Discovery and Development Network are as follows:
Broad Institute (Cambridge, MA): Paul A. Clemons, Alykhan Shamji, Cindy Hon, Bridget K. Wagner, Stuart L. Schreiber
Cold Spring Harbor Laboratory (Cold Spring Harbor, NY): Alex Krasnitz, Raffaella Sordella, Chris Sander, Scott W. Lowe, Scott Powers
Columbia University (New York, NY): Kenneth Smith, Mahalaxmi Aburi, Antonio Iavarone, Anna Lasorella, José Silva, Brent Stockwell, Andrea Califano
Dana-Farber Cancer Institute (Boston, MA): Jesse S. Boehm, Francisca Vazquez, Barbara A. Weir, Todd R. Golub, William C. Hahn
Emory University (Atlanta, GA): Fadlo R. Khuri, Carlos S. Moreno, Yuhong Du, Lee Cooper, Andrey A. Ivanov, Margaret A. Johns, Haian Fu
Fred Hutchinson Cancer Research Center (1): Olga Nikolova, Eduardo Mendez, Vijayakrishna K. Gadi, Adam A. Margolin, Carla Grandori, Christopher J. Kemp (Fred Hutchinson Cancer Research Center, Seattle, WA; Cure First, Seattle, WA; Oregon Health and Science University, Portland, OR; University of Washington, Seattle, WA)
Fred Hutchinson Cancer Research Center (2): Edus H. Warren, Stanley R. Riddell, Martin W. McIntosh (Fred Hutchinson Cancer Research Center, Seattle, WA)
Stanford University (Stanford, CA): Olivier Gevaert, Hanlee P. Ji, Calvin J. Kuo
Translational Genomics Research Institute (Phoenix, AZ): Harshil Dhruv, Darren Finlay, Jeffrey Kiefer, Seungchan Kim, Kristiina Vuori, Michael E. Berens
University of California San Francisco (1): Jonathan Weissman, Trever Bivona, Sourav Bandyopadhyay, Matt Hangauer, Michael Boettcher, Michael McManus, Frank McCormick
University of California San Francisco (2): Ozlem Aksoy, Erin F. Simonds, Tina Zheng, Justin Chen, Zhenyi An, Allan Balmain, William A. Weiss
University of Texas M.D. Anderson Cancer Center (Houston, TX): Ken Chen, Han Liang, Kenneth L. Scott, Gordon B. Mills
University of Texas Southwestern (Dallas, TX): Bruce A. Posner, John MacMillan, John Minna, Michael A. White, Michael G. Roth
National Cancer Institute (Bethesda, MD): Subhashini Jagu, Jessica N. Mazerik, Daniela S. Gerhard.
- Received March 15, 2016.
- Revision received April 25, 2016.
- Accepted June 2, 2016.
- ©2016 American Association for Cancer Research.