The roadmap for clinical validation of a genomic classifier is defined by the intended demonstration of clinical utility. The design of the initial discovery cohort lays the foundation for such future validation studies. For example, the gene expression ratio of HOXB13/IL17BR (part of the Aviara Breast Cancer Index) was discovered as a prognostic index for ER-positive, tamoxifen-treated breast cancer patients. Subsequently, the initial validation of this ratio was completed in an independent set of ER-positive, tamoxifen-treated patients.8
Another crucial consideration in defining the roadmap for clinical validation is obtaining successful results from a set of samples that will convince both the oncology community and third-party payers that a particular genomic classifier is valid. More than 10 years ago, Hayes proposed levels of evidence (I-V) for grading the clinical utility of tumor markers, and these levels remain a guide for tumor marker validation.9 Hayes proposed that the highest level of evidence (level I) for a tumor marker would come from a single, high-powered, prospective, randomized, controlled study specifically designed to test the marker. Notably, currently available tests that have been endorsed by oncologists and payers have reached only level II on the Hayes scale, which corresponds to validation within a previously conducted randomized clinical trial using archived specimens.
Clinical validation of a genomic classifier as prognostic versus predictive requires samples from different patient cohorts. A prognostic test predicts clinical outcome (e.g., good or poor) regardless of whether the patient is treated. For example, MammaPrint is an FDA-cleared prognostic test that uses a 70-gene expression profile of fresh or frozen breast cancer tissue samples to assess a patient’s risk for distant metastasis, and it is not indicated for predicting response to therapy. The validation cohort comprised 302 patients who did not receive adjuvant systemic treatment (i.e., no drug therapy).10
In contrast, validating a predictive genomic classifier requires examining samples from a previously conducted randomized clinical trial. Only a randomized comparison allows the benefit from, or response to, adjuvant postsurgical treatment to be separated from the underlying prognosis. For example, the recurrence score from OncotypeDX was demonstrated to predict chemotherapy benefit in ER-positive, node-negative breast cancer patients by correlating it with the outcomes from a previously conducted clinical trial. In this trial, the National Surgical Adjuvant Breast and Bowel Project (NSABP B-20), patients were randomly assigned to receive either tamoxifen or tamoxifen plus chemotherapy. The study showed a statistically significant interaction between the recurrence score and chemotherapy treatment.11
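To make the idea of a treatment-by-marker interaction concrete, the following is a minimal sketch in Python. It is not necessarily the method used in the cited study; it is a simplified, summary-level comparison of the treatment hazard ratio in two marker-defined subgroups of a randomized trial. The function names and all numbers are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def log_hr_and_se(hr, ci_low, ci_high, level=0.95):
    """Recover the log hazard ratio and its standard error from a
    reported hazard ratio and its confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return math.log(hr), se


def interaction_test(subgroup_a, subgroup_b):
    """Compare the treatment effect between two marker-defined subgroups.

    Each argument is a tuple (hazard ratio, CI lower bound, CI upper bound)
    for treatment versus control within that subgroup. Returns the ratio
    of hazard ratios, the z statistic, and a two-sided p-value.
    """
    log_a, se_a = log_hr_and_se(*subgroup_a)
    log_b, se_b = log_hr_and_se(*subgroup_b)
    diff = log_a - log_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return math.exp(diff), z, p_value


if __name__ == "__main__":
    # HYPOTHETICAL numbers, not taken from any trial:
    # treatment appears beneficial in the high-score subgroup only.
    high_score = (0.30, 0.15, 0.60)
    low_score = (1.00, 0.60, 1.67)
    ratio, z, p = interaction_test(high_score, low_score)
    print(f"ratio of hazard ratios = {ratio:.2f}, z = {z:.2f}, p = {p:.4f}")
```

A small p-value here would suggest that the treatment effect differs by marker level, which is what distinguishes a predictive classifier from a purely prognostic one. In practice, an interaction is usually assessed with a regression model on patient-level data rather than from subgroup summaries.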
In another example, investigators reported that the HOXB13/IL17BR index predicted endocrine benefit for ER-positive patients. This study correlated the index with the clinical outcomes from a previously conducted randomized trial in which some patients received an additional three years of tamoxifen while others did not.12
Clinical validation of a genomic classifier requires careful planning, including a prespecified statistical plan with predetermined cutoff points for the classifier (e.g., REMARK criteria).13 In particular, a critical exercise that is sometimes overlooked is a power analysis to estimate whether the sample size under consideration is adequate to demonstrate, with statistical significance, the prognostic or predictive utility of the genomic classifier. This type of study takes a long time, usually 1-3 years, and includes submitting a proposal, obtaining approval from a governing body, receiving the necessary samples, conducting the study and the analysis, and publishing the findings.
As mentioned above, the highest level of validation is a prospective randomized trial in which the genomic classifier is used to guide treatment. In most disease states, such a trial takes 5-10 years to obtain the necessary clinical outcome data. For example, the Trial Assigning Individualized Options for Treatment (Rx), or TAILORx, will examine whether the OncotypeDX recurrence score can be used to determine which patients will benefit from chemotherapy. In another study, Nevins is conducting a randomized clinical trial to determine whether a prognostic genomic classifier can identify those early-stage non-small-cell lung cancer patients who typically receive surgery alone but may also benefit from adjuvant chemotherapy.
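As a rough illustration of the power analysis discussed above for a retrospective validation study, the sketch below applies Schoenfeld's approximation for the number of events needed to detect a given hazard ratio between two classifier-defined risk groups. The function names, the default significance level and power, and the example inputs are assumptions made for illustration; a real study would set them in its prespecified statistical plan.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, high_risk_fraction, alpha=0.05, power=0.80):
    """Approximate number of outcome events needed to detect the given
    hazard ratio between two groups with a two-sided test (Schoenfeld's
    formula). high_risk_fraction is the share of patients that the
    classifier assigns to one of the two groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p = high_risk_fraction
    events = (z_alpha + z_beta) ** 2 / (p * (1 - p) * math.log(hazard_ratio) ** 2)
    return math.ceil(events)


def required_patients(hazard_ratio, high_risk_fraction, event_rate,
                      alpha=0.05, power=0.80):
    """Convert the required number of events into an approximate number
    of patients, given the expected overall event rate during follow-up."""
    events = required_events(hazard_ratio, high_risk_fraction, alpha, power)
    return math.ceil(events / event_rate)


if __name__ == "__main__":
    # HYPOTHETICAL planning inputs, not taken from any study.
    events = required_events(hazard_ratio=2.0, high_risk_fraction=0.5)
    patients = required_patients(hazard_ratio=2.0, high_risk_fraction=0.5,
                                 event_rate=0.2)
    print(f"events needed: {events}, patients needed: {patients}")
```

Because the requirement is driven by events rather than patients, an archived cohort with few recurrences may be underpowered even if it contains many samples. Smaller hazard ratios or unbalanced risk groups increase the number of events needed.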