ReportsIMMUNOTHERAPY

Tumor neoantigenicity assessment with CSiN score incorporates clonality and immunogenicity to predict immunotherapy outcomes

See allHide authors and affiliations

Science Immunology  21 Feb 2020:
Vol. 5, Issue 44, eaaz3199
DOI: 10.1126/sciimmunol.aaz3199

Neoantigen Number Crunching

Immunotherapy with anti–PD-1 and other checkpoint inhibitors is an important cancer treatment modality, but improved biomarkers are needed to better predict which patients will respond. Current computational approaches that assess tumor immunogenicity by deep sequencing of tumor samples to count mutations and predict neoantigen epitopes are unable to factor in clonal variation within tumors. Lu et al. developed an algorithm to calculate CSiN score, a metric that also integrates the distribution of mutations among tumor clones. Testing of CSiN score against other indices of tumor neoantigen burden revealed improved correlations with outcome and prognosis in cohorts of patients with tumor types known to be immunogenic. Calculation of CSiN scores from tumor genomics data may assist in selection of patients most likely to benefit from cancer immunotherapy.

Abstract

Lack of responsiveness to checkpoint inhibitors is a central problem in the modern era of cancer immunotherapy. Tumor neoantigens are critical targets of the host antitumor immune response, and their presence correlates with the efficacy of immunotherapy treatment. Many studies involving assessment of tumor neoantigens principally focus on total neoantigen load, which simplistically treats all neoantigens equally. Neoantigen load has been linked with treatment response and prognosis in some studies but not others. We developed a Cauchy-Schwarz index of Neoantigens (CSiN) score to better account for the degree of concentration of immunogenic neoantigens in truncal mutations. Unlike total neoantigen load determinations, CSiN incorporates the effect of both clonality and MHC binding affinity of neoantigens when characterizing tumor neoantigen profiles. By analyzing the clinical responses in 501 treated patients with cancer (with most receiving checkpoint inhibitors) and the overall survival of 1978 patients with cancer at baseline, we showed that CSiN scores predict treatment response to checkpoint inhibitors and prognosis in patients with melanoma, lung cancer, and kidney cancer. CSiN score substantially outperformed prior genetics-based prediction methods of responsiveness and fills an important gap in research involving assessment of tumor neoantigen burden.

INTRODUCTION

Most immunotherapies, including checkpoint inhibitors, benefit only a small subset of patients. For example, anti–PD-1 and anti–PD-L1 agents, which have demonstrated marked clinical benefit in various diseases, have overall response rates ranging from 10 to 50% in melanoma and non–small cell lung cancer (NSCLC) (15) and higher response rates in some other select tumor types, such as classic Hodgkin lymphoma (65 to 80%) (6, 7). Similarly, the efficacy of anti–CTLA-4 agents ranges from 10 to 15% for objective response rates and <3% for complete response (8). Unfortunately, the field is still far from clearly understanding how to distinguish responders and nonresponders to immunotherapy. All forms of immunotherapy, such as checkpoint inhibitors and neoantigen vaccines, seek to activate the host immune system to attack the tumor cells. These forms of immunotherapy have different modes of actions, but most are intended to mobilize the cytotoxicity of T cells in the patient. Neoantigens are the primary targets of T cell responses (913), and the profiles of tumor neoantigens in each patient are central to determining the responsiveness to immunotherapy treatment.

A major impediment of current research that seeks to correlate neoantigens with immunotherapy treatment response is that most studies have only considered whether a higher neoantigen/mutation load (namely, the total number of neoantigens or mutations) is correlated with better immunotherapy response. This overly simplistic approach fails to take full advantage of the wealth of information contained in the entire repertoire of neoantigens and has been successful in only some studies (2, 4, 1418) but not others (1924). Neoantigens are associated with mutations that can be either truncal or subclonal (25). Some neoantigens are also more immunogenic than others. These details are not fully captured by a basic neoantigen/mutation load approach but could be critical for understanding the responsiveness of patients with cancer to immunotherapy treatment. For example, Miao et al. (26) showed that cancer patients with a high proportion of clonal mutations have better rates of checkpoint inhibitor treatment response. The only other study that we are currently aware of that defined a more sophisticated neoantigen-based predictive metric is the neoantigen fitness model developed on the basis of evolutionary modeling of patient neoantigen profiles (16, 27). This work considered the neoantigen class I major histocompatibility complex (MHC) binding affinity and only retained the top neoantigens resulting from missense mutations with the highest binding affinity within each tumor clone. This metric demonstrated excellent predictive power for survival of patients after immunotherapy treatment in a few cohorts; however, its predictive values and prognostic values have not been widely evaluated.

In this study, we developed the Cauchy-Schwarz index of Neoantigens (CSiN) to enable a quantitative characterization of the delicate internal structure of patient tumor neoantigen profiles. We assembled 10 immunotherapy-treated patient cohorts and 6 patient cohorts with baseline survival information, both from public sources and our own cohorts. In our unbiased analyses, CSiN achieved almost uniformly significant results on the checkpoint inhibitor cohorts and the baseline cohorts of immunogenic cancers, which is significantly better than the performance of previously described neoantigen-based biomarkers. Together, our work fills an important “void” in immuno-oncology research that resulted from prior methods that have not sufficiently integrated data on neoantigen clonal structures.

RESULTS

Constructing the CSiN to characterize the fine structure of patient neoantigen repertoire

Some somatic mutations are truncal and others are subclonal. Truncal mutations are shared among more tumor clones (i.e., founding and derivative clones) and, if targeted by T cells through neoantigens, will likely result in a stronger cytotoxic effect. Subclonal mutations are unique to different clones, and if a subclonal mutation is in a clone with a larger clonal fraction, then the neoantigens associated with this mutation are likely to have a stronger effect on the survival of the tumor cells than subclonal mutations associated with minor clones. Besides, each somatic mutation could generate a different number of neoantigens of different peptide lengths (8 to 11 amino acids for class I MHC–binding peptides and 15 amino acids for class II MHC–binding peptides), with different registers (a sliding window of protein segments around the mutated position) and presented by different human leukocyte antigen (HLA) alleles (class I and class II). Insertions and deletions usually generate a higher number of neoantigens per mutation than missense mutations as they lead to the translation of completely new segments of protein sequences (shown in the Supplementary Materials). We hypothesize that a favorable distribution of neoantigens and tumor mutations occurs when immunogenic neoantigens are concentrated on truncal mutations (Fig. 1A), and we developed CSiN to test this prediction.

Fig. 1 Motivation for CSiN score.

(A) Illustration showing the motivation of examining pairings of neoantigens and the tumor mutations with which they are associated. We demonstrated two hypothetical patients, one with an unfavorable distribution of neoantigens and the other with a favorable distribution. The actual mutations and neoantigens shown are based on real data. The outermost circle indicates the whole tumor. Each circle indicates a population of tumor cells with certain mutations. Each different color indicates a distinct mutation, and the area of each circle indicates the proportion of cells having the mutation. For the formula, on the left of each multiplication sign “×” is the normalized VAF, and on the right of each × is the normalized permutation neoantigen load. The colorings in the formula correspond to the tumor mutations shown above with the same colorings. The two bigger tables on the right show the neoantigen sequences, registers (“Dist”), and the HLA alleles for each neoantigen. For neoantigens of missense mutations, Dist refers to the distance between the altered amino acid and the left end of neoantigen; for neoantigens of insertions/deletions and stop-loss mutations, Dist refers to the distance between the left end of the mutation and the left end of neoantigen. The “+” sign indicates the left end of neoantigen is on the right of the altered position and vice versa. (B) The distribution of the CSiN scores in the RCC, LUAD, LUSC, and SKCM cohorts. The t tests were used for comparison of CSiN scores between different subtypes of the same tumor cohort. WT, wild-type. (C) A scatterplot showing the relationship between CSiN and the expression level of the IFN-γ signature in the RCC cohort. Spearman correlation between them is shown. (D) Heatmaps of the pairwise Spearman correlations of the CSiN, mutation load, neoantigen load, and the transcriptomics-based features are shown for the RCC, LUAD, LUSC, and SKCM cohorts, which are calculated as in (C).

The core of CSiN is based on the mean of the product of the variant allele frequency (VAF) of each somatic mutation and the number of neoantigens associated with that mutation. When the mutations with higher VAFs are also the mutations that generate more neoantigens (favorable distribution), the product value will be larger (a higher CSiN score). Therefore, a higher CSiN conforms to a favorable neoantigen clonal structure. CSiN is so named because it bears analogy to the Cauchy-Schwarz inequality, which states that the inner product of two vectors is maximal when they are in parallel (in the same ranked order). We also considered the immunogenicity of neoantigens. As there are currently no well-accepted tools for predicting the potential of both class I and class II neoantigens to induce T cell responses (28), we used the binding affinity of peptides to MHCs, which has been shown to be one of the most important predictors of neoantigen immunogenicity (29). We set a series of cutoffs on the peptide-MHC binding affinity strengths predicted by the Immune Epitope Database and Analysis Resource (IEDB) tools (30, 31) to generate several subsets of neoantigens with increasing stringency. We then calculated the means of the product of VAF and permutation neoantigen load (neoantigens of both class I and class II, all HLA alleles, and all registers from one mutation) for each subset of neoantigens. The final CSiN score is the arithmetic mean of these subindices, in which neoantigens with better MHC binding affinity have higher weights (for details, please refer to the Supplementary Materials). The distributions of the CSiN scores in renal cell carcinoma (RCC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), and skin cutaneous melanoma (SKCM), which are all immunogenic tumors (32), are shown in Fig. 1B.

We assessed whether CSiN characterizes unique information that is not captured by the commonly used mutation load and neoantigen load approaches (both classes, all possible lengths, and all possible registers) or by candidate transcriptomics-based predictive biomarkers. For transcriptomic features, we examined the expression of PDCD1 (encoding PD-1), CD274 (encoding PD-L1), CTLA4 (33), CD8A, and an interferon-γ (IFN-γ) gene signature (34). Figure 1C shows a scatter plot of the CSiN scores with the activation [single-sample gene set enrichment analysis (ssGSEA)] of the IFN-γ signature in the RCC cohort, which yielded a Spearman correlation of 0.067. Figure 1D visualizes these correlations in heatmaps for pairwise comparisons of these biomarkers in each cohort, which shows that CSiN is an independent biomarker (Spearman correlation < 0.1 for all comparisons). Further analyses, using the Pearson correlation, threshold comparisons, and mutual information, again demonstrated the independence of the CSiN score (see the Supplementary Materials).

Better response to checkpoint inhibitors in immunogenic cancers is associated with higher CSiN scores

Next, we investigated the implications of CSiN for checkpoint inhibitor treatment response. We analyzed the neoantigen profiles of melanoma patients on anti–CTLA-4 therapy from Van Allen et al. (35). We observed patients with better responses were more likely to have high CSiN (i.e., higher than median) than patients with worse responses (Fig. 2A, P = 0.009). We analyzed another cohort of melanoma patients on anti–CTLA-4 therapy from Snyder et al. (36). We observed that patients who received a durable clinical benefit (abbreviated DCB and defined as complete or partial response or stable disease for >6 months) had higher CSiN scores than patients with no durable benefit (abbreviated NDB and defined as stable disease for <6 months or progressive disease; Fig. 2B, P = 0.033). We analyzed a third cohort of melanoma patients (Riaz cohort) on anti–PD-1 therapy (15). We examined whether patients with better treatment responses had higher CSiN scores. There was a significant positive association in this cohort (Fig. 2C, P = 0.037). We analyzed one more cohort (Hugo) of patients with melanoma from Hugo et al. (14). Figure 2D shows that there is an overall trend of patients with a better response associated with higher CSiN scores (P = 0.043). In clear cell RCC (ccRCC), we examined anti–PD-1/anti–PD-L1–treated ccRCC patients from Miao et al. (17). The same significantly positive association of higher CSiN scores with better response was observed (Fig. 2E, P = 0.036). We analyzed metastatic ccRCC patients treated with atezolizumab, an anti–PD-L1 agent (IMmotion150 cohort) (37). We found that there was a significant association of higher CSiN with better treatment response for T effector (Teff)–high patients treated with atezolizumab (Fig. 2F, P = 0.028, and the Supplementary Materials). In contrast, we did not observe this association for Teff-high patients treated with sunitinib (P = 0.890). For patients with lower Teff signature expression, there was no significant association for either atezolizumab or sunitinib. NSCLC patients (the Hellmann cohort) treated with PD-1 and CTLA-4 inhibitors were available from Hellmann et al. (18). Our analyses showed that PD-L1+ patients with DCB had higher CSiN scores than patients with NDB (Fig. 2G, P = 0.007), whereas this association is insignificant for patients with low PD-L1 expression. We examined another NSCLC cohort (the Acquired cohort) from Anagnostou et al. (2) and Gettinger et al. (22). All these patients achieved partial response, except for one patient who exhibited stable disease after initial checkpoint inhibitor treatment. All patients developed acquired resistance after 4 to 40 months. Patients with sustained response were more likely to have higher CSiN scores than patients with short-term progression (Fig. 2H, P = 0.015). Last, another cohort of NSCLC patients on anti–PD-1 therapies from Rizvi et al. (4) was analyzed. In Fig. 2I, we showed that more patients with DCB had higher CSiN scores than patients with NDB (P = 0.058). The false discovery rates for all the above mentioned cohorts fall under 10% (table S3).

Fig. 2 Association of CSiN score with checkpoint inhibitor treatment response.

(A) The Van Allen cohort. Eleven patients with clinical benefit (response group), 6 patients with long-term survival (long-survival) group, and 20 patients with minimal or NDB (nonresponse) group. (B) The Snyder cohort. Twenty-seven patients with DCB, and 34 patients with NDB. (C) The Riaz cohort. Three patients with complete response (CR), 12 patients with partial response (PR), 23 patients with stable disease (SD), and 27 patients with progressive disease (PD). (D) The Hugo cohort. Three patients with complete response, 10 patients with partial response, and 13 patients with progressive disease. (E) The Miao cohort. Twelve patients with clinical benefit, 8 patients with intermediate benefit, and 13 without clinical benefit. (F) The IMmotion150 cohort. There were 8 patients with CR, 15 patients with PR, 16 patients with SD, and 16 patients with PD. These patients were treated with atezolizumab and have high Teff signature expression. (G) The Hellmann cohort. There were 23 PD-L1+ (IHC ≥ 3) patients with DCB, and 16 PD-L1+ patients with NDB. (H) The Acquired cohort. There were 8 patients with sustained response (progression < 12 month) and 6 patients with short-term progression (progression > 12 month). (I) The Rizvi cohort. Eleven patients with DCB and 15 patients with NDB. Biopsy and genomics data were obtained close to time of progression for all patients, whereas baseline biopsies were lacking for many patients. For (A) to (I), we tested the association of the dichotomized CSiN scores with the ordered response categories using an ordinal χ2 test. (J) Boxplots of bootstrap P values evaluating the robustness of the predictive performance of CSiN, neoantigen load, and the neoantigen fitness score, with each P value generated from a bootstrap resample of each cohort. Two-sided Wilcoxon signed-rank test was used to compare the bootstrap P values. ***P = 0.0001 to 0.001 and ****P < 0.0001.

To compare the performance of CSiN score with other widely used metrics for neoantigenicity, we also examined the predictive power of neoantigen load (fig. S1) and the neoantigen fitness model (fig. S2) in the same cohorts and by the same statistical tests. For consistency, we used the median scores to split the cohorts. For neoantigen load, we also adopted another cutoff (median + 2× interquartile range), developed by Zehir et al. (38) (results shown in the Supplementary Materials). For this analysis, the neoantigen fitness model was calculated for both class I and class II neoantigens, as was done for CSiN score and neoantigen load. We provided the results of the neoantigen fitness scores calculated using class I 9-mer neoantigens from missense mutations in the Supplementary Materials, as was done in its original report. Overall, the neoantigen load and neoantigen fitness models were not as strongly predictive of treatment response as CSiN. We used bootstrap analysis to evaluate the statistical significance of the improvement of CSiN compared with the other two approaches, which is an accepted methodology for model comparisons (39, 40). In Fig. 2J, we show that CSiN significantly outperformed neoantigen load in seven of the nine cohorts evaluated and neoantigen fitness in 7 of the 9 cohorts. Overall, our results show that CSiN is capable of predicting clinical response to checkpoint inhibitors in immunogenic cancers and demonstrated a significant improvement over existing predictive tools. We also compared the predictive power of CSiN, neoantigen load, and neoantigen fitness models using a survival analysis criterion and observed the same trends. Detailed results are shown in the Supplementary Materials (fig. S13).

Last, we also explored the predictive power of CSiN in other forms of immunotherapies. We generated genomics data for an in-house cohort [the interleukin-2 (IL-2) cohort] of ccRCC patients treated with concurrent IL-2 and stereotactic ablative body radiation (SAbR) treatment. IL-2 has pleiotropic activating effects on cytotoxic T cells, regulatory T cells, and natural killer cells (41). It has been shown that SAbR has multiple immunogenic properties and could enhance the response to IL-2 (42). The neoantigen-mediated cytotoxicity probably partially explains the effects of this regimen. In this cohort, CSiN scores of patients with DCB are higher than CSiN scores of patients with no durable clinical benefit (NDB) with marginal significance (fig. S3A, P = 0.053) and outperform the neoantigen load and the neoantigen fitness models (fig. S3, B and C).

Higher CSiN score predicts more favorable prognosis in immunogenic cancers

To understand the implications of neoantigen heterogeneity for long-term survival of patients, we examined the association between CSiN and prognosis in the RCC, LUAD, LUSC, and SKCM cohorts. We first focused on patients with high levels of T cell infiltration, profiled by our recently published empirically defined tumor microenvironment gene expression signatures (32). We speculated that the neoantigen T cell axis is more likely to be functionally active when T cell infiltration is present in the tumor. In these patients, we indeed observed that higher CSiN scores had a significantly positive association with better survival for RCC (Fig. 3A, P = 0.01), LUAD (Fig. 3B, P = 0.036), LUSC (Fig. 3C, P = 0.024), and SKCM (Fig. 3D, P = 0.038). The false discovery rates for the four cohorts fall under 10% (table S3). However, the overall survival of patients with lower T cell infiltration was indifferent to the levels of CSiN scores, which fits our speculation. We extracted and combined the high T cell infiltration patients from all four cohorts and carried out survival analyses, which again showed that patients with higher CSiN scores had a significantly better overall prognosis (Fig. 3E, P = 3.8 × 10−5). To further exclude the effect of clinical confounders, we performed multivariate survival analysis adjusted by disease type, stage, gender, and age in this combined cohort. The significant association between survival and CSiN was retained (Fig. 3F, P < 0.001).

Fig. 3 Association of CSiN score with overall survival of patients.

(A to E) Kaplan-Meier estimator was used to visualize patient overall survival. P values for log-rank tests are shown. (A) The RCC cohort. (B) The LUAD cohort. (C) The LUSC cohort. (D) The SKCM cohort. (E) The patients identified as having “High T cells” are extracted from each cohort, combined, and tested together. The high and low CSiN score designations follow those in (A to D). (F) Forest plot for the coefficients of the multivariate Cox proportional hazards analysis of the combined cohort in (D). Disease type, pathological stage, gender, age, and the binarized CSiN were included as covariates. The dashed line shows the no effect point. Confidence intervals (95%) were shown as bars. (G) Boxplots of bootstrap P values evaluating the robustness of the prognostic performance of CSiN, neoantigen load, and the neoantigen fitness score, with each P value generated from a bootstrap resample of each cohort. Two-sided Wilcoxon signed-rank test was used to compare the bootstrap P values. ****P < 0.0001.

In contrast, the same analysis for the neoantigen load and the neoantigen fitness models yielded insignificant associations (figs. S4 and S5). We also used bootstrap analysis to evaluate the statistical significance of this comparison. In Fig. 3G, we showed that CSiN significantly outperformed both methods in all four cohorts evaluated. Overall, in concordance with several previous studies that reported a lack of association of higher neoantigen load with better prognosis in several cancer types (1921), our results suggest that the clonal distribution of neoantigens could be more important prognostically.

We also assessed nonimmunogenic cancer types. Pediatric acute lymphocytic leukemia (ALL) is an aggressive childhood tumor type with low neoantigen load. We evaluated a cohort of patients with pediatric ALL (pALL cohort) and observed that CSiN was not predictive of prognosis (fig. S6A, P = 0.584), with the results for neoantigen load shown in fig. S6B and neoantigen fitness shown in fig. S6C. We also studied patients with liver hepatocellular carcinoma (LIHC) from The Cancer Genome Atlas (TCGA) (LIHC cohort), although whether liver cancer is an immunogenic cancer type is still under debate (43, 44). We observed that patients with higher CSiN scores had a nonsignificant trend of better survival than patients with low CSiN scores in the high T cell infiltration subset of patients (fig. S6D, P = 0.165), with the results for neoantigen load shown in fig. S6E and neoantigen fitness shown in fig. S6F. These observations further confirmed that the predictive value of CSiN score is more apparent in tumor types with higher immunogenicity.

DISCUSSION

The major biological insight from this study is that the neoantigen clonal structure in each tumor specimen and the immunogenicity of the neoantigens (represented by the MHC-binding strength in our study) are predictive of response to checkpoint inhibitors and prognosis. These factors may significantly outweigh the impact of the raw neoantigen count. Our comprehensive analyses show that the CSiN score, which describes these properties of the neoantigen profile quantitatively, has substantially better predictive and prognostic performance than other neoantigen-based biomarkers in most of the evaluated cohorts. Our implementations of the CSiN, neoantigen load, and neoantigen fitness indices have considered both MHC class I and class II neoantigens and also neoantigens generated from insertions/deletions and stop-loss mutations. This is different from the original publication of the neoantigen fitness model (16, 27) that only considered 9-mer class I neoantigens generated from missense mutations. We believe that inclusion of all these potential sources of neoantigens is important for a complete characterization of the neoantigen profiles in each patient (analyses for each class of neoantigens only are shown in the Supplementary Materials). In alignment with our findings, McGranahan et al. (13) made a qualitative observation that CTLA-4–resistant tumors could be enriched for subclonal mutations, which may enhance total neoantigen burden but not elicit an effective antitumor response due to the subclonal nature of these neoantigens. Miao et al. (26) also made a similar observation. Our study is distinguished from these earlier reports in that we provided a robust quantitative measurement that was subjected to systematic evaluations, and we also evaluated prognosis in addition to treatment response. Overall, CSiN may serve as a valuable predictive tool for medical oncologists treating patients with checkpoint blockade and has addressed some of the limitations of prior neoantigen-based predictive biomarkers.

CSiN incorporates genetic information that is not captured by the neoantigen load and neoantigen fitness models or by expression-based biomarkers. A number of expression-based biomarkers for immunotherapy have been proposed and validated on different levels, including PD-L1 expression in the tumor microenvironment (45, 46), T helper 1–type chemokine expression (47), and T cell infiltration (48), but CSiN augments and complements, rather than replaces, these biomarkers. We show that CSiN is associated with longer survival only in patients with sufficient T cell infiltration. Similarly in the treatment cohorts, the P values for testing the correlation between CSiN and treatment response in the high T cell/high PD-L1 patient subsets are generally smaller compared with the P values of the full cohorts (Fig. 2 and the Supplementary Materials), although the subset cohorts are much smaller in sample size. These results suggest that it is crucial for all components of the neoantigen-host immune axis to be functionally active, to enable efficient immune elimination of tumor cells. The insignificant association of CSiN with prognosis in the cohorts with cases of less immunogenic tumors including liver cancers and pediatric ALL also supports this notion. Our observations may inspire potential future studies to construct more sophisticated, predictive, and prognostic models that incorporate CSiN, neoantigen load, the neoantigen fitness model, and other biomarkers together for improved performance.

Our results reporting the positive correlation between neoantigens and treatment response in RCCs are interesting. Now, the field is still debating the role of neoantigens in the immune response to RCCs. While RCCs have low neoantigen/mutation loads, Turajlic et al. (49) found that RCCs have the highest number of insertion/deletion mutations on a pan-cancer basis, which tend to encode high-quality neoantigens. In terms of predicting survival after immunotherapy treatment, Samstein et al. (50) reported a marginally significant correlation between tumor mutation burden and progression-free survival, whereas this observation was not made in the phase 3 JAVELIN Renal 101 trial (51). Cherkasova et al. (52) found the reactivation of a group E human endogenous retrovirus in RCCs, which can encode an immunogenic peptide recognizable by cytotoxic T cells. It is highly likely that both neoantigens and self-antigens contribute to the immunogenicity in RCCs. In the future, it will be of interest to investigate in more and larger cohorts whether CSiN is more predictive of patients’ response to immunotherapies when such antigens have been incorporated in the calculation.

One limitation of our study and, to some extent, this field is that we used neoantigens predicted from genomics data for correlation with patient phenotypes. Despite our efforts to validate the neoantigen predictions, it is likely that there are still false-positive and false-negative predicted neoantigens that convoluted our analyses. In future studies, incorporating the genomics-based approach with other methods, such as mass spectrometry (53), may improve the sensitivity and specificity of neoantigen detection and thus further enhance the predictive power of CSiN.

Class I and class II neoantigens represent very different aspects of immune response. But several recent studies have implicated neoantigen-specific CD4+ T cells in direct tumor clearance (5456). Our results also suggest that the inclusion of class II neoantigens is important for treatment response prediction. In the future, it will be interesting to carry out comprehensive research into the roles of class II neoantigens in the tumor microenvironment.

The neoantigen repertoire is in constant dynamic evolution (2, 57), with immunoediting and immunotherapy treatments actively modifying its landscape. CSiN offers a new tool to monitor the neoantigen profiles, where different tumor clones could have different growth advantages subject to the pressure of T cell cytotoxicity determined by each clone’s neoantigen composition. Overall, our work offers a rigorous methodology of predicting response to immunotherapy and prognosis from routine patient samples and should be useful for personalizing medicine in the modern era of immunotherapy.

MATERIALS AND METHODS

Study design

The objective of this research project is to study the implication of the clonal structure of neoantigens for predicting treatment response and prognosis. The research subjects were individual patients with cancer, some of whom received immunotherapy. Because this is a retrospective analysis study, the researchers were not blinded to the allocation labels. We included all available samples and data from either public or private sources in our study. We stopped the data collection on a lock date of 1 September 2019. The endpoints considered were response categories and survival of patients with cancer after baseline assessment and, in some cases, immunotherapy. Usually, just one sample per patient was available. In the uncommon cases of more than one sample collected for an individual patient, we averaged the scores (e.g., CSiN) calculated from each sample.

The CSiN score and the other neoantigen-based metrics used for comparison

The CSiN score considers the pairing between the repertoire of neoantigens and the tumor mutations to which they belong. One way to characterize this property is to average the product of the VAFs of somatic mutations and the number of neoantigens generated by each mutation, normalized by the average VAF and average mutation-specific neoantigen load in each patient. The CSiN score is >1 under favorable pairing and vice versa. This forms the backbone of the final CSiN score. The name CSiN was selected because the pairing of tumor mutations and neoantigens and its effect on the overall score bear analogy to the Cauchy-Schwarz inequality, which describes the upper bound of the product sum of two vectors of real numbers and the condition for the equality to be achieved. Refer to the Supplementary Materials for the implementation details of CSiN and additional analyses involving CSiN, which are not shown in the main text. A cartoon showing the workflow of CSiN is shown in fig. S7.

As a comparison for CSiN scores, we also calculated the neoantigen load and the neoantigen fitness scores. Published studies have calculated neoantigen load in slightly different manners, due to various reasons such as usage of different software for mutation calling (17, 18, 22, 35, 58, 59). Our neoantigen load calculations were based on our own mutation calling and neoantigen calling pipelines when raw genomics data were available and followed the general standards for neoantigen load calculation. Specifically, we counted the total number of neoantigens of both class I and class II (whether class II neoantigens were available from our pipelines or from the original reports when raw genomics data were not available), all HLA alleles, and all registers from all missense, indel, and stop-loss mutations. The neoantigen fitness calculation was performed as described previously (16, 27) but included both class I and class II neoantigens (when class II neoantigens are available). In the Supplementary Materials, we have also shown the results for the neoantigen fitness scores calculated with only class I 9-mer neoantigens from missense mutations.

The QBRC mutation calling pipeline

We used the Quantitative Biomedical Research Center (QBRC) mutation calling pipeline for somatic mutation calling, developed in the QBRC of University of Texas (UT) Southwestern Medical Center. In short, whole exome-sequencing (exome-seq) reads were aligned to the human reference genome by BWA-MEM (60). Picard was used to add read group information, and sambamba was used to mark polymerase chain reaction duplicates. The GATK toolkit (6163) was used to perform base quality score recalibration and local realignment around indels. MuTect (64), VarScan (65), Shimmer, SpeedSeq (66), Manta, and Strelka2 (67) were used to call single-nucleotide polymorphisms (SNPs) and Indels. A mutation that was repeatedly called by any two of these software programs was retained. ANNOVAR was used to annotate SNPs and Indels, and protein sequence changes (68). All SNPs and Indels were combined and only kept if there were at least seven total (wild type and variant) reads in the normal sample and at least three variant reads in the tumor sample. Somatic mutations and germline mutations were annotated according to the VAFs in the normal and tumor samples.

The QBRC neoantigen calling pipeline

We used the QBRC neoantigen calling pipeline for neoantigen calling, which follows similar standards as those pipelines used in previous publications (17, 18, 22, 35, 58, 59). The validity of our neoantigen predictions is demonstrated in the Supplementary Materials. We kept only frameshift, nonframeshift, missense, and stop-loss mutations that would lead to protein sequence changes. We kept only somatic mutations whose VAFs were <0.02 in the normal sample and VAFs of >0.05 in the tumor samples. For class I HLA proteins (A, B, and C), we predicted putative neoantigens of 8 to 11 amino acids in length, and for class II HLA proteins (DRB1 and DQB1/DQA1), we predicted putative neoantigens of 15 amino acids in length. Class I and class II HLA subtypes were predicted by the ATHLATES tool (69), which has been shown to be accurate for these HLA alleles (70). Samples whose total successfully typed HLA alleles (counting both chromosomes) were <8 were regarded as poor-quality data and were left out from downstream analyses. Putative neoantigens with amino acid sequences exactly matching known human protein sequences were filtered out. For class I bindings, the IEDB-recommended mode (http://tools.iedb.org/main/) was used for prediction of binding affinities, whereas for class II binding, NetMHCIIpan embedded in the IEDB toolkit was used. Neoantigens were kept only if the predicted ranks of binding affinities were ≤2%. Tumor RNA sequencing (RNA-seq) data, if available, were aligned to the reference genome using the STAR aligner (71). featureCounts was used to summarize gene expression levels (72). Neoantigens whose corresponding mutations were in genes with an expression level of <1 RPKM (Reads Per Kilobase of transcript, per Million mapped reads) in either the specific exon or the whole transcript were filtered out. For the samples analyzed by our pipeline, we applied the filtering by RNA-seq data on the neoantigen list when RNA-seq data are available. We showed the results for calculating CSiN with only exome-seq data in the Supplementary Materials, which indicated that filtering the neoantigen list by RNA-seq data is important for the predictive performance of CSiN. Last, in accordance with our pipeline, Ott et al. (73) have shown that neoantigens of class I and class II, and also different registers, can all possibly be immunogenic, and IEDB (https://iedb.org/) has documented many immunogenic antigens of different peptide lengths.

Patient cohorts

For all patient cohorts, where approval of access to raw exome-seq and RNA-seq data was obtained, we predicted the somatic mutations and neoantigens using our in-house pipelines. In cases where raw genomics data were not accessible, we analyzed the processed neoantigen and mutation data included in the Supplementary Materials of the original publications. For exploratory analysis and association with overall survival, we collected data from 110 patients with RCC from the Kidney Cancer Program at UT Southwestern Medical Center (32), 94 patients with ccRCC from Sato et al. (23), and 162 patients with ccRCC from TCGA (74) (the RCC cohort); 427 patients with LUAD from TCGA (75) (the LUAD cohort); 389 patients with LUSC from TCGA (76) (the LUSC cohort); 401 patients with melanoma from TCGA (77) (the SKCM cohort); 103 pediatric and young adult T-lineage acute lymphoblastic leukemia patients (the pALL cohort) from Liu et al. (24); and 292 patients with liver cancer from TCGA (78) (the LIHC cohort) (table S1). The top 140 RCC patients, 100 LUAD patients, 100 SKCM patients, and 40 LUSC patients with the highest T cell infiltration were designated as having High T cells, so the more immunogenic tumor types have more patients selected (82).

For association of CSiN scores with immunotherapy response, 10 patient cohorts were collected that are summarized in table S1. Table S1 shows the total number of patients within each cohort used in this study for correlation of treatment response and neoantigen-based predictive markers. In the Van Allen cohort, there are 40 patients with matched RNA-seq and exome-seq data, and three of these patients were removed because of lack of response information. For the Riaz cohort, three patients were removed because two of them have lack of response information (“not evaluated” reported by the original study), and the third patient’s HLA alleles cannot be typed accurately from the sequencing data. In the Miao cohort, two patients were removed because of the HLA typing issue. In the Rizvi cohort, three patients were removed because of lack of response information (not reaching 6-month follow-up), and five patients were removed because of an HLA typing issue. In the Snyder cohort, three patients were removed because the neoantigen data were not available from the original publication (we used neoantigens called by the original study for this cohort because we do not have access to the raw genomics data). In the IMmotion150 cohort, 99 patients on atezolizumab and 50 patients on sunitinib had both exome-seq and RNA-seq. Three patients treated by atezolizumab and four patients treated by sunitinib were further removed because response information was not available. For the Hellmann cohort, the neoantigen lists were generated by the original authors and used by us in this work. There are 74 patients with neoantigen data generated. Only whole exome-seq was performed, and only about half of these patients consented to genomics data sharing, so we decided to use these neoantigens called by the original report. For the Hugo cohort, 28 patients had both exome-seq and RNA-seq data, and two were removed because of an HLA typing issue. Stratified analyses (Fig. 2 and the Supplementary Materials) were performed for the IMmotion150, Van Allen, Hugo, Riaz, and Miao cohorts (with RNA-seq data available) based on ssGSEA analyses (79) of the Teff gene signature (80) to focus on the top 60% patients, and also for the Hellmann cohort based on PD-L1 immunohistochemistry (IHC) level with a cutoff on IHC readings (which are given in integers by the original report for 69 patients) chosen such that the top subset is closest to 60% of the whole cohort. For cases in which multiple types of treatment outcomes were recorded in a cohort, we used the primary criterion used by the original publications. Patient samples that were analyzed but were not shown in Fig. 2 (e.g., IMmotion150 patients on the sunitinib arm) were shown in the Supplementary Materials.

Statistical analyses

All computations and statistical analyses were carried out in the R computing environment. For all boxplots appearing in this study, box boundaries represent interquartile ranges, whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range, and the line in the middle of the box represents the median. For association of CSiN score, neoantigen load, and the neoantigen fitness model with binary/categorical response variables, we tested the association of the dichotomized CSiN scores with the ordered response categories (e.g., complete response→partial response→stable disease→progressive disease) using an ordinal χ2 test. The three types of predictive scores are each split on the median within each cohort, to ensure comparability and avoid the scaling problem. The alternative hypothesis is that patients with better response and survival outcome are more likely to have higher CSiN scores. We used the chisq_test function in the R coin package for this purpose. For survival-type analysis, we adopted the log-rank test for evaluating whether patients with higher CSiN scores had better prognosis. T cell infiltrations and activation of the IFN-γ signature were predicted using the ssGSEA method (79). ssGSEA analysis was performed using the R GSVA package by calling the gsva function with parameter method="ssgsea" and rnaseq=T (81). The forest plot was performed by the forest_model function in the R forestmodel package. For model comparison, 5000 bootstrap resamples of each cohort were generated, and each resample was used to evaluate the predictive or prognostic performance of CSiN, neoantigen load, and the neoantigen fitness score. The P values of 5000 bootstraps of each approach were compared using two-sided Wilcoxon signed-rank test.

SUPPLEMENTARY MATERIALS

immunology.sciencemag.org/cgi/content/full/5/44/eaaz3199/DC1

Materials

Fig. S1. Predictive power of neoantigen load.

Fig. S2. Predictive power of the neoantigen fitness model.

Fig. S3. Association of CSiN (A), neoantigen loads (B), and neoantigen fitness (C) with IL-2/SAbR treatment response in ccRCC patients.

Fig. S4. Prognostic power of neoantigen load.

Fig. S5. Prognostic power of the neoantigen fitness model.

Fig. S6. Association of CSiN, neoantigen loads, and neoantigen fitness with prognosis of patients with pediatric ALL and patients with LIHC.

Fig. S7. Cartoon showing the workflow of calculation of CSiN scores.

Fig. S8. The CSiN plot for the primary tumor of XP397 from the UTSW KCP cohort is shown.

Fig. S9. Validity of neoantigen predictions.

Fig. S10. The average number of neoantigens generated by each type of mutations.

Fig. S11. Demonstrating independence of CSiN from mutation load/neoantigen and transcriptomic-based biomarkers.

Fig. S12. Association of CSiN with metastasis.

Fig. S13. Validating the predictive power of CSiN, neoantigen load and neoantigen fitness model using OS/PFS as the criterion.

Fig. S14. Assessing the intra-tumor heterogeneity of CSiN and neoantigen loads.

Fig. S15. Calculating CSiN with only exome-seq data.

Fig. S16. Using the median + 2 x interquartile range cutoff on neoantigen load.

Fig. S17. Predictive value of class I-specific CSiN and class II-specific CSiN.

Fig. S18. Predictive value of class I-specific neoantigen fitness model measured by survival analyses (limiting to 9-mers from missense mutations).

Fig. S19. Predictive value of class I-specific neoantigen fitness model measured by categorical response variables (limiting to 9-mers from missense mutations).

Fig. S20. Predictive value of CSiN for the patients with high Teff signature expression.

Fig. S21. Predictive value of CSiN for the patients treated by sunitinib and by atezolizumab in the IMmotion150 cohort.

Fig. S22. Predictive value of CSiN for all the patients in the Hellmann cohort.

Fig. S23. Boxplots showing distribution of CSiN scores in quartiles of tumor clone number determined by pyclone.

Table S1. The patient cohorts used in this study.

Table S2. Processed mutation, expression and neoantigen data of the IL-2 cohort (in Excel spreadsheet).

Table S3. P values and false discovery rates of the tested cohorts shown in Figs. 2 and 3.

Data file S1. Raw data file for Figs. 1 to 3 (in Excel spreadsheet).

REFERENCES AND NOTES

Acknowledgments: We acknowledge J. Norris for helpful comments on the writing of the paper. We acknowledge Y. Wang for helpful comments on the scientific contents. We acknowledge the Wakeland sequencing core of UT Southwestern for generating genomics data for the IL-2 cohort. We acknowledge the authors of phs000452.v2.p1, phs001493.v1.p1, phs001464.v1.p1, phs000218.v1.p1, and phs000464.v15.p7 datasets, as well as the funding agencies that supported these studies and dbGaP that supported the archival of these datasets. Funding: This study was supported by grants from the NIH (R03 ES026397-01 to T.W., SPORE 1P50CA19651601 to J.B. and T.W., and CCSG 5P30CA142543 to T.W.), the Cancer Prevention Research Institute of Texas (CPRIT RP180319 to L.X. and CPRIT RP190208 to T.W.), and American Cancer Society (RSG-16-004-01-CCE to R.H.). Author contributions: T.L., S.W., Q.Z., and Z.X. contributed to computational analyses. T.W. contributed to the overall supervision of the project. L.X. and J.B. contributed to data access. S.M., L.P., M.C., and R.H. contributed to the collection of samples and generation of sequencing data for the IL-2 cohort. L.X., N.S., J.G., J.J.L., J.B., R.H., and T.W. contributed to study design and manuscript writing. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The QBRC mutation calling and neoantigen calling pipelines are available on GitHub: https://github.com/tianshilu/QBRC-Somatic-Pipeline and https://github.com/tianshilu/QBRC-Neoantigen-Pipeline. The CSiN calculations were embedded in the neoantigen pipeline. Table S1 lists the data source and/or accession methods of these studies. The raw RNA-seq and exome-seq data of the IL-2 cohort patients can be downloaded from the European Genome-phenome Archive with accession number EGAS00001003605 through controlled access. The processed mutation and neoantigen data of the IL-2 cohort can be found in table S2.

Stay Connected to Science Immunology

Navigate This Article