Research Article
ANTIGEN PRESENTATION

A subset of HLA-I peptides are not genomically templated: Evidence for cis- and trans-spliced peptide ligands

Science Immunology  12 Oct 2018:
Vol. 3, Issue 28, eaar3947
DOI: 10.1126/sciimmunol.aar3947
  • Fig. 1 The nature of cis- and trans-spliced peptides and their identification from HLA immunopeptidomes sequenced by mass spectrometry.

    (A) Cartoon representation of (left) cis-spliced and (right) trans-spliced peptide generation. Cis-spliced peptides are formed after cleavage and ligation of segments from the same source protein; trans-spliced peptides are formed after ligation of cleaved segments from two different source proteins. Such peptides may then be subjected to HLA antigen presentation pathways for display at the cell surface. (B) Workflow for the identification of linear and spliced peptides. From high-quality MS/MS spectra, an initial de novo–assisted database search (using the reference human proteome) was carried out, filtering the data at a 1% false discovery rate (FDR). Subsequently, all high-quality de novo–only sequenced peptides (the top five sequences per spectrum) that fell below this threshold were searched using our in-house algorithm, which ranks each peptide hierarchically by whether it has a linear > cis > trans explanation (peptides with none of these explanations were discarded). The top-ranked peptides for each spectrum were then built into a custom FASTA-formatted database and merged with the human proteome, and the original MS/MS data were re-searched against it, with identifications passing the 1% FDR cutoff taken as the final output.
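
    The linear > cis > trans hierarchy in (B) can be illustrated with a minimal sketch. This is not the authors' in-house algorithm, which is not detailed here; the function names, the `min_segment` parameter, and the toy proteome are hypothetical. The sketch assumes two-segment splicing only, treats Ile and Leu as identical (they have the same mass), and does not check whether cis segments overlap within the source protein.

```python
from typing import Dict, Optional, Tuple


def normalize(seq: str) -> str:
    """Map Ile to Leu, since the two residues are isobaric."""
    return seq.replace("I", "L")


def classify_peptide(
    peptide: str,
    proteome: Dict[str, str],
    min_segment: int = 2,  # hypothetical lower bound on segment length
) -> Tuple[str, Optional[Tuple[str, str]]]:
    """Return the first explanation found in the order linear > cis > trans."""
    pep = normalize(peptide)
    prots = [normalize(seq) for seq in proteome.values()]

    # 1. Linear: the whole peptide is a contiguous stretch of one protein.
    if any(pep in seq for seq in prots):
        return "linear", None

    # All two-segment splits that respect the minimum segment length.
    splits = [
        (pep[:i], pep[i:])
        for i in range(min_segment, len(pep) - min_segment + 1)
    ]

    # 2. Cis: both segments occur in the same protein.
    for left, right in splits:
        if any(left in seq and right in seq for seq in prots):
            return "cis", (left, right)

    # 3. Trans: each segment occurs in some protein (not the same one,
    #    otherwise the cis step above would already have matched).
    for left, right in splits:
        if any(left in s for s in prots) and any(right in s for s in prots):
            return "trans", (left, right)

    # No explanation: the candidate is discarded.
    return "discarded", None


# Toy proteome, for illustration only.
toy = {"P1": "MKTAYLAKQRQ", "P2": "GSFDYWWVHLW"}
print(classify_peptide("TAYIAK", toy))
print(classify_peptide("MKTQRQ", toy))
print(classify_peptide("TAYLVHLW", toy))
```

    Substring tests keep the sketch short; a search over a full proteome would more plausibly rely on an index of segments rather than repeated scans.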

  • Fig. 2 Identification, length distribution, and motif analysis of linear and spliced peptides by a combined de novo library searching hybrid workflow approach.

    (A and B) More than 50,000 peptides eluted from eight HLA-A–expressing and nine HLA-B–expressing monoallelic cell lines were sequenced and defined as either linear or spliced in origin. (C) Proportion of linear, cis-, or trans-spliced peptides contributing to each HLA allelic dataset. (D) Length [number of amino acids (aa)] distribution of all identified linear and spliced peptides (****P < 0.0001, two-way multiple-comparison ANOVA test). (E) Motif analysis for 9-mer and 10-mer linear and spliced peptides, showing the percentage of enriched amino acids (if greater than 10%) at each of positions P2, P3, and PΩ. (Note that in spliced peptides, L stands for both leucine and isoleucine.) (F) Pearson correlation (r) between the amino acids enriched in linear and spliced peptides for each allele at each of positions P2, P3, and PΩ (P < 0.05 for all correlations).
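
    The per-position tally behind a motif plot such as (E) can be sketched as follows. This is an illustrative reading of the caption rather than the authors' code: the function name `enriched_residues`, the key `P_omega` (standing in for PΩ, the C-terminal residue), and the example peptides are all hypothetical, and the input is assumed to be peptides of a single length.

```python
from collections import Counter
from typing import Dict, List


def enriched_residues(
    peptides: List[str], threshold: float = 10.0
) -> Dict[str, Dict[str, float]]:
    """Percentage of each residue at P2, P3, and the C terminus,
    keeping only residues above `threshold` percent."""
    # Zero-based indices: P2 -> 1, P3 -> 2, C-terminal residue -> -1.
    positions = {"P2": 1, "P3": 2, "P_omega": -1}
    result: Dict[str, Dict[str, float]] = {}
    for label, idx in positions.items():
        counts = Counter(p[idx] for p in peptides)
        total = sum(counts.values())
        result[label] = {
            aa: 100.0 * n / total
            for aa, n in counts.items()
            if 100.0 * n / total > threshold
        }
    return result


# Made-up 9-mers, for illustration only.
example = ["ASAAAAAAW"] * 9 + ["ATAAAAAAF"]
print(enriched_residues(example))
```

    In this made-up set, T at P2 and F at the C terminus each sit at exactly 10% and are therefore dropped, matching the "greater than 10%" cutoff stated in the caption.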

  • Fig. 3 Validation of identified cis- and trans-spliced peptides from C1R-HLA-B*57:01 cells.

    (A) LC-MS/MS spectra from two cis- and four trans-spliced B*57:01 peptides, comparing eluted (upper panel for each peptide) and synthetic (lower panel for each peptide) versions. Characteristic y and b ions are highlighted for each peptide. Data are displayed as relative intensity. m/z, mass/charge ratio. (B) Pearson correlation (r) of spectra from 28 eluted B*57:01 spliced peptides compared with their synthetic versions. (C) Stabilization of cellular HLA-B*57:01 molecules (T2-B*57:01 cell line) by a panel of synthetic spliced peptides and the EBV-derived B*57:01-restricted positive control peptide IALYLQQNW. No peptide [dimethyl sulfoxide (DMSO)] and the non-B*57:01–restricted peptide YLNEKAVSY were used as negative controls. Peptides were tested at the indicated concentrations, and surface HLA was detected by flow cytometry using the B*57:01-specific antibody 3E12. Data show median fluorescent intensity (MFI) from three independent replicates, and error bars show mean and SD. (D) Conformations of four spliced peptides (cis-LALLTG + VRM, trans-TSMSF + VPRPW, trans-GSFDY + SGVHLW, and trans-LSDSTA + RDVTW) presented by HLA-B*57:01. Cartoon representation of the peptide-binding groove of HLA-B*57:01 (cyan). The α2 helix has been rendered transparent to improve visibility of the peptide. Spliced peptides are represented as colored sticks, with the N-terminal segment in yellow and the C-terminal segment in purple, with individual residues as indicated. Note that, for peptide GSFDYSGVHLW, there were two possible explanations for the segments, GSFDY + SGVHLW or GSFDYS + GVHLW, and the former is indicated.

  • Fig. 4 Spliced junction amino acid bias and analysis of donor segment frequency.

    (A) A subset (3029) of spliced peptides (from all 17 analyzed HLA-A and HLA-B alleles) with only one possible splicing explanation were assessed for amino acid bias at the P1 and P1′ positions (left). The central amino acid pairs from 31,267 identified linear peptides (middle) and adjacent amino acids from the center of 31,000 randomly generated peptide sequences (conforming to the amino acid frequency distribution of the human proteome; right) were used for comparison. Heat map frequency colors are as indicated per dataset, and amino acids are colored according to broad physicochemical characteristics. All Ile residues were replaced with Leu. (B) Number of occurrences for each possible segment of a dataset of 5806 spliced nonamers, calculated from the UniProt reference human proteome. (C) Permutations, calculated by multiplying together the numbers of occurrences for each given segment, for generating each of the same set of 5806 spliced nonamers. For (B) and (C), data show box plots with whiskers set to the 1st to 99th percentiles.
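
    The quantities in (B) and (C) amount to a count per segment and a product per peptide. A minimal sketch of that arithmetic follows, assuming overlapping matches are counted and using hypothetical function names and a made-up proteome; whether the authors counted overlapping matches or applied the Ile-to-Leu substitution at this step is not stated in the caption and is not modeled here.

```python
from typing import Dict, Tuple


def count_occurrences(segment: str, proteome: Dict[str, str]) -> int:
    """Count every occurrence of `segment` across all proteins,
    including overlapping ones (an assumption of this sketch)."""
    total = 0
    for seq in proteome.values():
        start = seq.find(segment)
        while start != -1:
            total += 1
            start = seq.find(segment, start + 1)
    return total


def splice_permutations(
    left: str, right: str, proteome: Dict[str, str]
) -> Tuple[int, int, int]:
    """Occurrences of each segment and their product, i.e. the number of
    source-segment pairings that could yield the spliced peptide."""
    n_left = count_occurrences(left, proteome)
    n_right = count_occurrences(right, proteome)
    return n_left, n_right, n_left * n_right


# Made-up proteome, for illustration only.
toy = {"P1": "AAAKLM", "P2": "KLMAAKLM"}
print(splice_permutations("AA", "KLM", toy))
```

    Short segments recur often in a full proteome, so the product grows quickly as either segment shrinks, which is the reason a single spliced peptide can have many candidate origins.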

  • Fig. 5 Cartoon model for the increased p-HLA display engendered by peptide splicing.

    Although conventional linear peptides allow sampling of only limited regions of the proteome for any given HLA allele, we propose that the combined actions of cis- and trans-splicing enable a greater proportion of the cellular proteome to be displayed for scrutiny by T cells.

Supplementary Materials

  • immunology.sciencemag.org/cgi/content/full/3/28/eaar3947/DC1

    Methods

    Fig. S1. De novo library hybrid workflow for identification of cis- and trans-spliced p-HLA.

    Fig. S2. De novo sequencing algorithm evaluation for p-HLA identification.

    Fig. S3. Motif analysis of 9– and 10–amino acid length peptides of p-HLA eluted from 17 different monoallelic cell lines.

    Fig. S4. NetMHC binding prediction of linear and spliced peptides.

    Fig. S5. Comparison of MS/MS spectra of synthetic peptides versus their corresponding eluted peptide.

    Fig. S6. Relative quantification of spliced and linear p-HLA eluted from C1R-B*57:01 cells.

    Fig. S7. Ratio of observed versus expected paired amino acids in spliced junctions.

    Fig. S8. The effect of adding PTMs to the library search in the de novo library hybrid workflow on the identification of spliced peptides.

    Table S1. Percentage of peptides (linear or spliced) matching to RNA-seq data.

    Table S2. Pearson correlation information for comparison between synthetic peptides and corresponding eluted p-HLA from C1R-B*57:01 cells.

    Table S3. Data collection and refinement statistics for p-B*57:01 crystal structures.

    Table S4. Sequences of 8-mer to 12-mer of linear and spliced peptides for all 17 allelic datasets.

    Table S5. Raw data file.

    References (39–51)

  • Supplementary Materials

    The PDF file includes the Methods, figs. S1 to S8, tables S1 to S3, legends for tables S4 and S5, and references (39–51), with titles as listed above.

    Other Supplementary Material for this manuscript includes the following:

    • Table S4 (Microsoft Excel format). Sequences of 8-mer to 12-mer of linear and spliced peptides for all 17 allelic datasets.
    • Table S5 (Microsoft Excel format). Raw data file.