Research Article | CORONAVIRUS

SARS-CoV-2 drives JAK1/2-dependent local complement hyperactivation


Science Immunology  07 Apr 2021:
Vol. 6, Issue 58, eabg0833
DOI: 10.1126/sciimmunol.abg0833

COVID-19 hyperactivates complement

The complement system is a series of innate immune system proteins that help antibodies and phagocytes identify and eliminate pathogens. Activation of the complement system correlates with COVID-19 severity, but the cells that produce complement and potential treatments to inhibit complement activation during SARS-CoV-2 infection are not known. Here, Yan et al. used transcriptomics from patient bronchoalveolar lavage and infection models to demonstrate that SARS-CoV-2 infection induced complement-related genes and the activated complement component C3a in respiratory epithelial cells. C3a production was tied to IFN-induced JAK1/2-STAT1 signaling. Inhibition of JAK1/2 with ruxolitinib, with or without an antiviral, or a cell-permeable complement inhibitor repressed C3a production in SARS-CoV-2–infected epithelial cells. Thus, the use of JAK1/2 inhibitors may inhibit pathologies in patients with severe COVID-19.


Patients with coronavirus disease 2019 (COVID-19) present a wide range of acute clinical manifestations affecting the lungs, liver, kidneys, and gut. Angiotensin-converting enzyme 2 (ACE2), the best-characterized entry receptor for the disease-causing virus SARS-CoV-2, is highly expressed in the aforementioned tissues. However, the pathways that underlie the disease are still poorly understood. Here, we unexpectedly found that the complement system was one of the intracellular pathways most highly induced by SARS-CoV-2 infection in lung epithelial cells. Infection of respiratory epithelial cells with SARS-CoV-2 generated activated complement component C3a and could be blocked by a cell-permeable inhibitor of complement factor B (CFBi), indicating the presence of an inducible cell-intrinsic C3 convertase in respiratory epithelial cells. Within cells of the bronchoalveolar lavage of patients, distinct signatures of complement activation in myeloid, lymphoid, and epithelial cells tracked with disease severity. Genes induced by SARS-CoV-2 and the drugs that could normalize these genes both implicated the interferon-JAK1/2-STAT1 signaling system and NF-κB as the main drivers of their expression. Ruxolitinib, a JAK1/2 inhibitor, normalized interferon signature genes and all complement gene transcripts induced by SARS-CoV-2 in lung epithelial cell lines but did not affect NF-κB–regulated genes. Ruxolitinib, alone or in combination with the antiviral remdesivir, inhibited C3a protein produced by infected cells. Together, we postulate that combination therapy with JAK inhibitors and drugs that normalize NF-κB signaling could potentially have clinical application for severe COVID-19.


Coronavirus disease 2019 (COVID-19), a viral pneumonia caused by a beta coronavirus named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is now a pandemic. Patients with COVID-19 present variable clinical symptoms, ranging from a mild upper respiratory tract illness to a severe disease with life-threatening complications, characterized by combinations of acute respiratory distress syndrome (ARDS); coagulopathy; vasculitis; and kidney, liver, and gastrointestinal injuries (1). Survivors, and those with milder presentations, may suffer from loss of normal tissue function due to persistent inflammation and/or fibrosis (2, 3). The pathogenesis of COVID-19 and the causes of its variable severity are poorly understood; thus, a better mechanistic understanding of the disease will help identify at-risk patients and allow for the development and refinement of much-needed treatments.

The complement system is an evolutionarily conserved component of innate immunity, required for pathogen recognition and removal (4). The key components are complement 3 (C3) and C5, which circulate in their proenzyme forms in blood and interstitial fluids. C3 is activated through the classical (antibody signal), lectin (pattern recognition signal), and/or alternative (altered-self and tick-over) pathways into bioactive C3a and C3b via cleavage by an enzyme complex called C3 convertase. Complement factor B (CFB) is a key component of the alternative pathway C3 convertase. C3b generation triggers subsequent activation of C5 into C5a and C5b, with the latter seeding the formation of the lytic membrane attack complex on pathogens or target cells. C3a and C5a are anaphylatoxins and induce a general inflammatory reaction by binding to their respective receptors, C3a receptor (C3aR) and C5aR1, expressed on immune cells. C3b binds its canonical receptor, CD46, which is expressed on nucleated cells and acts as both a complement regulator and a driver of T helper 1 differentiation in CD4+ T cells (5, 6). Although the traditional view of complement is as a hepatocyte-derived and serum-effective system, the complement system is also expressed and biologically active within cells (7).

Patients with severe COVID-19 have high circulating levels of terminal activation fragments of complement (C5a and sC5b-9) (8–10), which correlate with disease severity (8). Single-nucleotide variants in two complement regulators, decay accelerating factor (CD55) and complement factor H, are risk factors for morbidity and mortality from SARS-CoV-2 (11). This is concordant with a recent report, which shows that serum C3 hyperactivation is an independent risk factor for in-hospital mortality (12). Despite these reports, the mechanisms behind the overactivation and conversion of the normally protective complement system into a harmful component of COVID-19 are currently unclear.

Here, we examined the transcriptomes of respiratory epithelial cells infected with SARS-CoV-2 and found that the complement system was one of the intracellular pathways most highly induced in response to infection. C3 protein was processed to active fragments through expression of an inducible alternative pathway convertase (CFB), and this processing was normalized by a cell-permeable inhibitor of CFB. Interferon (IFN) signaling via the Janus kinase 1/2 (JAK1/2)–signal transducer and activator of transcription 1 (STAT1) pathway was principally responsible for transcription of complement genes in this setting, and ruxolitinib, a JAK1/2 inhibitor, alone or in combination with remdesivir, an antiviral agent, normalized this transcriptional response and production of processed C3 fragments from infected cells.


SARS-CoV-2 infection activated complement transcription in lung epithelial cells

To gain insights into the pathophysiologic mechanisms of COVID-19, we sourced bulk RNA sequencing (RNA-seq) data from lung tissues of two patients with SARS-CoV-2 infection and uninfected controls (table S1A) (13). We compared the transcriptomes of patients with controls using gene set enrichment analysis (GSEA) (14) and found 36 canonical pathways curated by the Molecular Signatures Database (MSigDB) to be induced in patients compared with controls (Fig. 1A and table S1B). Five of the 36 (14%) enriched pathways were annotated as complement pathways. Traditionally, complement is considered a mostly hepatocyte-derived and serum-effective system (4). Thus, the dominance of the SARS-CoV-2–induced lung cell–intrinsic complement signature was unexpected.
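The enrichment analysis above can be illustrated with a minimal sketch of the running-sum statistic that underlies GSEA. This is a simplified, unweighted version written for illustration only: the published GSEA method (14) uses a weighted statistic and permutation-based FDR estimation, neither of which is reproduced here, and the ranked gene list and the placeholder names (`GENE_A` to `GENE_D`) are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov-like).

    ranked_genes: unique gene names ordered from most induced to most repressed.
    gene_set: gene names annotated to one pathway.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up on a pathway member, step down on any other gene.
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking in which complement genes sit near the top of the list.
ranked = ["C3", "CFB", "C1S", "GENE_A", "C1R", "GENE_B", "GENE_C", "GENE_D"]
complement = ["C3", "CFB", "C1R", "C1S"]
es_induced = enrichment_score(ranked, complement)
es_repressed = enrichment_score(list(reversed(ranked)), complement)
```

A positive score indicates that pathway members cluster among the most induced genes, and a negative score indicates clustering among the most repressed genes.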

Fig. 1 SARS-CoV-2 infection activated complement transcription in lung epithelial cells.

(A and B) Significantly enriched pathways by GSEA comparing transcriptomes of lung samples from patients infected with SARS-CoV-2 (n = 2) versus uninfected controls (A) and similar GSEA analyses on NHBE cells infected in vitro, or not, with SARS-CoV-2 (n = 3) (B). (C) GSEA of A549 cells transduced with ACE2 (A549-ACE2) or not (A549), comparing cells infected with SARS-CoV-2 versus control cells (n = 3 or 4). Pathways in (A) to (C) were ranked by significance (FDR q values), with complement pathways highlighted in red. Only enriched pathways with FDR of <0.25 are shown. (D and E) Comparison of all pathways significantly induced (FDR q value of <0.25) by SARS-CoV-2 in patients (A), NHBE cells (B), and A549 and A549-ACE2 cells (C), indicating 14 shared enriched pathways (D) and their normalized enrichment score (NES) displayed as a heatmap, with complement pathways highlighted in red (E). (F and G) Representative GSEA plot for one of the complement pathways in (E) and expression of the leading-edge genes from this pathway, with C3, C1R, C1S, and CFB highlighted in red (G). (H) Expression of CFB (top) and C3 (bottom) mRNA in control (Ctrl.) versus SARS-CoV-2–infected cells. (I) Spearman correlation between C3 mRNA expression and SARS-CoV-2 viral load across virus-bearing samples in (A) to (H). ppm, parts per million mapped reads. Data have been sourced from GSE147507. *P < 0.05, **P < 0.01, and ***P < 0.001, by ANOVA.

Because the patient lung biopsy samples contained a mixed population of lung cells, we next defined the cellular source of the complement signature in the affected lungs. To this end, we examined the transcriptomes of primary normal human bronchial epithelial (NHBE) cells infected in vitro with SARS-CoV-2, which again identified several complement pathways as highly enriched in infected cells. Hierarchical classification of enriched pathways by significance [false discovery rate (FDR) q value] showed that complement pathways were among the most highly enriched of all pathways after SARS-CoV-2 infection (Fig. 1B). One of the cell types infected by SARS-CoV-2 is type II pneumocytes, which are high expressors of angiotensin-converting enzyme 2 (ACE2), the best-characterized entry receptor for the virus (15). We, therefore, examined the transcriptomes of A549 cells, which have properties of type II human pneumocytes (16, 17), infected with SARS-CoV-2 and A549 cells first transduced to express high levels of ACE2. Complement pathways were among the most highly enriched, one of which was the most significantly induced pathway in ACE2-transduced A549 cells (Fig. 1C) (13). This response was much more pronounced for SARS-CoV-2, because analysis of RNA-seq of influenza A–infected NHBE or influenza A– or respiratory syncytial virus (RSV)–infected A549 cells did not induce such marked pathway enrichment (fig. S1, A and B, and table S1B), although viral loads in infected samples were comparable (fig. S1C).

To further pinpoint common modes of function, we compared all SARS-CoV-2–induced pathways among the four sample types infected with this virus: patient lung biopsies and NHBE, A549, and A549-ACE2 cells. Of the 14 pathways that were significantly induced by SARS-CoV-2 in all sample types, 4 of them were complement-related (Fig. 1, D and E). The other shared pathways predominantly included antiviral responses, especially type I IFNs (Fig. 1E). Taking the Kyoto Encyclopedia of Genes and Genomes (KEGG) complement and coagulation pathway, we noted that genes whose transcription was most highly induced by SARS-CoV-2 were encoding components of the C1 proteases C1R and C1S, CFB, and complement C3 (Fig. 1, F to H, and fig. S1D). C1 proteases are initiators of the classical pathway of complement activation, CFB is essential for the formation of alternative pathway C3 convertase (C3bBb) that activates C3, and C3 is the fundamental rate-limiting substrate for both (4). These data were supported by apparent dose dependency between SARS-CoV-2 viral loads in infected samples and C3 expression (Fig. 1I). To further test our conclusions, we analyzed transcriptomes of human bronchial organoids (hBOs) infected with SARS-CoV-2. These also showed that genes more highly expressed in infected hBOs were enriched in complement genes, including C3 and CFB (fig. S2, A and B). A recent single-cell RNA-seq study of human bronchial epithelial cells infected, or not, with SARS-CoV-2 in air-liquid interface cultures has identified eight cell types, four of which are actively infected by the virus (basal cells, basal/club intermediate cells, club cells, and ciliated cells) (18). Using these data, we carried out GSEA on the ranked list of differentially expressed genes, provided by the authors, in each cell type and looked for the enrichment of hallmark gene sets curated by MSigDB. 
We found that hallmark complement pathway genes were enriched in only the four cell types infected with SARS-CoV-2 but in none of the uninfected cells (fig. S2C). The complement pathway was particularly induced in club cells (fig. S2C) (19). C3 was one of the most significantly enriched genes within the leading edge (fig. S2, D and E). Proteomic analysis of mass spectrometry data from A549-ACE2 cells infected, or not, with SARS-CoV-2 confirmed the increased production of C3 protein after infection (fig. S3), consistent with the observed transcriptomic data. Collectively, these data suggested that the complement system was one of the top pathways activated by SARS-CoV-2 in lung epithelial cells.
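The comparison of induced pathways across the four sample types amounts to intersecting the sets of pathways that pass the FDR threshold in each. The following is a minimal sketch of that step, assuming each GSEA result is available as a mapping from pathway name to FDR q value; the pathway names and q values shown are hypothetical placeholders, not values from the study.

```python
def shared_enriched_pathways(results, fdr_cutoff=0.25):
    """Return the pathways with FDR q < fdr_cutoff in every sample type.

    results: dict mapping sample type -> {pathway name: FDR q value}.
    """
    significant = [
        {name for name, q in table.items() if q < fdr_cutoff}
        for table in results.values()
    ]
    return set.intersection(*significant) if significant else set()


# Hypothetical GSEA output for the four sample types compared in the text.
results = {
    "patient_lung": {"COMPLEMENT_A": 0.01, "IFN_ALPHA_BETA": 0.02, "PATHWAY_X": 0.30},
    "NHBE": {"COMPLEMENT_A": 0.05, "IFN_ALPHA_BETA": 0.10, "PATHWAY_X": 0.01},
    "A549": {"COMPLEMENT_A": 0.20, "IFN_ALPHA_BETA": 0.001},
    "A549_ACE2": {"COMPLEMENT_A": 0.001, "IFN_ALPHA_BETA": 0.24, "PATHWAY_Y": 0.02},
}
shared = shared_enriched_pathways(results)
```

Only pathways that are significant in all four tables are retained, which mirrors the logic of the shared-pathway comparison in Fig. 1, D and E.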

C3 protein was processed to active forms in SARS-CoV-2–infected cells

C3 is the rate-limiting step of distal complement component activation (C5 to C9), and its own processing generates the biologically active fragments C3a and C3b (4). C3 is activated through an enzyme complex called C3 convertase (4). Viral induction of both C3 and CFB within epithelial cells (Fig. 1, F to H) suggested the presence of an intracellular C3 convertase capable of processing C3 to its active fragments. To determine whether C3 protein is activated to the C3a fragment in infected cells, we infected Calu-3 cell lines and primary human induced pluripotent stem cell–derived alveolar epithelial type 2 cells (iAEC2s) with SARS-CoV-2 and measured C3a by confocal imaging. In both Calu-3 and iAEC2s, we observed minimal C3a in mock-infected cells (Fig. 2, A to D). In cultures treated with SARS-CoV-2, infected cells had significantly increased intracellular C3a compared with both uninfected and mock-infected cells (Fig. 2, A to D). There was a direct linear correlation between SARS-CoV-2 N-protein expression and C3a levels in both cell types (Fig. 2, E and F), indicating a relationship between C3 activation in lung epithelial cells and viral load. Collectively, these data showed that respiratory epithelial cells were a source of complement C3 and its active products.
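The per-cell relationship between N-protein and C3a intensity is summarized in Fig. 2, E and F, by Pearson correlation coefficients. A minimal sketch of that coefficient is given below for reference; the intensity values are hypothetical, and the imaging and segmentation steps that produce per-cell intensities are not represented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-cell intensities (arbitrary units).
n_protein = [1.0, 2.0, 3.0, 4.0]
c3a = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(n_protein, c3a)
```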

Fig. 2 SARS-CoV-2 infection generated C3a protein in lung epithelial cells.

(A to D) Confocal images (A and C) and quantification (B and D) from n = 2 independent experiments showing expression of C3a and SARS-CoV-2 N-protein in SARS-CoV-2–treated or mock-infected Calu-3 cells (A and B) or iAEC2s (C and D). Scale bars in (A) and (C), 100 μm. Cell numbers are indicated below each violin, and median values are denoted by dots in (B) and (D). (E and F) Correlation between SARS-CoV-2 N-protein intensity and C3a intensity on a per-cell basis in Calu-3 cells (E) and iAEC2s (F). Indicated are Pearson correlation coefficients and associated P values. Infected and uninfected cells in (B) and (D) have been distinguished by red and blue fills, respectively. ****P < 0.0001 by ANOVA. MOI, multiplicity of infection.

SARS-CoV-2 infection invoked distinct complement signatures across immune and epithelial cells in patients

To obtain insight into the interactions between SARS-CoV-2 and the complement system in vivo, we analyzed publicly available single-cell RNA-seq data from patients with COVID-19 (20). Bronchoalveolar lavage (BAL) samples from patients with mild (n = 3) and severe (n = 3) COVID-19 were compared with lung biopsy samples from uninfected individuals (n = 8). Clustering across all cells revealed three major cell types of myeloid, lymphoid, and epithelial origin, with seven apparent sub-cell types (Fig. 3, A and B). We distinguished alveolar type I (AT1) and type II pneumocytes (AT2) (Fig. 3, A and B). We found that expression of C3 was highest in AT2 cells (Fig. 3C and left and top of fig. S4, A and B), which have high expression of ACE2 and are major targets of primary SARS-CoV-2 infection (15). Although absolute cellularity was different between patients and uninfected subjects because of differences in tissue source (BAL versus lung biopsy, respectively), this difference had minimal effect on our observations. Expression of C3 was significantly higher in AT2 cells of patients with COVID-19 than those of uninfected donors (Fig. 3C), indicating that coronavirus infection of these cells induces C3 gene transcription in vivo. In an independent (bulk) RNA-seq dataset of bronchoalveolar fluid cells from patients with COVID-19 (n = 8) and uninfected controls (n = 20), we found similar enrichment of complement genes in cells from patients, including significantly higher expression of C3 and CFB (fig. S5, A and B). Likewise, lung samples collected at autopsy from patients with COVID-19 showed a positive linear relationship between SARS-CoV-2 viral loads and C3 mRNA expression (fig. S6), consistent with the dose dependency observed between these two parameters on experimental SARS-CoV-2 infection (Fig. 1I).

Fig. 3 SARS-CoV-2 infection invoked distinct complement signatures across lymphoid, myeloid, and epithelial cells in patients.

(A) UMAP showing three major cell types and seven sub-cell types in uninfected subject lung biopsies (n = 8) and COVID-19 BAL specimens from patients with mild (n = 3) and severe (n = 3) COVID-19. (B) Expression of cell-defining features across all cell types. (C) Expression of C3, C3AR1, and CD46 in select cell types across uninfected, mild, and severe COVID-19 samples (see also fig. S4, A and B, for all cell types). (D and E) UMAP projection (D) and module (Mod) score (54) (E) of CD46-regulated genes (top), C3aR1-regulated genes (middle), and IFN-α/β–regulated genes (bottom) (see table S2). In (E), selected cell types are shown. Single-cell data are from GSE145926 and GSE122960. ***P < 0.001 and ****P < 0.0001 by Wilcoxon test.

The biologically active components of C3, C3a and C3b, bind their cognate receptors C3aR and CD46, respectively, on leukocyte subsets to activate these cells and drive inflammation. As expected, we saw high expression of C3AR1, the gene encoding C3aR protein, on myeloid cells (Fig. 3C, middle, and middle of fig. S4, A and B) and CD46 on lymphoid cells (Fig. 3C, right, and bottom of fig. S4, A and B). To determine whether C3 within lung tissues is biologically active, we looked for the signature of genes regulated by C3aR and CD46. We curated a list of genes regulated by CD46 in lymphoid cells (table S2). Expression of CD46-regulated genes was significantly higher in lung lymphoid cells of patients and was higher in more severe cases (Fig. 3, D and E, top). In an independent bulk RNA-seq dataset of bronchoalveolar fluid cells from patients with COVID-19, we found similar enrichment of CD46-regulated genes in patient cells compared with uninfected controls (fig. S7). Similarly, we compiled a list of genes regulated by C3aR in myeloid cells (table S2). C3aR-regulated genes were significantly more expressed in monocyte/macrophage cells of patients compared with those from uninfected individuals (Fig. 3, D and E, middle). We also analyzed single-cell RNA-seq of circulating immune cells within peripheral blood mononuclear cells (PBMCs) of patients and healthy controls. In contrast to what we had seen in the lungs, C3 was minimally expressed by circulating immune cells, and the signatures of CD46 and C3aR activation were absent in patients’ cells (fig. S8, A to C). Collectively, these data indicated that C3 was produced locally in the lungs of patients with COVID-19 and processed to active fragments that acted on their cognate receptors to drive inflammation.
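The receptor-regulated gene signatures above are summarized per cell as module scores (Fig. 3E). The exact scoring procedure is the one cited in the figure legend (54) and is not reproduced here; the sketch below illustrates only the general idea, assuming a score defined as the mean expression of the signature genes minus the mean expression of a set of control genes. The gene names, the control set, and the expression values are hypothetical.

```python
from statistics import mean


def module_score(cell_expr, module_genes, control_genes):
    """Simplified per-cell module score.

    cell_expr: dict mapping gene name -> normalized expression in one cell.
    module_genes: genes in the signature of interest.
    control_genes: background genes used as a reference (an assumption here;
        published methods typically select expression-matched controls).
    Genes missing from cell_expr are treated as unexpressed (0.0).
    """
    module_mean = mean(cell_expr.get(g, 0.0) for g in module_genes)
    control_mean = mean(cell_expr.get(g, 0.0) for g in control_genes)
    return module_mean - control_mean


# Hypothetical cell with two signature genes and two control genes.
cell = {"SIG_1": 2.0, "SIG_2": 4.0, "CTRL_1": 1.0, "CTRL_2": 1.0}
score = module_score(cell, ["SIG_1", "SIG_2"], ["CTRL_1", "CTRL_2"])
```

A positive score indicates that the signature genes are expressed above the background level in that cell.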

Complement gene transcription in lung epithelial cells was JAK1/2-STAT1–dependent

We next evaluated whether type I IFN responses played a role in complement activation, because IFNs were a common pathway activated by SARS-CoV-2 in respiratory epithelial cells (Fig. 1E). We sourced IFN-α/β signaling genes from Reactome (R-HSA-909733) (table S2). Genes regulated by type I IFNs were elevated in most cells from patients compared with uninfected controls including in AT1 and AT2 cells (Fig. 3, D and E, bottom). CD46, C3aR, and IFN-α/β signaling genes appeared to closely track with disease severity in lymphoid, myeloid, and pneumocyte (AT1 and AT2) cells, respectively. The correlation between IFN signaling and C3 expression in epithelial cells led us to explore whether the two were causally related through transcription factors (TFs) driven by IFNs. We assessed the genes differentially regulated by SARS-CoV-2 in primary NHBE cells and the type II pneumocyte–like (A549) cell line. SARS-CoV-2 induced 223 and 108 and repressed 178 and 40 genes in NHBE and A549 cells, respectively (Fig. 4A and table S3A). We used Ingenuity Pathway Analysis (IPA) to predict the transcriptional regulators of these genes. Of the top 10 TFs predicted, half were IFN pathway signaling proteins, including STAT1 (Fig. 4B), the JAK1/2-induced STAT that transduces signals downstream of the IFN-α receptor (20). Two of the other core TFs were nuclear factor κB (NF-κB) family TFs, including RELA (Fig. 4B), a major regulator of gene transcription in response to pathogen and inflammatory cytokines [e.g., tumor necrosis factor and interleukin-1β (IL-1β)] (21). To validate whether STAT1 directly regulated complement genes, we analyzed publicly available chromatin immunoprecipitation sequencing (ChIP-seq) datasets of STAT1 and histone 3 lysine 27 acetylation (H3K27Ac; a marker of active and open chromatin regions) curated by ENCODE, as well as a RELA ChIP-seq dataset from GSE132018. 
Genes regulated by SARS-CoV-2 showed significant enrichment for both STAT1 and RELA binding (Fig. 4C). Both TFs bound open chromatin regions (H3K27 acetylated) of genes induced by SARS-CoV-2 in COVID-19 patient lung tissue and in NHBE cells (fig. S9, A and B, and table S3B). In particular, STAT1 and RELA bound strongly to the promoter regions of C3, CFB, C1S, C1R, IRF9, IRF7, and IL6, suggesting a potential role in their regulation (Fig. 4D and fig. S9C). Together, these data suggested that genes encoding complement components were regulated by STAT1 and RELA.
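Enrichment of TF binding among SARS-CoV-2–regulated genes is reported with Fisher's exact test (Fig. 4 legend). The sketch below shows a one-sided version of that test on a 2 × 2 table, computed from the hypergeometric distribution. The table layout (regulated versus other genes, bound versus not bound) and the counts are assumptions made for illustration; the source does not state how its contingency tables were constructed or whether the test was one- or two-sided.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test P value for a 2 x 2 table.

    a: regulated genes bound by the TF     b: regulated genes not bound
    c: other genes bound by the TF         d: other genes not bound
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    row1 = a + b           # number of regulated genes
    col1 = a + c           # number of TF-bound genes
    total = a + b + c + d  # all genes considered
    upper = min(row1, col1)
    favorable = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return favorable / comb(total, row1)


# Hypothetical counts: 3 of 4 regulated genes are bound versus 1 of 4 other genes.
p_value = fisher_exact_greater(3, 1, 1, 3)
```

A small P value indicates that TF-bound genes are over-represented among the regulated genes relative to the background.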

Fig. 4 STAT1 and RELA bound to complement genes induced by SARS-CoV-2.

(A) Numbers of differentially expressed genes in NHBE cells and A549 alveolar cell lines infected with SARS-CoV-2 in comparison with mock infection. (B) The top 10 IPA-predicted TFs regulating the SARS-CoV-2–driven transcriptional response in NHBE cells and human alveolar basal epithelial cell lines (A549). Highlighted in red are TFs transducing IFN-mediated gene transcription and in blue NF-κB–mediated gene transcription. (C) H3K27Ac, STAT1, and RELA ChIP-seq binding profiles across SARS-CoV-2–induced and repressed genes. (D) STAT1, RELA, and H3K27Ac ChIP-seq tracks showing the IRF9, CFB, and C3 gene loci. Data in (A) are from GSE147507, and data in (C) and (D) have been sourced from ENCODE (H3K27Ac and STAT1) and from GSE132018 (RELA). RELA profiles in (C) are from LPS-treated cells. ***P < 0.001; ****P < 0.0001 by Fisher’s exact test.

JAK/STAT inhibitors were predicted to normalize SARS-CoV-2–driven complement gene transcription

In parallel, we carried out pharmaceutical drug prediction. To this end, we compared the targets of 1657 curated drugs in the drug signatures database (DSigDB) (22) with the genes induced by SARS-CoV-2 infection. In both primary NHBE and A549 cells, ruxolitinib, a JAK1/2 inhibitor (JAKi) that blocks STAT1 signaling (23), was predicted to be the top candidate for normalization of the SARS-CoV-2 gene signature (Fig. 5, A and B, and table S4A), consistent with the enrichment of STAT1 binding in genes regulated by this virus (Fig. 4, C and D).

Fig. 5 The JAKi ruxolitinib neutralized SARS-CoV-2–mediated complement transcription.

(A) GSEA showing enrichment of genes normalized by pharmaceutical agents in the transcriptomes of control (Ctrl.) or SARS-CoV-2–infected NHBE (left) or A549 (right) cells. Drugs have been ranked by significance (FDR q values), with ruxolitinib, baricitinib, and atovaquone highlighted in red. (B) Representative GSEA plot showing enrichment (higher expression) of ruxolitinib–down-regulated genes in SARS-CoV-2–treated cells. (C) Heatmap showing expression of genes induced/repressed by SARS-CoV-2 in A549 cells transduced with ACE2 (A549-ACE2) and then infected with SARS-CoV-2 in the presence of ruxolitinib or vehicle. Genes are clustered according to their response to SARS-CoV-2 and ruxolitinib. (D) Scatterplot comparing the expression of all genes between STAT1 wild-type (STAT1+/+) and STAT1 knockout (STAT1−/−) HepG2 cells after IFN-α treatment. Differentially expressed genes (FC > 2) are highlighted in blue (down-regulated in knockout) and red (up-regulated in knockout), and selected key complement and IFN pathway genes are highlighted in orange. IL6 is also marked but not significantly expressed or changed. Transcriptomes are sourced from GSE147507 (A to C) (13) and GSE98372 (D) (25).

We analyzed the effects of ruxolitinib on the SARS-CoV-2–induced transcriptome by comparing RNA-seq from A549 cells transduced to express ACE2 and then infected with SARS-CoV-2 in the presence of ruxolitinib or vehicle (Fig. 5C). IL6, IRF7, and IRF9 and all of the complement components we had previously observed—namely C1R, C1S, CFB, and C3—were almost completely normalized by ruxolitinib (Fig. 5C and table S4B).

JAKi can have off-target effects (24), and STAT3 can theoretically be activated by the IL-6 produced in response to SARS-CoV-2; thus, we confirmed whether complement was regulated by STAT1 in a STAT1-deficient cell line. We analyzed publicly available transcriptomes of STAT1 wild-type (STAT1+/+) and STAT1 knockout (STAT1−/−) HepG2 liver cells treated, or not, with IFN-α (25). Treatment with IFN-α, an archetypal STAT1-activating type I IFN, is required to induce STAT1 signaling and its nuclear translocation (table S4C). IFN-α only induced the previously identified components of the complement system (C3, C1R, C1S, and CFB), as well as IRF7 and IRF9, in the presence of replete STAT1 status (Fig. 5D). Moreover, most SARS-CoV-2–induced genes normalized by ruxolitinib were down-regulated in STAT1−/− cells and did not respond to IFN-α treatment (fig. S10A). These data indicated that STAT1 was indispensable for inducing these genes. In addition, IL6 transcription was not induced by IFN-α treatment in these cells, irrespective of STAT1 status (Fig. 5D), so we concluded that STAT1, not STAT3, was the dominant driver of complement gene regulation. To address the concern that JAK-STAT inhibition could impair antiviral immunity and enhance viral replication (26), we quantified SARS-CoV-2 viral loads in these samples by aligning raw reads to the viral genome. Ruxolitinib treatment did not alter viral loads in any of these samples (fig. S10B).
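Viral loads derived from read alignment are expressed elsewhere in this article as parts per million mapped reads (ppm; Fig. 1 legend). A minimal sketch of that normalization follows, assuming that the denominator is the total number of mapped reads in the sample; the source does not specify whether host and viral reads are both counted in that total, and the read counts shown are hypothetical.

```python
def viral_load_ppm(viral_reads, total_mapped_reads):
    """Viral reads per million mapped reads.

    viral_reads: reads aligned to the viral genome.
    total_mapped_reads: all mapped reads in the sample (assumed denominator).
    """
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    if viral_reads < 0:
        raise ValueError("viral_reads must be non-negative")
    return viral_reads * 1_000_000 / total_mapped_reads


# Hypothetical sample: 500 viral reads among 10 million mapped reads.
load = viral_load_ppm(500, 10_000_000)
```

Normalizing to the total number of mapped reads makes viral loads comparable between samples sequenced to different depths, such as the ruxolitinib- and vehicle-treated cultures.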

SARS-CoV-2–driven C3 activation could be normalized by CFB or JAK-STAT1 inhibition

We identified CFB as one of the most highly induced complement genes in response to SARS-CoV-2 (Fig. 1, F to H), which suggested the potential synthesis of an inducible cell-intrinsic C3 convertase in infected cells. To test this possibility, we first investigated the ability of a novel cell-permeable inhibitor of CFB (CFBi) to reduce virus-induced C3a production in SARS-CoV-2–infected iAEC2s. This inhibitor specifically targets CFB (Fig. 6A and table S5), rapidly diffuses into cells (fig. S11A), and blocks complement activation induced by zymosan, a strong activator of the alternative complement pathway (Fig. 6B), without inducing cell death (fig. S11B). Addition of this CFBi to cultures markedly reduced C3a generation in response to SARS-CoV-2 infection (Fig. 6C), confirming that C3 processing in response to virus occurs via a cell-intrinsic C3 convertase in respiratory epithelial cells.

Fig. 6 Pharmacological inhibition of key targets inhibited C3a output from SARS-CoV-2–infected respiratory epithelial cells.

(A) Chemoproteomic profiling of CFBi identified complement factor B as the only target. Shown is the dose-dependent reduction of bead binding of CFB from protein extracts of cells. Shown are means and SD from three independent experiments. (B) C3a enzyme-linked immunosorbent assay in plasma treated with zymosan (an alternative complement pathway activator) in the presence of increasing concentrations of EDTA (a chelator of divalent cations, which stops convertase activity), a CFB blocking antibody (Ab) or isotype control, the chemical CFBi, or its carrier, DMSO. Bars show means + SEM; dots represent individual experiments. (C) Confocal images (left) and quantifications (right) showing generation of C3a in mock-infected or SARS-CoV-2–infected iAEC2s treated with CFBi, ruxolitinib, or a combination of ruxolitinib and remdesivir. Scale bar, 100 μm. Data are from n = 2 independent experiments; 18,191 ± 660 (means ± SD) cells per condition. Bars indicate means + SD (A) or SEM (B and C). *P < 0.05, ***P < 0.001, and ****P < 0.0001 by ANOVA. AU, arbitrary units.

We also tested the ability of ruxolitinib, alone or in combination with remdesivir, to block C3a production in cells. Ruxolitinib significantly inhibited C3a generation, and this inhibition was further enhanced by the presence of remdesivir (Fig. 6C), consistent with the transcriptional data (Fig. 5C). None of the drugs reduced SARS-CoV-2 N-protein expression, but they did reduce syncytia formation (fig. S12, A and B), which may indicate that complement also influences the biology of the virus in cells. Together, these data confirmed that SARS-CoV-2 induced C3 transcription via JAK-STAT1 signaling and that C3 was activated via an alternative pathway convertase that was intrinsic and inducible in infected cells. Blockade of JAK1/2-STAT1 signaling, particularly in combination with an antiviral, normalized the production of C3a from infected cells.


Here, we showed that the induction of complement expression and C3 protein activation in airway epithelial cells is a SARS-CoV-2–driven event and not a bystander event triggered by overt cytokine production/inflammation. C3 was induced in response to SARS-CoV-2 in infected epithelial cells in a JAK/STAT-dependent manner and then processed to biologically active C3a by CFB. This could be normalized by pharmaceutical blockade of the relevant pathways (Fig. 7).

Fig. 7 Schematic model of SARS-CoV-2 induction of complement in respiratory epithelial cells.

SARS-CoV-2 infects respiratory epithelial cells and induces an IFN response. IFNs signal via the IFN receptor to activate STAT1 via JAK1/2. STAT1 cooperates with RELA to induce transcription of IL-6 and complement genes including C3, CFB, C1R, and C1S. CFB acts as an alternative pathway C3 convertase to cleave C3 intracellularly to C3a and C3b. C3a engages C3aR and C3b engages CD46 on leukocyte subsets in the lungs to drive inflammation. These events can be pharmacologically targeted with antivirals (e.g., remdesivir), JAK-STAT inhibitors (e.g., ruxolitinib), and/or cell-permeable complement inhibitors, including CFBi.

Complement activity is usually protective during viral infections and required to control pathogens (27). However, excessive activation of complement contributes to ARDS caused by a number of different etiologies (28, 29) and is a mediator of acute lung injury driven by pandemic respiratory viruses (30, 31). Because the observed gene signatures in SARS-CoV-2 were inflammatory, we propose that C3 ligation of its receptors on tissue-resident/infiltrating leukocytes during SARS-CoV-2 infection is pathogenic. This may be a mechanism common to pandemic coronaviruses, because mouse models of SARS-CoV-1 infection, a related virus of the same family, have indicated that C3, C1r, and Cfb are all part of a pathogenic gene signature correlating with lethality (32) and that global C3−/− status is protective (33). Patients with severe COVID-19 have high circulating levels of terminal activation fragments of complement (C5a and sC5b-9) (8–10), which correlate with the clinical severity of disease (8). C3 hyperactivation is an independent risk factor for mortality from COVID-19 (12). In support of these observations, single-nucleotide variants in genes encoding two complement regulators, decay accelerating factor (CD55) and complement factor H (CFH), are risk factors for morbidity and mortality from SARS-CoV-2 (11). In addition, encouraging clinical outcomes with complement inhibitors point to a pathogenic role for complement in severe COVID-19, although such small case series should be interpreted with caution. The C3 inhibitor AMY-101 has been used in 4 patients with COVID-19, who recovered (34, 35); the C5 activation inhibitor, eculizumab, was used as adjunctive treatment in 14 patients, 12 of whom recovered (35, 36); another anti-C5a antibody (BDB-001) was used in 2 patients with critical COVID-19, who recovered (37). 
Moreover, avdoralimab, a human Fc-silent monoclonal antibody against C5aR1, inhibited production of inflammatory cytokines, induced by either C5a or single-strand RNA virus–like stimuli, by monocytes of patients with COVID-19 (8). Nonetheless, the mechanism that converts the protective complement system into a harmful one during COVID-19 is currently unclear but may be rooted in the overwhelming combined local and systemic complement induced by the virus. Our findings indicated that infection of respiratory epithelial cells is a potent inducer of active complement within the lungs. This is important because serum-derived complement was absent here, so locally produced complement could represent the major source. Corroborating sustained local activation of this system, a published clinical study recently reported the presence of both proximal (C4d) and distal complement fragments (C5b-9) in lung tissues (38).

Our data indicated that IFN-induced STAT1 is the dominant regulator of local complement production from respiratory epithelial cells and that the JAKi ruxolitinib neutralizes SARS-CoV-2–mediated complement activation but does not fully normalize the transcriptome. Little is known about the regulation of local complement expression; thus, these data have relevance beyond SARS-CoV-2 infection. Potential connections between complement activity and type I IFN responses during pathogen encounter are currently an unexplored field aside from a study that used full C3-deficient animals and noted the suppressive effect of C3 on IFN after exposure of animals to plant virus–like nanoparticles (39). There is a proposed link between IFNs and complement in the context of transplant thrombotic microangiopathy (40). Our data suggest that the use of a JAKi to normalize all of the proximal genes induced by SARS-CoV-2 could represent a more refined approach than targeting a single complement component (e.g., C3 or C5a) with inhibitors that only work in the extracellular space. Complement activation also occurs in the intracellular space, where it performs functions critical to mounting an effective inflammatory immune response (Fig. 7) (6, 7). Moreover, interfering with type I IFN signaling can redirect immunity to enable control of viral infection (41). The dual nature of type I IFNs in SARS-CoV-2–induced inflammation has recently been elegantly reviewed by others (42). The observation that these drugs also reduced syncytia formation is intriguing because it suggests that complement may also influence the biology of the virus in cells; CD46, the receptor for C3b, the other major product of C3 processing, is known to enhance syncytia formation in other viral infections (43), so it is possible that reducing active C3 also reduces syncytia formation.

The prediction that the NF-κB pathway is also a regulator of the genes induced by SARS-CoV-2 and the failure of ruxolitinib to normalize a large subset of genes suggest that monotherapy with ruxolitinib may be insufficient for the management of COVID-19. While there are ongoing clinical trials with JAKi for treating patients with COVID-19, our analyses suggest that combining a JAKi, such as ruxolitinib, with, for example, antiviral agents (e.g., remdesivir), may be a more promising therapeutic approach than monotherapy. Because of concerns regarding the use of JAKi in a disease with a propensity to thrombosis (1, 44, 45), combining a JAKi with a second agent may permit use of lower doses of both drugs, potentially reducing thrombotic adverse effects and reducing risks of viral replication. With regard to concerns that JAKi can increase viral replication, severe COVID-19 is characterized by hyperinflammation, and viral loads are actually low at this point, so reducing inflammation is of primary importance; thus, potentially increasing viral load with a JAKi is a lesser concern than in milder disease or in the initial stages of disease. In conclusion, transcription and activation of pathogenic complement can be normalized in epithelial cells using drugs that target JAK1/2, CFB, or the virus itself, and combinations of these may be therapeutically beneficial in treating COVID-19.


Study design

The objective of this study was to investigate host-virus interactions in SARS-CoV-2–infected cells. We analyzed bulk and single-cell RNA-seq data from the blood and lung tissues of patients with COVID-19 and compared them with healthy controls. To verify activation of complement in situ within cells, we infected iAEC2 and Calu-3 cells with SARS-CoV-2 and carried out confocal microscopy for C3a. We performed computational drug prediction to identify drugs capable of normalizing gene signatures induced by SARS-CoV-2. To verify our top predictions in a culture system, we infected iAEC2s with SARS-CoV-2 with and without the addition of the respective drugs to the culture medium.

Cell culture and viral infections

Human adenocarcinoma lung epithelial (Calu-3) cells (American Type Culture Collection, HTB-55) were cultured in Dulbecco’s modified Eagle’s medium (DMEM; Gibco) supplemented with 10% fetal bovine serum (FBS; Corning), Hepes, nonessential amino acids, l-glutamine, and 1× antibiotic-antimycotic solution (Gibco). The iPSC (SPC2 iPSC line, clone SPC2-ST-B2, Boston University)–derived alveolar epithelial type 2 cells (iAEC2s) were differentiated and maintained as alveolospheres embedded in three-dimensional Matrigel in “CK + DCI” media, as previously described (46). iAEC2s were passaged about every 2 weeks by dissociation into single cells via the sequential application of dispase (2 mg/ml; Thermo Fisher Scientific, 17105-04) and 0.05% trypsin (Invitrogen, 25300054) and replated at a density of 400 cells/μl of Matrigel (Corning, 356231) (46). All cells were maintained at 37°C and 5% CO2.

SARS-CoV-2, isolate USA-WA1/2020 (NR-52281), was obtained from BEI Resources and was propagated in Vero E6 cells in DMEM supplemented with 2% FBS, d-glucose (4.5 g/liter), 4 mM l-glutamine, 10 mM nonessential amino acids, 1 mM sodium pyruvate, and 10 mM Hepes. Infectious titers of SARS-CoV-2 were determined using the median tissue culture infectious dose method (47). The mock “virus” was prepared similarly using supernatant of Vero E6 cells. Ten thousand Calu-3 cells or iAECs per well were seeded in a 384-well plate (PerkinElmer, 6057300) and allowed to form an 80% confluent monolayer. SARS-CoV-2 virus was pretreated with porcine trypsin (10 μg/ml) for 15 min at 37°C. Cells were then infected or mock-infected with pretreated virus prep at a multiplicity of infection of 2 for Calu-3 and 1 for iAECs for 1 hour in culture media (final concentration of trypsin on cells was 2 μg/ml). After adsorption, virus inoculum was removed and replaced with fresh culture media. In the experiments with compound treatment conditions, virus inoculum was replaced with media containing a cell-permeable CFBi (2 μM; GlaxoSmithKline), ruxolitinib (1 μM; Cayman Chemical Company, catalog no. 11609), or a combination of ruxolitinib and remdesivir (250 nM; Cayman Chemical Company, catalog no. 30354). All experiments using SARS-CoV-2 were performed at the University of Michigan under Biosafety Level 3 (BSL3) protocols in compliance with containment procedures in laboratories approved for use by the University of Michigan Institutional Biosafety Committee and Environment, Health, and Safety.

Confocal imaging and analysis

Two days after infection, mock-infected or SARS-CoV-2–infected cells were fixed with 4% PFA (paraformaldehyde) for 30 min at room temperature, permeabilized with 0.3% Triton X-100 for 15 min, and blocked with antibody buffer [1.5% bovine serum albumin (BSA), 1% goat serum, and 0.0025% Tween 20]. The plates were then sealed, surface decontaminated, and transferred to BSL2 for staining. To detect virally infected cells, anti-nucleocapsid protein (anti-N) SARS-CoV-2 antibody (Sino Biological, catalog no. 40143-R019) was used as a primary antibody with an overnight staining at 4°C followed by staining with secondary antibody Alexa Fluor 647 (goat anti-rabbit, Thermo Fisher Scientific, A21245). To detect activated C3, cells were stained with the same protocol as for the viral marker, using a primary antibody against a human C3a neo-epitope (Abcam, catalog no. 2991) and secondary antibody Alexa Fluor 488 (goat anti-mouse, Thermo Fisher Scientific, A21121). Hoechst 33342 pentahydrate (bis-benzimide) was used for nuclei staining (Thermo Fisher Scientific, H1398).

Plates were subjected to confocal imaging using a Thermo Fisher Scientific CellInsight CX5 High-Content Screening Platform (Calu-3) or a Yokogawa CQ1 Benchtop High-Content Analysis System (iAECs). Images were analyzed using the open-source CellProfiler (3.1.9) software with a pipeline designed to segment cell nuclei, cytoplasm, and infected regions (either syncytia or singly infected cells) given staining for Hoechst 33342 and N protein. Intensity of C3a and N protein was measured within nuclei and cytoplasm on a per-cell level along with Pearson correlation of C3a intensities to that of N protein. Cells were considered syncytially infected if their nuclei were found within a viral region. The distribution of N protein signal intensities in mock-infected cells was calculated, and cells with signals above the mean + 1.96 SD of this distribution were considered infected. Visual inspection of the images confirmed the validity of this method in determining infection rate. The C3a signal intensities were then plotted as a function of infection or N protein signal intensities.
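The thresholding and correlation steps described above can be illustrated with a minimal Python sketch. It is not the CellProfiler pipeline used in the study; the function names are hypothetical, and the use of the sample (rather than population) SD is an assumption, because the text does not specify which was used.

```python
from statistics import mean, stdev


def infection_threshold(mock_n_intensities, z=1.96):
    """Cutoff = mean + z * SD of N protein intensity in mock-infected cells.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return mean(mock_n_intensities) + z * stdev(mock_n_intensities)


def call_infected(n_intensities, threshold):
    """Flag each cell whose N protein intensity exceeds the cutoff."""
    return [value > threshold for value in n_intensities]


def pearson(xs, ys):
    """Pearson correlation between two equally long intensity lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

In use, the cutoff is computed once from the mock-infected wells and then applied to every cell in the infected wells, after which `pearson` can be run on the per-cell C3a and N protein intensities.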

Chemoproteomic profiling of CFBi

The human biological samples were obtained with institutional ethics approvals, and their research use was in accord with the terms of the informed consents under an IRB/EC-approved protocol. To generate a bead matrix, an amine-functionalized analog of CFBi termed CFBi-F (produced by GSK) was immobilized on Sepharose beads. To distinguish between proteins binding to the immobilized compound and background, a quantitative competition-based approach was applied. The test compound CFBi-F was spiked into aliquots of protein extract (mixed human embryonic kidney 293, K-562, placenta, and HepG2 cell/tissue extract) over a range of concentrations starting at 20 μM in 1:6 dilutions. It then competed with the immobilized analog for binding to the target protein(s). Matrix-bound proteins were eluted, trypsinized, and subsequently encoded with isobaric mass tags (TMT10) enabling relative quantification by liquid chromatography (LC)–tandem mass spectrometry. Only the captured target protein of the test compound, CFB, was dose-dependently reduced from bead binding, thus enabling the determination of concentrations of half-maximal binding [median inhibitory concentration (IC50)]. Apparent dissociation constants (Kdapp) were derived from the IC50 values by taking into account the amount of target sequestered by the affinity matrix using the Cheng-Prusoff relationship (IC50/Kdapp correction factor) and sequential binding experiments. IC50 values for CFB competition were averaged, and SD was calculated. Average values from the three independent experiments are indicated in the figure.

Serum complement alternative pathway assay to measure activity of CFBi

The biotinylated anti-C3a capture antibody (clone 2991, Hycult HM2074BT-B) was diluted in MSD Diluent 100 (catalog number R50AA-2) and was added to MSD GOLD 96-Well Streptavidin SECTOR plate (catalog number L15SA-1). After 1 hour of incubation, the plates were washed three times in phosphate-buffered saline (PBS) with 0.05% Tween 20 and blocked with blocking solution (2% BSA in PBS + 0.05% Tween) overnight at 4°C. The next day, normal human plasma was diluted to 6% in alternative pathway buffer (7.5 mM Hepes, 150 mM NaCl, 7 mM MgCl2, and 10 mM EGTA) and activated with zymosan (1 mg/ml). Small-molecule factor B inhibitor (CFBi) and factor B blocking antibody with appropriate negative [dimethyl sulfoxide (DMSO), vehicle for CFBi and immunoglobulin G1 isotype antibody] and positive control (EDTA) were added to wells of the 96-well U-bottom plate and incubated for 1.5 hours at 27°C with shaking at 750 rpm. The reaction was stopped with 5 mM EDTA, and a 50-μl aliquot was transferred to the anti-C3a–coated MSD plate. A C3a standard curve was constructed using purified human C3a (ComTech, A118) in concentrations ranging from 10 to 0.00977 nM. After 1.5 hours of incubation at 27°C with shaking, the plates were washed in PBS/0.05% Tween 20, and the ruthenylated anti-C3a detection antibody (clone 474, made at GSK) was added. After three washes, plates were developed with MSD Read buffer, and electrochemiluminescence was recorded on an MSD Sector 6000 plate reader.
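The endpoints of the C3a standard curve (10 to 0.00977 nM) are consistent with a twofold serial dilution over 11 points, because 10/2^10 ≈ 0.00977. The text states only the endpoints, so the dilution factor and number of points in the Python sketch below are assumptions, and the function name is hypothetical.

```python
def serial_dilution(top_concentration, dilution_factor, n_points):
    """Return concentrations of a serial dilution, highest first."""
    return [top_concentration / dilution_factor ** i for i in range(n_points)]


# Assumed layout: twofold steps, 11 points, starting at 10 nM.
C3A_STANDARDS_NM = serial_dilution(10.0, 2.0, 11)

if __name__ == "__main__":
    for concentration in C3A_STANDARDS_NM:
        print(f"{concentration:.5f} nM")
```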

Determination of cell viability using flow cytometry

CD4+ T cells were isolated from human whole blood using the STEMCELL CD4+ T cell negative selection kit (#17952) according to the manufacturer’s protocol. Cells were treated with cell-permeable CFBi at a final concentration of 10 μM or an equivalent volume of vehicle control (DMSO) and activated in 96-well flat-bottom plates (Greiner, #655083) coated with anti-CD3 antibody (1 μg/ml; clone OKT3, BioLegend, #317347) and anti-CD28 antibody (2 μg/ml; Thermo Fisher Scientific, #16-0289-85). After 48 hours in culture, cells were transferred into a 96-well V-bottom plate and analyzed for forward scatter (FSC) and side scatter (SSC) properties on a BD FACSCanto. Data shown are percentages of viable cells in three healthy donors.

Determination of intracellular CFBi compound concentrations

Sample preparation

Human CD4+ T cells were isolated from peripheral blood as described above. Human CD14+ monocytes were isolated using a CD14-negative isolation kit (STEMCELL, no. 17858) as per the manufacturer’s protocol. Cells were then washed twice in ice-cold sterile PBS and adjusted to a concentration of 1 × 10^6/ml in RPMI 1640 supplemented with 10% fetal calf serum. A total of 250,000 cells per well from both CD4+ and CD14+ populations were added to 48-well plates (Greiner, #677180), and 10 μM CFBi or an equivalent amount of DMSO was added to the wells. CD4+ T cells were activated with anti-CD3/anti-CD28 antibodies as described above, while CD14+ monocytes were activated with lipopolysaccharide (LPS; 100 ng/ml). Cells were harvested via centrifugation at 350g at 60 and 120 min after activation, and supernatant was removed. The pellets were snap frozen and stored at −80°C until use.

Standard curve preparation

Standard curves of 0.1 to 10,000 ng/ml over 16 points (0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, and 10,000 ng/ml) were constructed in relevant matrices, and these, along with the samples (25 μl), were quenched with 300 μl of acetonitrile containing reserpine at 175 ng/ml as the internal standard. All samples were shaken for 20 min on a vortex mixer and then centrifuged for 15 min at 1600g. A 1.5-μl aliquot of the resulting supernatant was injected into the mass spectrometer for analysis.

High-performance LC mass spectrometry apparatus, conditions, and data interpretation

The high-performance LC (HPLC) system was an integrated Shimadzu modular HPLC system comprising two LC-30AD binary pumps, SIL-30ACMP autosampler, CTO-20C column oven, and CBM20Alite controller (Shimadzu, Milton Keynes, Buckinghamshire, UK). The HPLC analytical column was a Kinetex EVO C18 2.5 μm, 50 mm by 2.1 mm (Phenomenex Ltd., Macclesfield, Cheshire, UK), maintained at 40°C. The mobile phase solvents were water containing 0.1% formic acid and acetonitrile containing 0.1% formic acid. A gradient ran from 5 to 95% acetonitrile (ACN) + 0.1% formic acid up to 1 min, was held for 0.1 min, returned to the starting conditions over 0.15 min, and was then held to 1.7 min, at a flow rate of 0.8 ml/min. Mass spectrometry detection was performed by using an API 4000 triple quadrupole instrument (AB Sciex, Warrington, Cheshire, UK) using multiple reaction monitoring (MRM). Ions were generated in positive ionization mode using an electrospray interface. The ionspray voltage was set at 4000 V, and the source temperature was set at 650°C. For collision dissociation, nitrogen was used as the collision gas. The MRM of the mass transitions for CFBi [mass/charge ratio (m/z), 487.17 to 308.10] and reserpine (m/z, 609.38 to 195.10) was used for data acquisition. Data were collected and analyzed using Analyst 1.4.2 (AB Sciex, Warrington, Cheshire, UK); for quantification, area ratios (between analyte/internal standard) were used to construct a standard line per analyte, and results were extrapolated from the area ratio of samples from these standard lines.
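Quantification from a standard line of analyte/internal standard area ratios can be sketched as follows. This is an illustrative Python example, not the Analyst software procedure; an unweighted ordinary least-squares fit is assumed (the weighting actually used is not stated), and the function names are hypothetical.

```python
def fit_standard_line(concentrations, area_ratios):
    """Unweighted least-squares fit: area_ratio = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(area_ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, area_ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(analyte_area, internal_standard_area, slope, intercept):
    """Convert a sample's peak areas to a concentration via the standard line."""
    ratio = analyte_area / internal_standard_area
    return (ratio - intercept) / slope
```

Here the analyte would be CFBi and the internal standard reserpine; the returned concentration is in the same units as the standards (ng/ml in this section).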

Gene set enrichment analysis

GSEA was performed using GSEA version 4.0.3 (14) with the parameters “Permutation type = gene_set” and “Collapse to gene symbols = No_Collapse.” All canonical pathways “c2.cp.v7.1” were used throughout the paper. Upstream transcriptional regulator identification was done using IPA on genes that were differentially expressed [fold change (FC) > 1.5, FDR < 0.05] in NHBE or A549 cells. GSEA in fig. S2 (C to E) was performed using preranked and “No Collapse” options.

Bulk RNA-seq data analysis

For GSE147507, the original raw read counts were normalized to obtain transcript per million (TPM) values that were then used for plotting the expression values and performing GSEA. For SRP257667, the raw fastq files were downloaded, and mRNA expression levels were estimated by RSEM software (48) using “rsem-calculate-expression” with the parameters “--bowtie-n 1 --bowtie-m 100 --seed-length 28.” The RSEM-required bowtie index was created by “rsem-prepare-reference” on all RefSeq genes downloaded from the UCSC table browser in April 2017. The differentially expressed genes were identified using the edgeR package (49) on the original raw read counts for GSE147507 and the expected read counts from RSEM for SRP257667. FC (FC > 1.5) and FDR q value (q < 0.05) were used to identify differentially expressed genes. Viral RNA load (viral titer) was calculated by counting the fraction of all mapped sequencing reads aligned to the corresponding viral genome (RSV, M11486.1; IAV, NC_002023.1; SARS-CoV-2, NC_045512.2), indicated as parts per million reads of library size (50) in each sample.
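The viral RNA load calculation amounts to expressing viral-aligned reads as parts per million of all mapped reads. A minimal Python sketch follows; the function and parameter names are hypothetical, and the read counts in the usage example are invented for illustration only.

```python
def viral_load_ppm(viral_reads, mapped_reads):
    """Viral-aligned reads per million mapped reads in one sample."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return viral_reads * 1_000_000 / mapped_reads


if __name__ == "__main__":
    # Illustrative numbers only: 2,500 viral reads among 10 million mapped reads.
    print(viral_load_ppm(2_500, 10_000_000))
```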

ChIP-seq data analysis

H3K27Ac ChIP-seq in A549 cells (ENCFF137KNW), H3K27Ac ChIP-seq in primary lung cells (ENCFF055YQO and ENCFF677KZQ), and STAT1 ChIP-seq in HeLa cells (ENCFF000XLN) were obtained from ENCODE. RELA ChIP-seq in FaDu cells was from GSE132018. In all cases, the preprocessed and author-provided peak files (e.g., ENCFF565WST and ENCFF002CTG) were obtained, and the nearest transcription start sites and corresponding genes were identified by HOMER “annotatePeaks” program. The overlap between these genes and SARS-CoV-2–induced/repressed genes or all human genes was then assessed to determine enrichment. ChIP-seq tracks and heatmaps were visualized using IGV browser (Broad Institute) and deepTools (51), respectively.
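One way to assess whether peak-associated genes are enriched among SARS-CoV-2–regulated genes relative to all human genes is a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The Python sketch below assumes that test for illustration; the text does not state which test was applied to this overlap, and the function names are hypothetical.

```python
from math import comb


def overlap_enrichment_p(n_universe, n_bound, n_signature, n_overlap):
    """P(overlap >= n_overlap) when n_signature genes are drawn at random
    from n_universe genes, of which n_bound have a nearby ChIP-seq peak."""
    upper = min(n_bound, n_signature)
    total = comb(n_universe, n_signature)
    tail = sum(
        comb(n_bound, k) * comb(n_universe - n_bound, n_signature - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def fold_enrichment(n_universe, n_bound, n_signature, n_overlap):
    """Observed bound fraction in the signature over the genome-wide fraction."""
    return (n_overlap / n_signature) / (n_bound / n_universe)
```

Python integers are arbitrary precision, so the binomial coefficients remain exact even for a universe of roughly 20,000 genes; only the final ratio is converted to floating point.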

Statistical analysis and data visualization

Analyses were performed using GraphPad Prism 8 (La Jolla, CA, USA) and Data Graph v4.5. All the individual data points are presented and compared using one-way analysis of variance (ANOVA) or Fisher’s exact test, as appropriate. P values of <0.05 are denoted as statistically significant throughout. The heatmaps were drawn using Morpheus software (Broad Institute). The schematic in Fig. 7 was created with

Drug prediction

Raw read counts for A549 and NHBE cells comparing the SARS-CoV-2–infected cells with controls were obtained from GSE147507 and normalized to obtain TPM values (see table S1A). Drugs with provided down-regulated target genes (between 10 and 1000) were obtained from DSigDB v1.0 (22). For ruxolitinib, the lists of all up- and down-regulated genes (P < 0.05) were obtained by comparing MCF-7 cells treated with ruxolitinib or vehicle control (data from GSE131300) using DESeq2 (see table S4D) (52). For baricitinib, the list of all up- and down-regulated genes (P < 0.0005) was obtained by comparing systemic baricitinib treatment versus control at 12 weeks (data from GSM1508095) using GEO2R (see table S4D) (53). The GSEA was performed using GSEA version 4.0.3 with the following parameters: Permutation type = gene_set, Collapse to gene symbols = No_Collapse, “Min Size = 10,” and “Max Size = 1000.” A549 and NHBE samples were treated as expression datasets, and the DSigDB data were treated as gene set database. All the rest of the parameters were kept as default. The data with FDR q value of <0.25 are reported (see table S4A).

Single-cell RNA-seq data analysis (BAL fluid)

The preprocessed h5 matrix files for six COVID-19 patient BAL samples and eight uninfected control lung biopsies were obtained from GSE145926 and GSE122960, respectively. Read mapping and basic filtering were performed with the Cell Ranger pipeline by the original authors. We further processed the samples using Seurat (version 3) as follows: Only genes found to be expressed in more than three cells were retained. Cells with >10% of their unique molecular identifiers (UMIs) mapping to mitochondrial genes or cells with <300 features were discarded to eliminate low-quality cells or nuclei. This yielded a total of 89,133 cells across 14 samples. The filtered count matrices were then normalized by total UMI counts, multiplied by 10,000, and transformed to natural log space. The top 2000 variable features were determined on the basis of the variance stabilizing transformation function (FindVariableFeatures) by Seurat with default parameters. All samples were integrated using the canonical correlation analysis function with default parameters. Variation arising from library size and percentage of mitochondrial genes was regressed out by the ScaleData function in Seurat. Principal components analysis (PCA) was performed, and the top 30 principal components (PCs) were included in a Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction. Clusters were identified on a shared nearest neighbor (SNN) modularity graph using the top 30 PCs and the original Louvain algorithm. Cluster annotations were based on canonical marker genes. Gene list scores were calculated by the AddModuleScore function in Seurat (54). Statistical differences of marker expressions and scores were assessed by a Wilcoxon test.
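The cell-level filter and the normalization described above can be written out for a single cell as a minimal Python sketch. The analysis itself was run in Seurat (R); the functions below are hypothetical stand-ins, and adding 1 before taking the natural log is an assumption (it matches Seurat's default log-normalization, but the text says only "natural log space").

```python
from math import log1p


def passes_bal_qc(n_features, total_umis, mito_umis,
                  min_features=300, max_mito_fraction=0.10):
    """Keep a cell unless it has <300 features or >10% mitochondrial UMIs."""
    if total_umis <= 0:
        return False
    mito_fraction = mito_umis / total_umis
    return n_features >= min_features and mito_fraction <= max_mito_fraction


def log_normalize(gene_counts, scale_factor=10_000):
    """Divide each gene's UMI count by the cell total, multiply by the scale
    factor, and take the natural log of (1 + value)."""
    total = sum(gene_counts.values())
    return {gene: log1p(count / total * scale_factor)
            for gene, count in gene_counts.items()}
```

For example, a cell with one C3 UMI out of four total UMIs gets a normalized C3 value of ln(1 + 2500).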

Single-cell RNA-seq data analysis (PBMC)

The preprocessed serialized R objects for COVID-19 patient PBMC samples (n = 6) and healthy control PBMC samples (n = 6) were obtained from GSE150728 (55). Read mapping and basic filtering were performed with the Cell Ranger pipeline (10x Genomics) by the original authors. The exonic count matrices were further processed by Seurat (version 3) as follows: Only genes found to be expressed in more than 10 cells were retained. The QC steps for filtering the samples were performed as described (55). Briefly, cells with 1000 to 15,000 UMIs and <20% of reads from mitochondrial genes were retained. Cells with >20% of reads mapped to RNA18S5 or RNA28S5 and/or expressing more than 75 genes per 100 UMIs were excluded. The SCTransform function was invoked to normalize the dataset and to identify variable genes as previously described (55). PCA was performed, and the top 50 PCs were included in a UMAP dimensionality reduction. Clusters were identified on an SNN modularity graph using the top 50 PCs and the original Louvain algorithm. Cluster annotations were based on canonical marker genes. Gene list module scores were calculated by the AddModuleScore function in Seurat with a control gene set size of 100.
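The PBMC quality-control criteria can be expressed as a single predicate per cell. The Python sketch below is an illustration of the stated thresholds rather than the Seurat code used; the function and parameter names are hypothetical, and the 1000 to 15,000 UMI bounds are treated as inclusive, which is an assumption.

```python
def passes_pbmc_qc(n_umis, n_genes, mito_fraction, rrna_fraction):
    """Apply the PBMC filters: UMI range, mitochondrial fraction,
    RNA18S5/RNA28S5 fraction, and genes detected per 100 UMIs."""
    if not 1000 <= n_umis <= 15000:      # assumed inclusive bounds
        return False
    if mito_fraction >= 0.20:            # retained only if <20% mitochondrial
        return False
    if rrna_fraction > 0.20:             # excluded if >20% rRNA reads
        return False
    genes_per_100_umis = n_genes / n_umis * 100
    if genes_per_100_umis > 75:          # excluded if >75 genes per 100 UMIs
        return False
    return True
```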


This is an open-access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Acknowledgments: We would like to acknowledge the NIH HPC (Biowulf) for efforts in maintaining essential bioinformatic programs. Funding: This research was financed by the National Heart, Lung, and Blood Institute of the NIH (grant 5K22HL125593 to M. Kazemian; R01HL119215 to J.R.S.); National Institute of General Medical Sciences of the NIH (grant R35GM138283 to M. Kazemian); and Deutsche Forschungsgemeinschaft (fellowship FR 3851/2-1 to T. Freiwald) and supported, in part, by the Intramural Research Program of the NIH; the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) (project number ZIA/DK075149 to B.A.); the National Heart, Lung, and Blood Institute (NHLBI) (project number ZIA/Hl006223 to C.K.); and the National Institute of Allergy and Infectious Diseases (NIAID) (project number ZIA/AI001175 to M.S.L.). T. Frum is supported by T32DE007057. Funding for part of the work was provided by the University of Michigan Biological Scholars Program (to C.E.W.), LifeARC Charity (to S.K.), and CRUK KHP Centre (to S.K.). Author contributions: B.Y., T. Freiwald, D.C., L.W., E.W., J.B., B.A., and M. Kazemian analyzed data and wrote the manuscript. C.M., C.J.Z., E.-M.N., N.M., R.G., M.B., S.G.-D., M. Kolev, T. Frum, J.R.S., J.Z.S., N.N., K.D.A., D.N.K., and J.B. performed experiments and analyzed data. M.R.O., S.K., D.P., A.L., M.S.L., C.K., S.P., C.J.Z., C.E.W., B.A., and M. Kazemian provided intellectual input and wrote the manuscript. C.K., B.A., and M. Kazemian conceived and supervised the work. Competing interests: E.-M.N., R.G., M.B., S.G.-D., and M. Kolev are current employees and shareholders of GSK Plc. 
Data and materials availability: All data contained in this paper are publicly available and referenced in table S6, together with all gene expression omnibus numbers for access as follows: GSE147507 (13), GSE150316 (56), HRA000143 (57), GSE150819, GSE98372 (25), GSE145926 (58), GSE122960 (59), GSE150728 (55), GSE132018 (60), ENCODE (ENCFF137KNW; ENCFF055YQO; ENCFF677KZQ; ENCFF000XLN), and PXD020019 (61). All other data needed to evaluate the conclusions in the paper are present in the paper or the Supplementary Materials. This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.
