In a recent study published in the journal Nature as an early-release, unedited manuscript, researchers used whole-genome sequencing (WGS) to identify host factors contributing to critical coronavirus disease 2019 (COVID-19).
Critical illness in COVID-19 patients is driven by immune-mediated lung injury and is influenced by genetic variation in the host. Genomic analysis could enable the development of genetically targeted vaccines against COVID-19.
The researchers had previously established genetic associations with critical COVID-19 using microarray genotyping in 2,244 patients. The current study extends those findings to include novel and rare variants.
Study: Whole genome sequencing reveals host factors underlying critical Covid-19.
About the study
In this study, the team performed WGS on 7,491 critically ill COVID-19 patients admitted to 224 intensive care units (ICUs) and on 48,400 control individuals. The analysis discovered and replicated 23 genetic variants that significantly influenced COVID-19 severity. Additionally, transcriptome-wide association studies (TWAS) and colocalization analyses were performed to assess the impact of gene expression on COVID-19 severity. TWAS used gene expression data from GTEx v8 for whole blood and lung, two tissues affected in critical COVID-19.
Genome-wide association studies (GWAS) were conducted within genetic ancestry groups, including South Asian, East Asian, European (EUR), African, and combined groups. Based on the GWAS findings, the team derived credible sets of variants using Bayesian fine-mapping.
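To illustrate what a credible set is, here is a minimal Python sketch. It assumes a single causal variant per signal and uses Wakefield approximate Bayes factors; the article does not say which fine-mapping method the study used, so this is an assumed, simplified approach. The variant names, effect sizes, and prior variance are hypothetical.

```python
import math

def approx_bayes_factor(beta, se, prior_var=0.04):
    """Wakefield approximate Bayes factor in favour of association.

    prior_var is an assumed prior variance on the effect size (hypothetical value).
    """
    z = beta / se
    r = prior_var / (se ** 2 + prior_var)
    return math.sqrt(1.0 - r) * math.exp(0.5 * z * z * r)

def credible_set(stats, coverage=0.95, prior_var=0.04):
    """Return the smallest list of (variant, posterior) pairs reaching `coverage`.

    stats maps a variant id to (beta, standard_error). Assumes exactly one
    causal variant in the region and equal prior probability for each variant.
    """
    abf = {vid: approx_bayes_factor(b, s, prior_var) for vid, (b, s) in stats.items()}
    total = sum(abf.values())
    ranked = sorted(((v / total, vid) for vid, v in abf.items()), reverse=True)
    chosen, cumulative = [], 0.0
    for posterior, vid in ranked:
        chosen.append((vid, posterior))
        cumulative += posterior
        if cumulative >= coverage:
            break
    return chosen

# Hypothetical summary statistics for three variants at one locus.
example = {
    "variant_a": (0.30, 0.03),
    "variant_b": (0.10, 0.03),
    "variant_c": (0.02, 0.03),
}
for vid, posterior in credible_set(example):
    print(vid, round(posterior, 3))
```

A credible set with few variants, as in this toy example, means the data narrow the signal down to a small number of candidate causal variants.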
For replication, the team performed a meta-analysis of summary statistics from the COVID-19 Host Genetics Initiative data freeze 6 (B2) (HGIv6) and 23andMe, Inc. They also used linkage disequilibrium (LD) clumping to identify variants genotyped in both the replication and discovery studies. Because two signals failed to replicate, data on hospitalized COVID-19 patients from AncestryDNA, UK Biobank (UKB), Geisinger Health Systems, and Penn Medicine Biobank were used to perform a second GWAS.
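As a rough illustration of how summary statistics from several cohorts can be combined, the sketch below applies fixed-effect inverse-variance weighting. The article does not state which meta-analysis model the study used, so the choice of model is an assumption, and the input numbers are hypothetical.

```python
import math

def inverse_variance_meta(estimates):
    """Fixed-effect inverse-variance meta-analysis for one variant.

    estimates is a list of (beta, standard_error) pairs, one per cohort.
    Returns the pooled beta, its standard error, and the z-score.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se

# Hypothetical per-cohort effect estimates (log odds ratios) for one variant.
pooled_beta, pooled_se, z = inverse_variance_meta([(0.20, 0.05), (0.10, 0.05)])
print(pooled_beta, pooled_se, z)
```

Cohorts with smaller standard errors get larger weights, so the pooled standard error is smaller than that of any single cohort.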
The team used HIBAG to investigate the contribution of HLA alleles to critical COVID-19. In addition, SKAT (sequence kernel association test) analysis and burden testing were performed to evaluate the effects of rare variants (minor allele frequency [MAF] <0.5%) in severe COVID-19. Because the HLA region has a complicated LD structure, colocalization was assessed only for HLA-DRB1, which showed a significant association. They also performed generalized summary-data-based Mendelian randomization (GSMR) using protein quantitative trait loci (pQTLs) from the INTERVAL study to test whether genetically predicted protein levels affect COVID-19 severity.
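The idea behind a rare-variant burden analysis can be sketched as follows: variants below the frequency threshold are collapsed into a single per-person count, which can then be compared between cases and controls. This is a simplified illustration under assumed inputs, not the study's pipeline. The genotype encoding (0, 1, or 2 copies of the alternate allele) and the toy data are hypothetical.

```python
def rare_variant_burden(genotypes_by_variant, maf_threshold=0.005):
    """Count rare minor alleles carried by each individual.

    genotypes_by_variant maps a variant id to a list of alternate-allele
    counts (0, 1, or 2), one entry per individual, in the same order.
    Only variants with 0 < MAF < maf_threshold contribute to the score.
    """
    n = len(next(iter(genotypes_by_variant.values())))
    scores = [0] * n
    for calls in genotypes_by_variant.values():
        alt_freq = sum(calls) / (2 * n)
        maf = min(alt_freq, 1.0 - alt_freq)
        if maf == 0 or maf >= maf_threshold:
            continue  # monomorphic or too common to count as rare
        for i, g in enumerate(calls):
            # Count copies of whichever allele is the minor one.
            scores[i] += g if alt_freq <= 0.5 else 2 - g
    return scores

# Toy data for 200 individuals: one rare variant and one common variant.
toy = {
    "rare_variant": [1] + [0] * 199,      # MAF = 1/400 = 0.25%
    "common_variant": [1] * 100 + [0] * 100,  # MAF = 25%
}
scores = rare_variant_burden(toy)
print(sum(scores))
```

In a real analysis the per-person scores would be computed per gene and tested against case status with covariates; that step is omitted here.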
Results
The analysis identified 16 new independent associations, including variants in genes involved in interferon (IFN) signaling (PLSCR1, IL10RB), blood type antigen secretor status (FUT2), and leukocyte differentiation (BCL11A). Additionally, increased expression of a mucin (MUC1) and reduced expression of a membrane flippase (ATP11A) were associated with critical COVID-19.
The researchers found 22 distinct credible variant sets in the EUR ancestry group and two through the trans-ancestry meta-analysis. Among the fine-mapped association signals, the strongest was at 5q31.1. Moreover, 50% of the signals had credible sets comprising five or fewer variants. The credible sets included a missense variant in the gene for granulocyte-macrophage colony-stimulating factor (GM-CSF, encoded by CSF2), which previous studies found to be strongly upregulated in critical COVID-19.
Deleterious missense variants were detected at the 9p21.3 and 3q24 signals, affecting IFNA10 and PLSCR1, respectively. Although rare variants with minor allele frequency (MAF) greater than 0.02% were also fine-mapped, no additional variants entered the main credible sets. Age stratification (below or above 65 years) revealed a signal at the 3p21.31 locus in the EUR ancestry group with a substantially stronger effect among younger individuals, irrespective of sex.
More than 20 significant associations detected in the GWAS were replicated. The two signals that did not replicate corresponded to the human leukocyte antigen (HLA) locus and rare variants. No significant association was found between rare-variant burden and critical COVID-19. The lead variant at the HLA locus, rs9271609, was located near the HLA-DRB1 and HLA-DQA1 genes, and only the HLA-DRB1*04:01 allele attained gene-wide significance. However, colocalization for HLA-DRB1 was not significant in the TWAS analysis. Nevertheless, signals for 16 genes colocalized significantly with susceptibility.
GSMR analysis indicated that, of the 16 significant protein associations, eight were replicated in an external dataset. Additionally, Mendelian randomization provided genetic evidence of a causal role for platelet activation (PDGFRL) and coagulation factors (F8) in critical COVID-19.
The study findings demonstrated genetic associations with distinct signals at their associated loci, implicating novel biological mechanisms underlying severe COVID-19. Additionally, reduced IFN signaling was associated with severe disease. Therapies targeting these genetic and immunological alterations could provide novel strategies to mitigate COVID-19.