A knowledgebase of currently known categorized COVID-19 severity variants


By Pooja Toshniwal Paharia. Reviewed by Danielle Ellis, B.Sc. Nov 10 2022

In a recent study posted to the medRxiv* preprint server, researchers assessed human genome variants related to the susceptibility and severity of coronavirus disease 2019 (COVID-19).

Studies have reported that the genomic susceptibility of the host can increase the risk of severe SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) infections. Many studies have been conducted on host genetics for COVID-19 susceptibility; however, data on COVID-19-related variants are limited, and a database of variants stratified by confidence levels is lacking. In addition, computational tools to predict severe COVID-19-associated variants are currently unavailable.

Study: A comprehensive knowledgebase of known and predicted human genetic variants associated with COVID-19 susceptibility and severity. Image Credit: Orpheus FX / Shutterstock

*Important notice: medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, guide clinical practice/health-related behavior, or treated as established information.

About the study

In the present study, researchers explored the host genetic factors underlying susceptibility to SARS-CoV-2 infection and the severity of COVID-19.

The biological functions of SARS-CoV-2 infection susceptibility/severity genes were explored using gene enrichment, feature importance, network, and pathway analyses. In addition, the team conducted phenome-wide association studies (PheWAS) on 39,386 individuals genotyped by the Mount Sinai BioMe BioBank to evaluate the pleiotropic effects of SARS-CoV-2 infection-associated variants and identify physiological similarities between COVID-19 and associated disorders.

A machine learning-based classifier was developed to predict severe COVID-19-associated variants among 82,468,698 human genomic missense variants. Further, a website of SARS-CoV-2 infection-associated host genomic variants was created for searching, submitting, and downloading COVID-19 susceptibility-associated genetic variants. The classifier’s predictions were interpreted using SHAP (SHapley Additive exPlanations) and feature importance analysis.
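To give a sense of what a Shapley-value explanation does, the following is a minimal sketch that computes exact Shapley values for a toy scoring function. It is not the SHAP library and not the study's classifier; `toy_score`, the feature names, and the all-zero baseline are assumptions made purely for illustration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Averages each feature's marginal contribution over every feature ordering,
    so it is only practical for a handful of features.
    """
    names = list(x)
    totals = dict.fromkeys(names, 0.0)
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)      # start from the baseline input
        previous = model(current)
        for name in order:
            current[name] = x[name]   # switch this feature to its real value
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_score(features):
    # Hypothetical scoring function with an interaction term (assumption).
    c, m = features["conservation"], features["maf"]
    return 0.5 * c + 0.2 * m + 0.3 * c * m


x = {"conservation": 1.0, "maf": 1.0}
baseline = {"conservation": 0.0, "maf": 0.0}
attributions = shapley_values(toy_score, x, baseline)
```

The attributions always sum to the difference between the score for `x` and the score for the baseline, which is the property that makes Shapley values useful for explaining a single prediction.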

COVID-19-associated genetic variants were categorized into four categories: (i) mild or asymptomatic SARS-CoV-2-associated variants; (ii) variants that could elevate symptomatic COVID-19 risks; (iii) known severe COVID-19-associated variants, e.g., those associated with critical COVID-19-associated pneumonia and ICU (intensive care unit) admissions; and (iv) variants involved in structural destabilization of proteins related to SARS-CoV-2 infection susceptibility.

The variants were also grouped by confidence level into five categories: (i) CAV (COVID-19-associated variants); (ii) CAV-FE (CAV with functional evidence); (iii) Allele frequency-FCP (COVID-19 prevalence correlation); (iv) IP (in silico prediction); and (v) Allele frequency-FCP + IP. CAV and CAV-FE category variants were identified through candidate gene approaches and association studies. In addition, the team identified FCP variants in studies investigating the association between the frequency of probable COVID-19-associated variants and the prevalence of SARS-CoV-2 infections in several populations.
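As a rough illustration of how such a confidence-stratified catalogue might be represented, the sketch below models the five categories as an enumeration and filters records by category. The record fields, identifiers, and gene names are hypothetical placeholders and do not reflect the schema of the study's website.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Confidence(Enum):
    """The five confidence categories named in the article."""
    CAV = "COVID-19-associated variant"
    CAV_FE = "CAV with functional evidence"
    FCP = "Allele frequency-FCP (COVID-19 prevalence correlation)"
    IP = "in silico prediction"
    FCP_IP = "Allele frequency-FCP + IP"


@dataclass(frozen=True)
class VariantRecord:
    variant_id: str            # hypothetical identifier
    gene: Optional[str]        # None for an intergenic variant
    confidence: Confidence


def by_confidence(records, level):
    """Return the records assigned to one confidence category."""
    return [r for r in records if r.confidence is level]


# Made-up example records (assumptions, not data from the study).
records = [
    VariantRecord("var-001", "GENE_A", Confidence.CAV_FE),
    VariantRecord("var-002", None, Confidence.CAV),
    VariantRecord("var-003", "GENE_B", Confidence.IP),
]
functional = by_confidence(records, Confidence.CAV_FE)
```

Here `functional` holds only the record tagged CAV-FE; the same helper could be used to pull out any other category.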

IP category deleterious variants were identified in studies that exclusively used in silico approaches to estimate the effects of amino acid substitutions on susceptibility to SARS-CoV-2 infection. CAV-FE variants and known disease-causing mutations from HGMD (Human Gene Mutation Database) were used to create a machine-learning classifier of severe COVID-19-related variants. Further, PPI (protein-protein interaction) networks, biological functions, and diseases significantly enriched by high-confidence COVID-19 genes were evaluated. Finally, LD (linkage disequilibrium)-based clustering was performed to identify COVID-19-associated variants.
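The idea behind LD-based clustering can be sketched as follows: variants whose genotypes are strongly correlated (high r²) are grouped so that each group is counted as one independent signal. This is a simple greedy version under stated assumptions, not the authors' procedure; the 0.8 threshold and the toy genotype dosages are illustrative choices.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic site: treat as unlinked
    return cov * cov / (var_x * var_y)


def ld_clusters(genotypes, threshold=0.8):
    """Greedy clustering: a variant joins the first cluster whose lead variant
    it is in LD with (r^2 >= threshold); otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of variant ids, lead variant first
    for variant_id, dosages in genotypes.items():
        for cluster in clusters:
            if r_squared(genotypes[cluster[0]], dosages) >= threshold:
                cluster.append(variant_id)
                break
        else:
            clusters.append([variant_id])
    return clusters


# Toy dosages for six individuals (made-up data).
genotypes = {
    "v1": [0, 1, 2, 1, 0, 2],
    "v2": [0, 1, 2, 1, 0, 2],  # identical to v1, so r^2 = 1
    "v3": [2, 0, 1, 0, 2, 1],  # weakly correlated with v1
}
clusters = ld_clusters(genotypes)
```

With these toy data, `v1` and `v2` fall into one cluster and `v3` forms its own, so two independent signals would be counted. Because LD patterns differ between populations, such clustering is typically run separately per ancestry group.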

Results

Text mining yielded 1,977 relevant publications and 222 eligible studies, from which 820 COVID-19-associated host genetic variants reported to affect COVID-19 susceptibility were obtained, 719 of which were present in 295 genes, and 101 were present in intergenic sites. By confidence evaluation, 196 high-confidence variants were obtained. Conservation scores, MAF (minor allele frequency), SNVs (single nucleotide variants), and genome-level evolutionary pressures showed the most significant impacts on COVID-19 susceptibility/severity variant estimation.

Genes with high-confidence COVID-19 susceptibility variants shared networks, pathways, biological functions, and diseases, and the categories of infectious diseases and the immunological systems showed the highest significance. Pre-existing thromboembolism and chronic hepatic disease could elevate COVID-19 severity risks.

Compared to pathogenic variants not associated with COVID-19, CAV-FE variants were observed at significantly less conserved sites, with MAF > 0.1 variants within 100 to 1,000 base pairs, lower de novo mutational excess rates, lower indispensability scores, and lower H3K36me3 levels, and were less likely to be associated with a disordered protein segment.

In total, 117 pathways were significantly over-represented, among which the pathways for IFN-α/β (interferon-alpha/beta) signaling, toll-like receptor 4 (TLR4) signaling, and TBK1 (TANK-binding kinase 1)/IKK (IκB kinase) epsilon-mediated activation of interferon regulatory transcription factor (IRF)3/IRF7 were the most significant. Pathways of hypercytokinemia/hyperchemokinemia in influenza pathogenesis, coronavirus pathogenesis, neuroinflammation signaling, and pathogen-induced cytokine storm signaling were also among the most significant pathways.
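One common way to decide whether a pathway is over-represented in a gene set is a one-sided hypergeometric test. The article does not state which test the authors used, so the sketch below is only an illustration of the general technique; the gene counts are made-up numbers.

```python
from math import comb


def enrichment_p(background, in_pathway, in_study, overlap):
    """One-sided hypergeometric p-value: the probability of seeing at least
    `overlap` pathway genes when `in_study` genes are drawn at random from
    `background` genes, of which `in_pathway` belong to the pathway."""
    upper = min(in_pathway, in_study)
    favourable = sum(
        comb(in_pathway, i) * comb(background - in_pathway, in_study - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background, in_study)


# Made-up counts: 20 background genes, 5 in the pathway,
# 4 study genes, 2 of which are in the pathway.
p = enrichment_p(background=20, in_pathway=5, in_study=4, overlap=2)
```

In a real analysis, a p-value like this would be computed for every pathway and then corrected for multiple testing before a pathway is called significantly over-represented.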

The most significantly enriched human phenotype ontology (HPO) term was ‘recurrent viral infections’. LD-based analysis showed that 285, 286, and 288 variants were independently associated with COVID-19 among African Americans, European Americans, and Hispanic Americans across 458, 466, and 629 phenotypes, respectively.

Overall, the study presents a comprehensive knowledgebase of human genomic variants related to SARS-CoV-2 infection, together with a machine learning-based classifier and precomputed predictions for host genomic missense variants based on gene-, variant-, network-, and protein-level features.

Journal reference:

Preliminary scientific report. A comprehensive knowledgebase of known and predicted human genetic variants associated with COVID-19 susceptibility and severity. Meltem Ece Kars, David Stein, Çiğdem Sevim Bayrak, Peter D Stenson, David N Cooper, Yuval Itan. medRxiv preprint 2022, DOI: https://doi.org/10.1101/2022.11.03.22281867, https://www.medrxiv.org/content/10.1101/2022.11.03.22281867v1


Written by

Pooja Toshniwal Paharia

Pooja Toshniwal Paharia is an oral and maxillofacial physician and radiologist based in Pune, India. Her academic background is in Oral Medicine and Radiology. She has extensive experience in research and evidence-based clinical-radiological diagnosis and management of oral lesions and conditions and associated maxillofacial disorders.
