Researchers develop a website to identify SARS-CoV-2 regional clusters in real-time


By Neha Mathur. Reviewed by Aimee Molineux. Jan 13 2022

In a recent study posted to the medRxiv pre-print server, a team of researchers developed a phylogenetics-based website to quickly and efficiently identify new severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) strains in a region.

Study: Identifying SARS-CoV-2 regional introductions and transmission clusters in real time. Image Credit: Dotted Yeti/Shutterstock

SARS-CoV-2 global sequencing efforts have been held back by a lack of advanced phylogenetic and analytical tools. Existing methods for phylogenetic analysis could handle only small, static datasets. They were also too computationally expensive to identify clusters of closely related samples in the ever-expanding datasets of densely sampled pathogens, including SARS-CoV-2.

This news article was a review of a preliminary scientific report that had not undergone peer review at the time of publication. Since its initial publication, the scientific report has been peer reviewed and accepted for publication in a scientific journal. Links to the preliminary and peer-reviewed reports are available in the journal references at the bottom of this article.

Even when results were available, they were not readily interpretable for an efficient public health response because intuitive visualization and data exploration tools were lacking. Overall, there is an unmet need for high-throughput tools that interpret the available data quickly, so that public health officers can take well-informed action.

About the study

The regional index (C) was the core of the phylogenetically informed summary heuristic developed for the study. It is a weighted summary of the composition of the descendants of a node in a phylogenetic tree, and it roughly corresponds to whether the virus represented by that node was inside or outside a specific region.

When a descendant leaf is genetically identical to the internal node, C equals one if that leaf is inside the region and zero if it is outside. The researchers applied additional rules to handle cases where C was undefined. The index calculation is not applicable for leaf nodes, for which accurate geographic location metadata is not available.
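As a rough illustration of a weighted summary of this kind, the Python sketch below computes a toy index for one internal node. It is not the formula from the study: the inverse-distance weighting, the Leaf type, and the function name toy_regional_index are all assumptions made for illustration. Only the special case for a genetically identical leaf follows the description above, and cases the article leaves unspecified simply return None.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    """Hypothetical descendant sample of an internal node."""
    in_region: bool   # whether the sample was collected inside the region
    distance: float   # mutations separating the leaf from the internal node


def toy_regional_index(leaves: List[Leaf]) -> Optional[float]:
    """Toy index in [0, 1]; None where this sketch leaves it undefined.

    ASSUMPTION: leaves are weighted by 1 / distance. The article only says
    the real index is "a weighted summary" and does not give the weights.
    """
    if not leaves:
        return None  # no descendants: undefined in this sketch

    # Special case described in the article: a leaf identical to the node
    # (distance 0) decides the index on its own.
    identical = [leaf for leaf in leaves if leaf.distance == 0]
    if identical:
        regions = {leaf.in_region for leaf in identical}
        if regions == {True}:
            return 1.0
        if regions == {False}:
            return 0.0
        return None  # identical leaves both inside and outside: not covered here

    # Assumed weighting: closer leaves count more than distant ones.
    weights = [1.0 / leaf.distance for leaf in leaves]
    in_weight = sum(w for w, leaf in zip(weights, leaves) if leaf.in_region)
    return in_weight / sum(weights)
```

Under these assumptions, a value near one means the node's closest descendants are mostly inside the region, and a value near zero means they are mostly outside.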

Using this method, the researchers traced SARS-CoV-2 transmission clusters in 102 countries using the global parsimony phylogenetic tree built from the 5,563,847 SARS-CoV-2 sequences available on GISAID, GenBank, and COG-UK on 28 November 2021. Cluster size was highly skewed, with ~20% of distinct regional clusters containing 89% of samples, suggesting that novel viral introductions do not necessarily lead to the establishment of a new locally circulating strain.
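A skew statistic of this kind (the share of samples held by the largest fraction of clusters) can be computed as in the short Python sketch below. The function name share_in_largest and the cluster sizes are made up for illustration; they are not data from the study.

```python
from typing import Sequence


def share_in_largest(cluster_sizes: Sequence[int], top_fraction: float = 0.2) -> float:
    """Fraction of all samples that fall in the largest `top_fraction` of clusters."""
    if not cluster_sizes:
        raise ValueError("cluster_sizes must not be empty")
    ordered = sorted(cluster_sizes, reverse=True)
    # Number of clusters in the top group, keeping at least one.
    top_count = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:top_count]) / sum(ordered)


# Hypothetical cluster sizes: a few large clusters and many small ones.
example_sizes = [500, 300, 40, 20, 10, 5, 2, 1, 1, 1]
print(f"{share_in_largest(example_sizes):.0%} of samples are in the largest 20% of clusters")
```

A value far above the chosen fraction, as in this made-up example, indicates that most introductions stay small while a few grow into large clusters.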

Findings

Over 50% of the samples in the genome sequence repositories originated from the USA or the UK. This substantially restricted the global transmission analysis, because inferring a cluster’s origin depends on the robustness of sequencing at the origin. Therefore, the researchers focused on the US data, where sequencing across each state was relatively comprehensive and robust, and detailed state-level metadata was available for most samples.

As of November 2021, over 300,000 distinct state-level SARS-CoV-2 infection clusters had been found in the USA since the beginning of the pandemic. Of these, 84% of clusters had an assigned origin, and 7% of clusters had an international origin, with the majority reflecting transmission within the USA. As expected, Mexico and Canada were among the most common international origin regions, given their long land borders. England was also relatively common because it is well-sampled. These findings suggested that sequencing effort in a given region biases the identification of the origin of new clusters.

The most significant achievement of this work was the development of Cluster-Tracker, an open-source website that is updated daily. The website assisted the exploration and prioritization of the latest genome sequences from across the USA, quickly identifying the clusters most likely to be of interest for public health action. Any user could use the website and its flexible backend pipeline to construct a similar site for any set of regions (e.g., at the country level), allowing people to explore SARS-CoV-2 phylogenetic data.

Conclusions

The open-source tools, methodologies, and software package described in the study could prove immensely useful for researchers worldwide. With them, researchers could quickly draw inferences from vast sequence datasets and explore geographic structure within the global SARS-CoV-2 phylogeny to understand the spread of SARS-CoV-2 in specific areas. The same approach could extend to other densely sampled pathogens. In addition, this analytical approach performed well on simulated data and was congruent with a more sophisticated analysis performed during the pandemic.

More importantly, the researchers presented an accessible open-source interactive interface for their results, which could automatically compute and display introductions and clusters with each update to the global phylogenetic tree.

To summarize, this work will empower public health officers to explore the spread of SARS-CoV-2 across the USA and even support public health groups globally to quickly understand and apply insights obtained from the most recent genomic data.

Journal references:

Preliminary scientific report. Jakob McBroome, Jennifer Martin, Adriano de Bernardi Schneider, Yatish Turakhia, Russell Corbett-Detig. (2022). Identifying SARS-CoV-2 regional introductions and transmission clusters in real time. medRxiv. doi: https://doi.org/10.1101/2022.01.07.22268918 https://www.medrxiv.org/content/10.1101/2022.01.07.22268918v1
Peer reviewed and published scientific report. McBroome, Jakob, Jennifer Martin, Adriano de Bernardi Schneider, Yatish Turakhia, and Russell Corbett-Detig. 2022. “Identifying SARS-CoV-2 Regional Introductions and Transmission Clusters in Real Time.” Virus Evolution 8 (1). https://doi.org/10.1093/ve/veac048. https://academic.oup.com/ve/article/8/1/veac048/6609172.

Article Revisions

May 10 2023 - The preprint preliminary research paper that this article was based upon was accepted for publication in a peer-reviewed Scientific Journal. This article was edited accordingly to include a link to the final peer-reviewed paper, now shown in the sources section.


Written by

Neha Mathur

Neha is a digital marketing professional based in Gurugram, India. She has a Master’s degree from the University of Rajasthan with a specialization in Biotechnology in 2008. She has experience in pre-clinical research as part of her research project in The Department of Toxicology at the prestigious Central Drug Research Institute (CDRI), Lucknow, India. She also holds a certification in C++ programming.


Citation

Mathur, Neha. (2023, May 10). Researchers develop a website to identify SARS-CoV-2 regional clusters in real-time. News-Medical. https://www.news-medical.net/news/20220113/Researchers-develop-a-website-to-identify-SARS-CoV-2-regional-clusters-in-real-time.aspx.
