The ongoing coronavirus disease 2019 (COVID-19) pandemic, caused by the rapid outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has severely affected the global economy and healthcare system.
SARS-CoV-2 is an RNA virus with a high mutation rate, causing rapid evolution.
Background
Natural selection plays an important role in shaping the virulence and transmissibility of SARS-CoV-2 through specific adaptive mutations. Similar patterns have been reported for the Zika and Ebola viruses. Since SARS-CoV-2 emerged in Wuhan, China, in 2019, it has accumulated many genetic mutations.
Researchers have studied and documented the entire genome of the virus and have, thus far, reported around 29,735 nucleotide substitutions.
Scientists have observed that SARS-CoV-2 lineages vary considerably in transmissibility and clinical manifestations. Some SARS-CoV-2 variants are more contagious than the original strain.
Additionally, certain SARS-CoV-2 variants can evade immune responses elicited by COVID-19 vaccination or natural infection. Considering these differences in traits, the World Health Organization (WHO) has classified SARS-CoV-2 variants as variants of concern (VOC) and variants of interest (VOI).
Scientists have expressed the importance of predicting epidemic trends, formulating effective strategies for disease control, and developing efficient COVID-19 vaccines to protect individuals from the disease. Additionally, they underscored the importance of understanding how natural selection drove the evolution of virulence and infectiousness of SARS-CoV-2 during the pandemic.
There is a gap in research on identifying the functional mutations that influence the evolution of the epidemiological and pathogenic characteristics of SARS-CoV-2. Previous studies have proposed two evolutionary hypotheses for SARS-CoV-2: evolution of the virus within the animal host, and evolution in the human population after zoonotic transfer.
To date, evaluation of natural selection on SARS-CoV-2 has primarily focussed on the host-shifting phase, i.e., from animals to humans. In this case, researchers have studied the sequence divergence between SARS-CoV-2 and closely related viruses, e.g., BatCoV-RaTG13. An effective method that can handle indeterminate ancestral sequences, account for clustered infections of SARS-CoV-2, and reduce sampling bias is still lacking.
Most available studies have based their analyses on allele frequency changes of individual mutations, which might not have been caused by natural selection. Hence, researchers stated that it is important to determine the candidate mutant loci of natural selection. So far, no study has screened the entire SARS-CoV-2 genome to determine the evolutionary landscape of functional mutations and understand their epidemiological effects.
About the study
Scientists have recently used a novel method to detect natural selection in SARS-CoV-2 evolution. They stated that, compared with conventional methods, which are affected by founder effects, viral clustering infections, and sampling bias in viral genomic data, the current method is significantly improved.
The researchers hypothesized that ongoing positive selection strongly influences SARS-CoV-2 genomes and plays an important role in shaping the dynamics of the COVID-19 pandemic. This study is available as a pre-proof in Genomics, Proteomics & Bioinformatics.
In this study, researchers obtained SARS-CoV-2 sequences from the 2019 Novel Coronavirus Resource and the Global Initiative on Sharing All Influenza Data. They included 3,328,405 sequences from 169 countries. Scientists used MUSCLE to align these sequences and determined nucleotide mutations by comparing the sequences with the reference sequence, i.e., the sequence of the original SARS-CoV-2 strain.
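The mutation-calling step described above can be illustrated with a minimal sketch. It assumes the sample has already been aligned to the reference (for example, by MUSCLE) so that both strings have the same length and use "-" for gaps. The function name call_substitutions and the toy sequences are hypothetical and are not taken from the study.

```python
def call_substitutions(ref_aligned, sample_aligned):
    """List nucleotide substitutions in a sample relative to a reference.

    Both inputs are rows of the same alignment (equal length, '-' for gaps).
    Positions are 1-based and counted along the ungapped reference.
    Gaps and ambiguous bases (anything other than A, C, G, T) are skipped.
    """
    if len(ref_aligned) != len(sample_aligned):
        raise ValueError("aligned sequences must have equal length")

    valid_bases = set("ACGT")
    substitutions = []
    ref_position = 0
    for ref_base, sample_base in zip(ref_aligned.upper(), sample_aligned.upper()):
        if ref_base != "-":
            ref_position += 1  # advance only on real reference bases
        if (ref_base in valid_bases and sample_base in valid_bases
                and ref_base != sample_base):
            substitutions.append((ref_position, ref_base, sample_base))
    return substitutions


# Toy alignment (made-up sequences, for illustration only)
reference = "ATGC-TTAGC"
sample = "ATGCATCAGN"
print(call_substitutions(reference, sample))  # [(6, 'T', 'C')]
```

In this toy case the insertion after reference position 4 and the ambiguous final base are ignored, leaving a single T-to-C substitution at reference position 6.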
Scientists partitioned these viral sequences into clusters according to genomic similarity based on global transmissibility and clustering outbreaks. They constructed a temporal and spatial landscape of mutations on top of the clusters, which helped determine mutations that are pathogenic and cause severe or altered clinical symptoms.
Findings
To determine the effect of natural selection on SARS-CoV-2, researchers compared the relative excess of non-synonymous over synonymous substitutions. This approach is logically similar to the McDonald–Kreitman test in molecular evolution. The method proposed in the current study, referred to as NSRF1, compares genetic polymorphisms within a species.
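For readers unfamiliar with the McDonald–Kreitman test mentioned above, the following sketch shows its usual summary statistic, the neutrality index. This is the classical test, not the NSRF1 method itself, and the counts in the example are invented for illustration.

```python
def neutrality_index(pn, ps, dn, ds):
    """Neutrality index of the classical McDonald-Kreitman test.

    pn, ps: non-synonymous and synonymous polymorphisms within a species
    dn, ds: non-synonymous and synonymous fixed differences between species
    NI = (pn / ps) / (dn / ds)
    """
    if ps == 0 or dn == 0 or ds == 0:
        raise ValueError("ps, dn and ds must be non-zero")
    return (pn / ps) / (dn / ds)


# Invented counts, for illustration only
print(neutrality_index(pn=10, ps=40, dn=20, ds=20))  # 0.25
```

A neutrality index below 1 reflects an excess of non-synonymous divergence, which is conventionally read as a sign of positive selection. A value above 1 reflects an excess of non-synonymous polymorphism, which is consistent with purifying selection.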
NSRF1 is a novel method that compares the relative abundance of non-synonymous and synonymous mutations (the Nm/Sm ratio) between mutations with high and low allele frequencies. Researchers examined whether Nm/Sm ratios showed an increasing or decreasing trend as mutant allele frequency rose. In this context, scientists assumed that mutations with higher frequencies have tended to undergo a longer duration of natural selection.
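The idea of tracking the Nm/Sm ratio across allele frequency classes can be sketched as follows, taking Nm and Sm as the counts of non-synonymous and synonymous mutations, in line with the comparison of non-synonymous and synonymous substitutions described under Findings. This is a simplified illustration and not the authors' implementation. The function name, the bin edges, and the toy data are all assumptions.

```python
from bisect import bisect_right


def nm_sm_by_frequency(mutations, bin_edges):
    """Compute Nm, Sm and Nm/Sm for each allele-frequency bin.

    mutations: iterable of (allele_frequency, is_nonsynonymous) pairs
    bin_edges: ascending inner edges, e.g. [0.001, 0.1] gives three bins:
               below 0.001, from 0.001 up to 0.1, and 0.1 or above
    Returns a list of (nm, sm, ratio) tuples, lowest-frequency bin first.
    The ratio is None when a bin contains no synonymous mutations.
    """
    counts = [[0, 0] for _ in range(len(bin_edges) + 1)]
    for frequency, is_nonsynonymous in mutations:
        bin_index = bisect_right(bin_edges, frequency)
        counts[bin_index][0 if is_nonsynonymous else 1] += 1

    return [(nm, sm, nm / sm if sm else None) for nm, sm in counts]


# Toy data (hypothetical frequencies and annotations)
toy_mutations = [
    (0.0005, True), (0.0008, False), (0.0009, False),
    (0.02, True), (0.03, False),
    (0.4, True), (0.6, True), (0.7, False),
]
for nm, sm, ratio in nm_sm_by_frequency(toy_mutations, [0.001, 0.1]):
    print(nm, sm, ratio)
```

In the toy data the ratio climbs from 0.5 to 1.0 to 2.0 as allele frequency increases. A real analysis would need many more mutations per bin and a statistical test of the trend before drawing any conclusion about selection.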
The findings of the study indicate that ongoing positive selection has driven the adaptation of SARS-CoV-2 to humans, in terms of enhanced transmissibility and escape from host antiviral immunity.
Concluding remarks
The authors stated that a proportional increase or decrease in the Nm/Sm ratio is a more efficient indicator of natural selection. The current study presented multiple lines of evidence that SARS-CoV-2 genomes have been constrained under purifying selection during the pandemic.
Importantly, the study provided a list of 556 mutations as putative target sites of natural selection. Scientists revealed that mutations identified through between-cluster divergence or within-cluster frequency changes enhance pathogenicity and infectivity. This list provides a foundation for future studies related to clinical treatment.