In a recent study published in Science, researchers demonstrated that the zoonotic spillover of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to humans involved two separate cross-species transmission events, one for lineage A viruses and one for lineage B viruses.
Background
Understanding the circumstances that lead to a pandemic is crucial to preventing future ones. In the case of the coronavirus disease 2019 (COVID-19) pandemic, which began in late 2019 at the Huanan market in Wuhan, China, the diversity of SARS-CoV-2 quickly increased, leading to the emergence of multiple variants of concern (VOCs). However, its initial phase was marked by only two major lineages, ‘A’ and ‘B.’
The earliest sampled genome, Wuhan/IPBCAMS-WH-01/2019, and the reference genome, Wuhan/Hu-1/2019, sampled on 24 and 26 December 2019, respectively, both belong to lineage B, which has remained the most common lineage throughout the pandemic. Two later samples from Wuhan, collected on 30 December 2019 and 5 January 2020, contained lineage A viruses.
Lineage B viruses carry a ‘C/T’ pattern at two key nucleotide positions (C8782 and T28144), whereas lineage A viruses carry a ‘T/C’ pattern at the same positions (C8782T and T28144C). Previous studies have left several questions about the evolution of these two SARS-CoV-2 lineages unanswered. For instance, why did lineage B predominate in the early pandemic phase despite being more distantly related than lineage A to sarbecoviruses from Rhinolophus bats, the presumed host reservoir of SARS-CoV-2?
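The two-site lineage definition above can be sketched in a few lines of Python. This is a minimal illustration, not the study's method: the function name `classify_haplotype` and the label strings are hypothetical, and the sketch assumes the genome string is already aligned to the Wuhan/Hu-1 reference so that the 1-based positions 8782 and 28144 can be read directly.

```python
# Hypothetical helper: label a genome by its bases at the two lineage-defining sites.
# Assumption: `genome` is aligned to the Hu-1 reference coordinates (1-based positions).

LINEAGE_SITES = (8782, 28144)

HAPLOTYPE_LABELS = {
    ("C", "T"): "lineage B (C/T)",
    ("T", "C"): "lineage A (T/C)",
    ("C", "C"): "intermediate C/C",
    ("T", "T"): "intermediate T/T",
}


def classify_haplotype(genome: str) -> str:
    """Return a haplotype label from the bases at positions 8782 and 28144."""
    if len(genome) < max(LINEAGE_SITES):
        # Sequence too short to cover both sites.
        return "undetermined"
    # Convert 1-based genome positions to 0-based string indices.
    bases = tuple(genome[pos - 1].upper() for pos in LINEAGE_SITES)
    # Ambiguous or missing calls (for example "N") fall through to "undetermined".
    return HAPLOTYPE_LABELS.get(bases, "undetermined")
```

A real pipeline would first align each sequence to the reference; the direct indexing here is only meaningful under that assumption.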
About the study
In the present study, researchers gathered genomic and epidemiological data from the early phase of the COVID-19 pandemic to determine the ancestral haplotype and the genomic characteristics of the most recent common ancestor (MRCA) of SARS-CoV-2 to help understand the evolution of lineages A and B.
They deployed phylodynamic rooting methods combined with epidemic simulations to study SARS-CoV-2 genomic diversity before February 2020. The researchers reconstructed the genome of a hypothetical progenitor of SARS-CoV-2 to study its mutational pattern. Further, they used a random-effects substitution model to infer the ancestral SARS-CoV-2 haplotype. Finally, the researchers inferred the infection dates of the lineage A and lineage B primary cases, accounting for the symptom onset date and the earliest documented COVID-19 hospitalization date.
Study findings
The researchers identified 787 near-full-length genomes from SARS-CoV-2 lineages A and B sampled by 14 February 2020. C/C and T/T genomes appeared repeatedly throughout the pandemic as a result of convergent evolution. The genome of the recombinant common ancestor (“recCA”) differed from Hu-1 by just 381 substitutions, including C8782T and T28144C. This finding indicated that genetic similarity to related viruses was a poor proxy for the ancestral haplotype. The authors observed 23 unique reversions and 631 unique substitutions across the SARS-CoV-2 phylogeny by February 2020.
The authors also inferred the ancestral haplotype of the 787 lineage A and B genomes sampled by 14 February 2020. Phylodynamic rooting favored a lineage B or C/C ancestral haplotype, although lineage B showed more divergence from the root of the SARS-CoV-2 phylogenetic tree than expected. The analysis further showed that a lineage A ancestral haplotype was inconsistent with the molecular clock (Bayes factor [BF] = 48.1). Because of the C-to-T transition bias, the T/T ancestral haplotype was also disfavored (BF > 10). Epidemic simulations did not support the notion that a single introduction of SARS-CoV-2 gave rise to the observed phylogeny.
The researchers could infer only three possible ancestral haplotypes: lineage A, lineage B, and C/C. Further, they inferred the time of the most recent common ancestor (tMRCA) of SARS-CoV-2 to be 11 December 2019, an estimate that remained consistent across the recCA-rooted and fixed ancestral haplotype analyses. Of the genomes sequenced early in the pandemic, 35.2% belonged to lineage A and 64.8% to lineage B. Moreover, the phylogeny contained a large polytomy, in which multiple lineages descend from a single node on the phylogenetic tree. The inferred infection dates of the lineage B and lineage A primary cases were 18 and 25 November 2019, respectively. In 64.6% of the posterior samples, the lineage B cases predated lineage A cases by an average of seven days.
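The idea of a polytomy mentioned above can be made concrete with a toy tree. The sketch below is an illustration only: the nested-dictionary tree, the node names, and the function `find_polytomies` are hypothetical and do not come from the study's data. It flags any node with more than two direct descendants, which is the usual working definition of a polytomy in an otherwise bifurcating tree.

```python
def find_polytomies(tree: dict, min_children: int = 3) -> list:
    """Return the names of nodes with at least `min_children` direct descendants."""
    found = []
    stack = [tree]  # iterative depth-first traversal
    while stack:
        node = stack.pop()
        children = node.get("children", [])
        if len(children) >= min_children:
            found.append(node["name"])
        stack.extend(children)
    return found


# Toy example (made-up topology): node "n1" has four direct descendants.
toy_tree = {
    "name": "root",
    "children": [
        {"name": "n1", "children": [
            {"name": "s1"}, {"name": "s2"}, {"name": "s3"}, {"name": "s4"},
        ]},
        {"name": "n2", "children": [{"name": "s5"}, {"name": "s6"}]},
    ],
}

print(find_polytomies(toy_tree))  # prints ['n1']
```

In this toy tree only `n1` is reported, because `root` and `n2` each split into exactly two branches.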
Conclusions
The study results defined a narrow window during which SARS-CoV-2 first spilled over into humans to cause the first cases of COVID-19. The spillover involved two independent zoonotic events, with lineage A and B progenitor viruses co-circulating in non-human mammals before crossing into humans. The first event occurred around 18 November 2019 and involved lineage B viruses, while the second occurred within weeks of the first and involved lineage A viruses. Additional cryptic introductions likely accompanied these two zoonotic events, and the authors also raised the possibility of failed introductions of intermediate SARS-CoV-2 haplotypes.
Multiple studies have demonstrated that SARS-CoV-2 has the potential for reverse zoonosis in Syrian hamsters and white-tailed deer. Overall, these findings suggest that SARS-CoV-2 did not have to adapt within humans to spread.