In a recent study published in Nature Ecology & Evolution, researchers harnessed publicly available viral genomic data, using a comprehensive suite of network and phylogenetic analyses to investigate the evolutionary mechanisms underpinning recent viral host jumps.
Study: The evolutionary drivers and correlates of viral host jumps. Image Credit: Kateryna Kon/Shutterstock.com
Background
Viruses found in non-human vertebrates frequently cause infectious diseases, outbreaks, epidemics, and pandemics when they spread to humans. Zoonotic host jumps, or the transmission of viruses from wild and domestic animal populations to people, have considerably impacted human health.
Current knowledge is insufficient to forecast, prevent, and control future infectious disease threats, since only a tiny proportion of viral diversity has been characterized and surveillance studies lack geographical and temporal coverage. Understanding the evolutionary processes underlying host jumps may help reduce these impacts.
About the study
In the present study, researchers analyzed approximately 12 million viral sequences and the associated host information hosted by the National Center for Biotechnology Information (NCBI) to evaluate global viral genomic surveillance.
They identified overarching patterns in the direction of viral host jumps between humans and non-human vertebrates and assessed the signals of adaptation associated with putative host jumps.
The researchers looked for signatures of adaptive evolution in viral proteins that facilitate or sustain host jumps. They obtained metadata for all viral sequences available on NCBI Virus (n = 11,645,803) to determine the extent of the available viral genomic information.
The researchers then collected 58,657 quality-controlled viral genomes from NCBI Virus, covering 32 viral families and 62 vertebrate host orders, accounting for 24% of all vertebrate viral species.
They used a species-agnostic, network-based technique to identify viral cliques, which are discrete taxonomic groupings with similar levels of genetic diversity.
The researchers identified putative host jumps within these viral cliques using curated whole-genome alignments and maximum-likelihood phylogenetic reconstruction.
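As a rough illustration of the network idea, the sketch below links genomes whose pairwise genetic distance falls at or under a threshold and reports the connected groups. The genome labels, distance values, 0.15 threshold, and the function name group_genomes are all hypothetical. This article does not describe the study's actual grouping algorithm, so the simple single-linkage grouping used here is a stand-in and not the authors' method.

```python
def group_genomes(genomes, distances, threshold):
    """Group genomes connected by chains of pairwise distances <= threshold.

    Hypothetical helper: a single-linkage stand-in, not the study's algorithm.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the group representative, shortening the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    # Merge the groups of any two genomes that are close enough.
    for (a, b), dist in distances.items():
        if dist <= threshold:
            parent[find(a)] = find(b)

    # Collect members under their representative.
    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Made-up example data for illustration only.
genomes = ["g1", "g2", "g3", "g4"]
distances = {("g1", "g2"): 0.05, ("g2", "g3"): 0.10, ("g3", "g4"): 0.40}
groups = group_genomes(genomes, distances, threshold=0.15)
```

With these made-up distances, g1, g2, and g3 end up in one group because each is linked to a neighbor under the threshold, while g4 stays on its own.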
They computed the most commonly used measure of directional selection at the genome level, i.e., the ratio of non-synonymous substitutions per non-synonymous site (dN) to synonymous substitutions per synonymous site (dS).
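To make the dN/dS ratio concrete, the following minimal sketch computes it from substitution and site counts. It assumes the counts have already been obtained by comparing two aligned coding sequences and applies the Jukes-Cantor correction in the style of the Nei-Gojobori approach. The function names and example counts are hypothetical, and the article does not state which estimator the study used.

```python
import math


def jukes_cantor(p):
    """Convert a proportion of differing sites into a corrected distance."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75) for the correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Return dN/dS from counts of substitutions and sites (hypothetical helper)."""
    dn = jukes_cantor(nonsyn_subs / nonsyn_sites)  # non-synonymous rate per site
    ds = jukes_cantor(syn_subs / syn_sites)        # synonymous rate per site
    if ds == 0:
        raise ValueError("dS is zero, so the ratio is undefined")
    return dn / ds


# Made-up counts: 10 non-synonymous changes over 700 non-synonymous sites,
# 20 synonymous changes over 300 synonymous sites.
ratio = dn_ds(10, 700, 20, 300)
```

Values below 1 are conventionally read as purifying selection and values above 1 as positive selection, which is why the ratio serves as a measure of adaptation.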
The researchers investigated whether the intensity of selection associated with a host jump decreases for viruses with broad host ranges. They used a linear model of log10(dN/dS) that included host-jump status (host jump versus non-host jump) and clique membership as predictors, to estimate the adaptation signals associated with lineages that have experienced host jumps across the various gene categories.
The researchers expected that, within each gene, adaptive changes would be limited to functionally critical regions or to regions subjected to significantly higher selective pressure from host immunity.
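A linear model of this kind can be sketched as follows. The example estimates the host-jump coefficient in a model of the form log10(dN/dS) ~ is_jump + clique by centering both variables within each clique, which gives the same coefficient as ordinary least squares with one indicator per clique. The record layout, the function name host_jump_effect, and the numbers are hypothetical, and the study's exact model specification is not given in this article.

```python
import math
from collections import defaultdict


def host_jump_effect(records):
    """Estimate the is_jump coefficient with clique treated as a fixed effect.

    records: iterable of (clique_id, is_jump, dn_ds) tuples (hypothetical layout).
    """
    by_clique = defaultdict(list)
    for clique, is_jump, dn_ds in records:
        by_clique[clique].append((float(is_jump), math.log10(dn_ds)))

    numerator = denominator = 0.0
    for rows in by_clique.values():
        # Center predictor and response within the clique.
        x_mean = sum(x for x, _ in rows) / len(rows)
        y_mean = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            numerator += (x - x_mean) * (y - y_mean)
            denominator += (x - x_mean) ** 2

    if denominator == 0:
        raise ValueError("no clique contains both host jumps and non-host jumps")
    return numerator / denominator


# Made-up data: in each clique, the host-jump lineage has twice the dN/dS.
records = [
    ("A", 0, 0.10), ("A", 1, 0.20),
    ("B", 0, 0.30), ("B", 1, 0.60),
]
effect = host_jump_effect(records)
```

A positive coefficient means that lineages with a host jump show higher log10(dN/dS) than non-jump lineages from the same clique, after removing clique-level differences in baseline dN/dS.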
Results
The study found that humans are both a source and a sink for viral spillover events, with more viral host jumps from humans to other animals than from animals to humans.
For viruses with broader host ranges, the level of adaptation associated with a host jump is lower, with structural or auxiliary genes serving as the primary targets of selection. The work exposes considerable gaps in the global genomic surveillance of animal viruses, underscoring the need for meticulous reporting of sample metadata.
The bulk of viral sequences in NCBI (68%) were linked to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), reflecting the extensive sequencing carried out during the coronavirus disease 2019 (COVID-19) pandemic.
Of the vertebrate-associated viral sequences, 93% were linked to humans. The four non-human host genera with the most sequenced viral genomes (Gallus, Sus, Anas, and Bos) were domestic animals, and 15% of viral sequences were from vertebrates.
User-uploaded host metadata for viral sequences remains inadequate: among viral genomic sequences from non-human hosts, 37% lack a sample collection date and 45% lack host information at the genus level. The fraction of missing information varies significantly between virus families and countries.
The researchers identified 5,128 viral cliques across 32 viral families, which closely matched the species defined by the International Committee on Taxonomy of Viruses (ICTV). Despite the human-centric nature of genomic surveillance, viral cliques involving only animals account for 62% of all cliques, demonstrating the wide range of animal viruses in the global viral-sharing network.
The study showed that the minimum mutational distance for a putative host jump within each viral clique was much higher than that for non-host jumps, indicating that the adaptation measurements were not biased.
The observed host range of each viral clique was positively correlated with sequencing intensity, indicating that greater surveillance effort is associated with a wider recorded host range.
The intensity of adaptation signals differed by viral family, with the strongest signals found in structural proteins for coronaviruses and in auxiliary proteins for paramyxoviruses.
Conclusion
The study findings show that publicly available genomic data can help explain viral host jumps, but that gaps remain in our understanding of viral diversity. Of the putative host jumps identified, 81% do not involve humans, highlighting the scale of the global viral-sharing network.
Investigating the flow of viruses within this network could provide insights into managing infectious disease emergence at the human-animal interface. The study found that humans transmit more viruses to animals than animals transmit to humans, and that host jumps by viruses with broad host ranges are associated with weaker signals of adaptation.
The taxonomy-agnostic approach identified cliques consistent with traditional viral species nomenclature but also highlighted inconsistencies. Monitoring human-to-animal transmission of viruses is crucial for managing infectious diseases.