In a recent study posted to the bioRxiv* preprint server, researchers evaluated all biochemical changes in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike (S) protein to explore their evolutionary significance.
*Important notice: bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, guide clinical practice/health-related behavior, or treated as established information.
Background
The S protein dominates the exterior surface of SARS-CoV-2 and is therefore the viral protein under the greatest pressure to change physically and biochemically to avoid immune recognition. In addition, a second level of selective pressure may act on the S protein as SARS-CoV-2 adapts to physical transmission between humans.
Over the course of the coronavirus disease 2019 (COVID-19) pandemic, the total charge of the SARS-CoV-2 S protein has risen from -8.28 in the original lineages A and B to -1.26 in the most recent Omicron variant of concern (VOC). Two previous studies, by Pawłowski in 2021 and Nie et al. in 2022, had noted this pattern.
About the study
In the present study, researchers used over 11 million SARS-CoV-2 genomic sequences retrieved up to June 15, 2022, to monitor changes in the S protein, including its charge, size, hydrophobicity, and folding, over two years of SARS-CoV-2 evolution. They performed a principal component analysis (PCA) to identify the physical features of S most strongly associated with the SARS-CoV-2 lineages reported in genomic surveillance data.
The team extracted intact S protein sequences and collected them into a matrix. They also investigated how the charge of the S protein changed throughout the pandemic and plotted the total S charge for all genomes per month of the pandemic. Furthermore, the researchers retrieved the S coding region of all available coronavirus (CoV) 229E full-genome sequences from GenBank.
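The per-month summary described above can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the function name `charge_by_month`, the use of the median as the summary statistic, and the sample dates are assumptions, and the charge values are simply those quoted in this article.

```python
from collections import defaultdict
from statistics import median


def charge_by_month(records):
    """Group (ISO date, S charge) pairs by YYYY-MM and return the median per month."""
    bins = defaultdict(list)
    for date, charge in records:
        # The first seven characters of an ISO date are "YYYY-MM".
        bins[date[:7]].append(charge)
    return {month: median(values) for month, values in sorted(bins.items())}


# Hypothetical collection dates paired with charge values quoted in the article.
records = [
    ("2020-01-15", -8.28),
    ("2020-01-20", -8.28),
    ("2020-01-28", -7.28),
    ("2020-04-02", -7.28),
]
monthly = charge_by_month(records)
print(monthly)
```

A dictionary keyed by month like this could then be passed to any plotting library to draw the charge trajectory over time.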
Study findings
The PCA showed that eight top features clustered SARS-CoV-2 S sequences by lineage: charge, GRAVY (grand average of hydropathy), the fractions of threonine (T), arginine (R), aspartic acid (D), lysine (K), and glycine (G), and instability. The results also highlighted S protein charge as the key feature distinguishing the SARS-CoV-2 lineages that evolved during the first two years of the COVID-19 pandemic.
The results showed a clear stepwise increase in S charge over two years of SARS-CoV-2 evolution. The first step occurred between early 2020 and March 2020, when the charge rose to -7.28. A second step in mid-2021 raised it further, to -3.28. The last step, in late 2021, brought it to around -1.26. Each step coincided with the rise of a major SARS-CoV-2 lineage reported at that time. For instance, the majority of Omicron sub-variants showed an S protein charge of around -1.3.
The location of the substitutions in the S protein indicated the functional consequences of the observed changes in S charge. The authors noted that the initial change in S charge was associated with the substitution of an aspartic acid residue (D, -1) with a glycine (G, neutral) relative to the initial Lineage B genome sequences. While the Delta lineage S gained additional positive charge in the angiotensin-converting enzyme 2 (ACE2) binding region and the heptad repeat 1 (HR1) region, Omicron S showed a predominance of positive charge changes in the receptor-binding domain (RBD).
A comparison of 229E CoVs infecting camels and humans revealed a difference of almost nine charge units in median S charge. However, the researchers observed no difference in S charge between Middle East Respiratory Syndrome coronaviruses (MERS-CoV) infecting humans and camels.
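As a rough illustration of how some of these features could be turned into one row of a feature matrix, the sketch below computes GRAVY with the standard Kyte-Doolittle hydropathy scale together with the five residue fractions. The function names and the dictionary layout are hypothetical. Charge and the instability index are left out for brevity, and the sketch assumes sequences that contain only the 20 standard amino acids.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residues whose fractions the article lists among the top PCA features.
FRACTION_RESIDUES = ("T", "R", "D", "K", "G")


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def feature_row(sequence):
    """Build one feature-matrix row (hypothetical layout) for a protein sequence."""
    seq = sequence.upper()
    row = {"gravy": gravy(seq)}
    for aa in FRACTION_RESIDUES:
        row["fraction_" + aa] = seq.count(aa) / len(seq)
    return row


# Toy peptide, not a real spike sequence.
row = feature_row("TTGG")
print(row)
```

Rows like this, one per genome, would form the matrix on which a PCA could be run with any standard numerical library.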
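To make the effect of a D-to-G substitution on net charge concrete, here is a minimal sketch that estimates protein charge at a given pH with the Henderson-Hasselbalch relation. The pKa values, the pH of 7.4, and the function name `net_charge` are assumptions chosen for illustration; the article does not state which parameters the preprint used, so this sketch is not expected to reproduce its exact figures.

```python
# Approximate textbook pKa values (assumed; not taken from the preprint).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 2.0}


def net_charge(sequence, ph=7.4):
    """Estimate the net charge of a protein sequence at the given pH."""
    seq = sequence.upper()
    charge = 0.0
    # Protonated basic groups each contribute a fractional +1.
    for group, pka in POSITIVE_PKA.items():
        count = 1 if group == "Nterm" else seq.count(group)
        charge += count / (1.0 + 10 ** (ph - pka))
    # Deprotonated acidic groups each contribute a fractional -1.
    for group, pka in NEGATIVE_PKA.items():
        count = 1 if group == "Cterm" else seq.count(group)
        charge -= count / (1.0 + 10 ** (pka - ph))
    return charge


# Toy peptides (not real spike sequences) that differ by a single D -> G change.
reference = "MKDLGDEAR"
variant = "MKGLGDEAR"
shift = net_charge(variant) - net_charge(reference)
print(round(shift, 2))
```

Under these assumptions, replacing one aspartic acid with glycine raises the estimated net charge by close to one unit, consistent with the one-unit step from -8.28 to -7.28 described above.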
Conclusions
The study suggests that, by increasing the positive charge on its S protein, SARS-CoV-2 interacts more readily with the negatively charged mucin matrix of the human upper respiratory tract, which promotes binding to host cells and transmission. By contrast, MERS-CoV has not made similar changes in S charge, which may explain why its human transmission chains end after two to three transmission events.
The study also indicated that the positive charge changes are spread broadly across the S protein sequences of OC43, 229E, and SARS-CoV-2, where they might influence receptor binding, furin cleavage, cell fusion, and antigenicity, and might avoid or promote ionic interactions during human-to-human transmission. However, due to functional constraints, the charge of the SARS-CoV-2 S protein will likely settle at an upper limit. This may be why all Omicron sublineages, after more than six months of evolution, have settled into an S charge range of around -1.
To summarize, the study provides a much-needed framework to monitor CoV evolution through changes in the biochemical properties of their S proteins. This approach could help track other viruses that threaten public health globally. Additionally, the study findings could help anticipate the emergence of new SARS-CoV-2 variants based on shifts toward a more positive S protein charge.