In a recent study published in the International Journal of Molecular Sciences, researchers explored long-range ribonucleic acid (RNA)-RNA interactions (RRIs) in the genomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern (VOCs) to assess evolutionary changes in the virus.
Background
RRIs are essential to the coronavirus (CoV) life cycle. Because SARS-CoV-2 is an RNA virus, detecting RRIs can expand understanding of its evolutionary characteristics and potentially aid in predicting emerging VOCs. Recent in vivo studies evaluating the SARS-CoV-2 RNA structure have identified a few long-span RRIs; however, further investigation is required. Open reading frame 1a (Orf1a) and Orf1b (together, Orf1ab) encode 16 non-structural proteins involved in SARS-CoV-2 replication and transcription and are a source of variation in the SARS-CoV-2 genome.
About the study
In the present study, researchers assessed VOC-specific evolutionary changes in the SARS-CoV-2 genome based on long-span RRI assessments.
SARS-CoV-2 genome sequences (n=32,714) of the Alpha, Beta, Delta, and Omicron VOCs, originating from 92 different nations, were downloaded from the GISAID (Global Initiative on Sharing All Influenza Data) database. Eight long-range RRIs within Orf1a and/or Orf1b were selected for the analysis: four were experimentally validated (Exp) and obtained from the COMRADES experiment, and the remaining four were computational predictions (Comp) generated with the IntaRNA software.
Comp RRIs were analyzed to explore novel long-range RRIs and their associated changes in SARS-CoV-2 evolution. In addition, the frame-shifting element (FSE) was analyzed to assess for feature-specific evolution across VOCs. Mutational patterns within the sequences were assessed, initially within the entire genome and subsequently by VOC. RRI structural changes were evaluated by assessing VOC-specific compensatory mutations, conservation, and co-varying mutations for every RRI.
Computational estimations were performed once again, using SHAPE information to make a more constrained estimation. Genomic intervals were determined, and the top five hits for each genomic interval were recorded, following which the interactions were ranked based on energy residuals. Pairwise inter-VOC distances [the average number of differing bases for GISAID sequences] were calculated as a measure of dissimilarity.
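The pairwise inter-VOC distance described above can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to a common length and averages the Hamming distance over every cross-VOC pair; the function names and the toy sequences are hypothetical and are not taken from the study, whose exact procedure may differ.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_inter_group_distance(group_a, group_b) -> float:
    """Average number of differing bases over all cross-group sequence pairs."""
    pairs = list(product(group_a, group_b))
    if not pairs:
        raise ValueError("both groups must contain at least one sequence")
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


# Toy aligned fragments (hypothetical, not real GISAID data)
alpha = ["ACGUACGU", "ACGUACGA"]
beta = ["ACGUUCGU", "ACGAUCGU"]
distance = mean_inter_group_distance(alpha, beta)
print(distance)
```

A larger value indicates that the two groups of sequences are, on average, more dissimilar.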
In the eight long-span RRIs, no stem-loops or pseudoknots were observed. To improve understanding of how SARS-CoV-2 RRIs are conserved, irrespective of VOC, the average number of mutations per base was compared across all eight RRIs. In addition, the mutations were categorized as compensatory or non-compensatory based on RNA base pair (bp) accommodation.
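One way to picture the compensatory versus non-compensatory split is the sketch below. It assumes that a mutated pair "accommodates" the structure when the two observed bases can still form a Watson-Crick or G-U wobble pair; that criterion, the function names, and the labels are illustrative assumptions rather than the authors' implementation.

```python
# Watson-Crick pairs plus the G-U wobble pair (assumed pairing rule)
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def to_rna(base: str) -> str:
    """Normalise a base to the upper-case RNA alphabet (T is read as U)."""
    return base.upper().replace("T", "U")


def can_pair(x: str, y: str) -> bool:
    """Return True if the two bases can form an allowed pair."""
    return (to_rna(x), to_rna(y)) in PAIRS


def classify_pair_mutation(ref_5p: str, ref_3p: str, obs_5p: str, obs_3p: str) -> str:
    """Label an observed base pair relative to the reference pair."""
    ref = (to_rna(ref_5p), to_rna(ref_3p))
    obs = (to_rna(obs_5p), to_rna(obs_3p))
    if ref == obs:
        return "unchanged"
    return "compensatory" if can_pair(*obs) else "non-compensatory"


print(classify_pair_mutation("G", "C", "A", "U"))  # both sides changed, pairing kept
print(classify_pair_mutation("G", "C", "G", "A"))  # one side changed, pairing lost
```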
To investigate whether the RRIs experienced less or more variation than their surroundings, or merely fell within highly variable regions of the SARS-CoV-2 genome, the per-base variations of the RRIs were compared to those of their neighbouring regions [100 nucleotides (nt) upstream and downstream]. Further, a co-variation analysis was performed, and the sequences were grouped by their corresponding VOCs to assess VOC-specific and VOC group-specific trends. To investigate whether the RRIs were evolving in a VOC-specific manner, the VOC sequences were separated and further analyzed using R-scape.
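The comparison with neighbouring regions can be sketched as follows. The example assumes a list of per-position mutation counts and treats one arm of an RRI as a single half-open interval; the 100 nt flank size follows the text, while the function name and the toy numbers are hypothetical.

```python
def region_vs_flanks(per_base, start, end, flank=100):
    """Return (mean variation inside [start, end), mean variation in the flanks).

    per_base: mutation counts per genome position (0-based list).
    flank:    number of positions taken upstream and downstream of the region.
    """
    region = per_base[start:end]
    upstream = per_base[max(0, start - flank):start]
    downstream = per_base[end:end + flank]
    flanks = upstream + downstream
    if not region or not flanks:
        raise ValueError("region and flanks must both contain positions")
    return sum(region) / len(region), sum(flanks) / len(flanks)


# Toy profile (hypothetical): a 50 nt region that varies less than its flanks
profile = [2] * 100 + [1] * 50 + [2] * 100
region_mean, flank_mean = region_vs_flanks(profile, 100, 150)
print(region_mean, flank_mean)
```

A region mean below the flank mean would suggest that the interval is more conserved than its surroundings, and a higher mean would suggest the opposite.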
Results
Heterogeneous mutational rates and evolutionary patterns were observed across RRI sites, indicative of differing rates of evolution and constraints on differing SARS-CoV-2 RNA sites. Exp1 and Exp4 demonstrated a relatively high amount of conservation, whereas Comp1 and Exp2 contained several compensatory mutations.
However, statistical R-scape analyses did not show significant co-variations for any RRI, and there was no significance for the top five hits using the SHAPE constraint. No evidence was found for the co-evolution of the two regions of an RRI within the sequences. Omicron, the most recent VOC, showed the greatest sequence divergence and the highest mutational rate (2.5 × 10⁻⁶ per base per day), and Delta showed the lowest mutational rate (1.96 × 10⁻⁶).
The Alpha and Beta VOCs were closer to each other than to the other VOCs. Comp4 had the highest number of structural mutations (double that of its neighbouring regions) but the lowest percentage (~12%) of non-structural mutations, owing to an “identifying mutation” (Q57H) in orf3a of the Beta VOC, whereas Exp1 had the fewest mutations.
VOC-specific evolutionary changes were observed for a few long-range RRIs, with evidence for the existence of Comp1 in the Beta VOC sequences. Comp2 was found to be the top hit without the SHAPE constraint. The mutational rate of the FSE was comparable to that of the eight long-span RRIs. Unexpectedly, the FSE showed more variations per base than its neighbouring regions, likely due to its super-structure. In the Beta sequences, the Exp4 interval (i.e., Beta Exp4 and Beta Comp1) showed significantly covarying base pairs, despite low statistical power.
Certain RRIs differed between VOCs. For example, Comp1 had lower variation than its neighbouring regions for all VOCs except Delta, and Exp1 had lower conservation in Delta and Omicron but not in Alpha or Beta. Additionally, Comp4 was much less conserved in Omicron than in the other VOCs. In Comp4, only 3% of Beta mutations accommodated the structure, whereas 90% of Alpha mutations did.
Conclusion
Overall, the study findings showed that long-range RRIs might undergo different evolutionary pressures in different SARS-CoV-2 VOCs and that Comp1 might be a new SARS-CoV-2 long-span RRI.