Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is subject to the host’s deamination systems, which include cytidine-to-uridine (C-to-U) deamination by apolipoprotein B messenger ribonucleic acid (mRNA) editing enzyme, catalytic polypeptide-like (APOBEC) enzymes and adenosine-to-inosine (A-to-I) deamination by adenosine deaminases acting on ribonucleic acid (RNA) (ADARs).
Study: Fast evolution of SARS-CoV-2 driven by deamination systems in hosts.
Background
Since inosine (I) is read as guanosine (G), both C-to-U and A-to-I deamination can lead to nucleotide substitutions. Furthermore, these deaminations have been found to introduce ambiguity and uncertainty into evolutionary studies, since they are practically indistinguishable. Understanding these host-driven mutations therefore helps to clarify the co-evolution of host and parasite, as well as to control the pandemic.
Among the polymorphic sites in the SARS-CoV-2 genome, the number of C-to-U substitution sites is three times higher than the number of A-to-I substitution sites. In fact, the ‘C–U/A–I’ ratio is extraordinarily high in SARS-CoV-2 compared with other organisms that undergo extensive RNA deamination.
A comparison of deamination events in the coding sequences (CDS) of SARS-CoV-2 with those in the CDS of other organisms reveals that in most animals, A–I sites are more frequent than C–U sites, whereas the opposite is true in SARS-CoV-2. In SARS-CoV-2, a 60-fold higher representation of C-to-U deamination is observed as compared to A-to-I deamination.
The occurrence of deamination events is known to be determined by the presence of cis-elements and trans factors. Upon infection with SARS-CoV-2, both the host mRNA and the viral RNA are exposed to the same trans environment. Therefore, the reason why the deamination events differ so much between host and virus must lie in the cis-elements.
A new study published in the journal Future Virology proposes that RNA structure and genome composition may be an important basis for the different deamination spectra observed between SARS-CoV-2 and animals.
Role of RNA structure in promoting C-to-U deamination in SARS-CoV-2
Genome architecture has been found to vary significantly from species to species. In most animals, the GC content is higher than the AT content. In SARS-CoV-2, however, the opposite is observed: the AT content is higher than the GC content.
Since GC base pairs are biochemically more stable than AT base pairs, RNAs with higher GC content are more likely to fold into a stable secondary structure and form double-stranded RNA (dsRNA). Therefore, mRNAs in animals are more likely to form dsRNA, whereas the viral RNA tends to remain single-stranded RNA (ssRNA).
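The GC-versus-AT comparison described above can be made concrete with a short calculation. The following is a minimal Python sketch, not the method used in the study; the function name `gc_content` and the two toy fragments are hypothetical and are not real genomic sequences.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases among the A/C/G/T/U bases of a sequence."""
    # Ignore gaps, ambiguity codes, and any other non-nucleotide characters.
    counted = [base for base in seq.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T/U bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Hypothetical toy fragments used only for illustration.
at_rich = "AUUAAAGGUUUAUACCUUCC"
gc_rich = "GCCGCGGGCACCAUGGCGCC"

print(f"AT-rich fragment GC content: {gc_content(at_rich):.2f}")
print(f"GC-rich fragment GC content: {gc_content(gc_rich):.2f}")
```

A lower value indicates an AT-rich sequence, which, following the reasoning above, would be expected to form less stable secondary structure.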
RNA deaminases are known to prefer particular RNA structures. For example, ADAR has a high affinity for stable dsRNA, while cytidine deaminases such as APOBEC tend to bind to ssRNA.
Therefore, in a cell infected by SARS-CoV-2, the endogenous mRNAs are targeted by ADAR, while the viral RNAs are targeted by APOBEC. Thus, although the host mRNA and viral RNA exist in the same environment, they undergo different fates because of their different sequence features, which in turn makes C-to-U deamination predominant in the SARS-CoV-2 genome.
The fast evolution of the SARS-CoV-2 genome may be due to higher deamination
It has been observed that the percentage of deamination in the human genome is much lower than in the SARS-CoV-2 genome. However, for both humans and SARS-CoV-2, the deamination sites in the RNA sequence data are identified by searching against the reference genome sequence.
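Searching sequence data against a reference can be illustrated with a simple mismatch count. The sketch below is a hedged illustration, not the pipeline used in the study: it assumes two already aligned, equal-length sequences, treats an A-to-G mismatch as a putative A-to-I site (since inosine is read as G), and ignores strand, sequencing errors, and alignment. The function name and example sequences are hypothetical.

```python
from collections import Counter


def count_deamination_like_sites(reference: str, sample: str) -> Counter:
    """Count C-to-U and A-to-G (putative A-to-I) mismatches between aligned sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    # Treat T and U as the same base so DNA-style and RNA-style input both work.
    ref = reference.upper().replace("T", "U")
    alt = sample.upper().replace("T", "U")
    counts = Counter()
    for r, s in zip(ref, alt):
        # Skip gaps and ambiguity codes, and positions without a mismatch.
        if r not in "ACGU" or s not in "ACGU" or r == s:
            continue
        if r == "C" and s == "U":
            counts["C-to-U"] += 1
        elif r == "A" and s == "G":
            counts["A-to-I"] += 1  # inosine is read as G
        else:
            counts["other"] += 1
    return counts


# Hypothetical aligned toy sequences used only for illustration.
reference = "ACCAGCUA"
sample = "AUCGGUUC"
counts = count_deamination_like_sites(reference, sample)
print(dict(counts))
```

Dividing the `C-to-U` count by the `A-to-I` count gives a ratio of the kind discussed above, provided the A-to-I count is not zero.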
Since human RNAs are not heritable, a site that is deaminated in one generation might not be deaminated in the next. Therefore, human RNA deamination sites cannot accumulate generation after generation.
However, the RNAs of SARS-CoV-2 are heritable. The deaminated RNA of one generation is used as the template for the next generation, so deaminations can rapidly accumulate. As a result, viral strains isolated from different patients often carry different mutations, owing to the random nature of deamination exerted by the host.
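The contrast between heritable and non-heritable RNA can be sketched as a toy simulation. This is an illustrative model under stated assumptions, not an analysis from the study: each C is converted to U independently with a fixed, hypothetical probability per generation, there is no selection, and the template sequence, rate, and function names are invented for the example.

```python
import random
from typing import List


def deaminate(seq: str, rate: float, rng: random.Random) -> str:
    """Convert each C to U independently with probability `rate`."""
    return "".join("U" if base == "C" and rng.random() < rate else base for base in seq)


def simulate(template: str, generations: int, rate: float,
             heritable: bool, seed: int = 0) -> List[int]:
    """Return the number of sites differing from the original template after each generation."""
    rng = random.Random(seed)
    current = template
    history = []
    for _ in range(generations):
        # Heritable RNA copies the previous, already deaminated molecule;
        # non-heritable RNA is transcribed afresh from the unchanged template.
        source = current if heritable else template
        current = deaminate(source, rate, rng)
        history.append(sum(1 for a, b in zip(template, current) if a != b))
    return history


# Hypothetical template and rate, chosen only to make the trend visible.
template = "ACCGCUCCAC" * 20
viral_like = simulate(template, generations=10, rate=0.05, heritable=True)
host_like = simulate(template, generations=10, rate=0.05, heritable=False)
print("heritable:    ", viral_like)
print("non-heritable:", host_like)
```

In the heritable case the count can only grow, because each generation inherits the substitutions of the previous one; in the non-heritable case each generation starts again from the unmodified template.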
Impact of natural selection on SARS-CoV-2 evolution
The A-to-I and C-to-U substitutions rapidly accumulate in the viral genome depending upon its replication strategy. The single-stranded AT-rich regions are usually targeted by APOBEC and serve as hotspots for C-to-U deamination. This deamination leads to higher AT content since C is converted into U.
In the next generation, the AT content of this local RNA region will be increased further. In principle, this loop would continue until all of the cytidines are converted into uridines.
However, in reality, the evolution of SARS-CoV-2 is not so fast. Evolution is a combination of mutations and natural selection. While mutation occurs randomly, natural selection eliminates all deleterious mutations.
Under certain circumstances, a viral sequence with a C-to-U substitution might be less adaptive than the sequence before mutation. Therefore, faster mutation does not always correspond to faster evolution.
Future perspective
Although natural selection slows down the nucleotide substitution rate in viruses, host-driven deamination gives SARS-CoV-2 many chances to find the sequence that is most suitable for invading human cells. If one viral sequence has higher virulence, it is favored by natural selection.
Once the virus enters the host, cellular RNA deamination cannot be stopped. This results in fast mutation of the viral RNA. One effective way of alleviating this mutation is to block transmission of the virus.
If the viral RNAs cannot access the host cell, the mutation rate is significantly reduced. In conclusion, even though many vaccines have now been developed against SARS-CoV-2, preventing virus transmission remains one of the most powerful approaches to controlling the pandemic.