A recent study published in the journal PeerJ identified unique truncated open reading frame 8 (ORF8) proteins among severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences.
Various studies have reported that ORF8 increases the odds of coronavirus disease 2019 (COVID-19) infection by helping SARS-CoV-2 evade the immune response and by facilitating viral replication. However, the effect of ORF8 truncation on human immunity remains unclear.
About the study
The present study aimed to identify and characterize the unique variations found in truncated ORF8 proteins (T-ORF8) produced as a result of the Q27STOP mutation.
The team obtained 49,055 complete T-ORF8 protein sequences sourced from Africa, Asia, Europe, North America, and South America. They then extracted the unique T-ORF8 sequences found in each continent, defining a sequence as unique if its amino acid arrangement differed from that of every other sequence.
Every amino acid in the T-ORF8 sequences was classified as either polar (Q) or non-polar (P), turning every unique T-ORF8 into a sequence of P and Q symbols. The team then determined the homology of these sequences and associated each with its phylogenetically nearest T-ORF8 variant.
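The polar/non-polar encoding described above can be sketched in a few lines of Python. This is only an illustration, not the study's code: the article does not list which residues were counted as non-polar, so the `NONPOLAR` set below is an assumption (textbooks differ on borderline residues such as cysteine, glycine, and tyrosine), and the function name `to_polarity_string` is hypothetical.

```python
# Assumed classification: these nine residues are treated as non-polar (P);
# every other standard residue is treated as polar (Q). The study may have
# used a different grouping.
NONPOLAR = set("AVLIMFWPG")
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def to_polarity_string(sequence):
    """Convert a one-letter amino acid sequence into a string of P/Q symbols."""
    symbols = []
    for residue in sequence.upper():
        if residue not in STANDARD:
            raise ValueError(f"unexpected residue: {residue!r}")
        symbols.append("P" if residue in NONPOLAR else "Q")
    return "".join(symbols)


if __name__ == "__main__":
    # Short illustrative peptide, not a full T-ORF8 sequence.
    print(to_polarity_string("MKFLV"))
```

Two variants with different amino acids can map to the same P/Q string, which is why the number of unique polarity sequences can be smaller than the number of unique protein variants.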
The frequency of every amino acid in each T-ORF8 sequence was examined using a standard bioinformatics routine. The distance matrix calculated in this routine was used to establish a phylogenetic relationship between each T-ORF8 and its nearest neighbor. The team also prepared phylogenetic data using deep phylogeny analyses and calculated phylogeny estimations and evolutionary rate differences among the unique T-ORF8 sequences.
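As a rough illustration of the frequency-based step, the sketch below builds an amino acid frequency vector for each sequence, computes pairwise distances, and picks the nearest neighbor. The article does not name the routine or the distance metric, so the use of Euclidean distance here is an assumption, and all function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequency_vector(sequence):
    """Relative frequency of each of the 20 standard amino acids."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    return [counts[aa] / len(sequence) for aa in AMINO_ACIDS]


def distance_matrix(sequences):
    """Pairwise Euclidean distances between frequency vectors (assumed metric)."""
    vectors = {name: frequency_vector(seq) for name, seq in sequences.items()}
    return {
        a: {b: math.dist(vectors[a], vectors[b]) for b in vectors}
        for a in vectors
    }


def nearest_neighbor(distances, name):
    """Name of the sequence closest to `name`, excluding itself."""
    others = [other for other in distances[name] if other != name]
    return min(others, key=lambda other: distances[name][other])


if __name__ == "__main__":
    # Toy sequences for illustration only, not data from the study.
    toy = {"v1": "MKFL", "v2": "MKFV", "v3": "QQQQ"}
    matrix = distance_matrix(toy)
    print(nearest_neighbor(matrix, "v1"))
```

A composition-based distance ignores residue order, so it is a coarse measure; alignment-based distances are the usual choice when building a phylogeny.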
The team further estimated the degree of conservation of amino acids in the T-ORF8 proteins using Shannon entropy (SE) and characterized the physicochemical and molecular properties of the proteins.
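Shannon entropy measures how variable a position is: a fully conserved position scores 0, and the score rises as more residues share that position evenly. A minimal sketch follows, assuming the sequences are already aligned to equal length and that entropy is reported in bits (log base 2); neither detail is stated in the article, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy, in bits, of the residues observed at one position."""
    if not column:
        raise ValueError("column must not be empty")
    total = len(column)
    counts = Counter(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def per_position_entropy(aligned_sequences):
    """Entropy at each position of equal-length (aligned) sequences."""
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [shannon_entropy(column) for column in zip(*aligned_sequences)]


if __name__ == "__main__":
    # Toy alignment for illustration only, not data from the study.
    for position, value in enumerate(per_position_entropy(["MKF", "MKL", "MRF", "MRL"]), 1):
        print(position, round(value, 3))
```

In this toy alignment the first position is identical in every sequence, so it is fully conserved, while the other two positions are split evenly between two residues.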
Results
The study identified 47 unique T-ORF8 protein sequences among the sequences obtained from the five continents. Truncation at positions 23, 25, 27, 41, and 42 of the ORF8 protein generated the T-ORF8 proteins. The study noted that the amino acids glutamine (Gln) and cysteine (Cys), present at four positions, were truncated due to mutational changes, while valine (Val) was truncated due to three mutations.
The ORF8 protein sequence P15 was observed in North American SARS-CoV-2 B.1.1.7 lineage samples. The P15 variant containing the Q27STOP mutation was also found in the B.1.1.7 lineage with frequencies of 108, 99, 156, and 1 in the samples obtained from Africa, Asia, Europe, and South America, respectively. The study also noted that the T-ORF8 P15 variant was unique and prevalent in B.1.1.7 lineages across North America, among other continents.
Alignment of the amino acid sequences and phylogenetic analysis of the unique T-ORF8 variants showed that all these variants shared similar amino acids like methionine (Met), serine (Ser), lysine (Lys), leucine (Leu), and Gln. Also, the P15 variant was phylogenetically closest to the P13 and P14 variants.
Deep phylogenetic analysis of the first nucleotide dataset of the T-ORF8 sequences showed that 48 nucleotides and 83 amino acid positions were the most similar unique sequences tested. Also, nine T-ORF8 variants were indicated to be related to specific SARS-CoV-2 variants or a particular variant subtype, like the SARS-CoV-2 Alpha variant. Furthermore, the second nucleotide dataset had 47 sequences that represented different SARS-CoV-2 ORF8 variants.
Analysis of the per-residue intrinsic disorder predisposition across the 47 T-ORF8 sequences showed that the amino (N)- and carboxy (C)-terminal regions of the proteins have higher levels of intrinsic disorder than the central regions. The disorder profiles were similar across the T-ORF8 proteins, except for the P1 and P36 variants, whose N-terminal regions exhibited high levels of intrinsic disorder, and P45, whose N-terminal region had the least disorder. In the C-terminal region, P25 had the longest stretch of disorder, P18 and P19 had long stretches of disorder, and P12 had the lowest disorder level.
The study also found a balanced number of Q and P amino acid residues in the unique T-ORF8 variants. Among the T-ORF8 variants having 26 amino acid residues, 14 positions remained invariant. Also, only 17 T-ORF8 variants had unique polar/non-polar sequences except the P4, P5, P15, and P28 variants.
Conclusion
The study findings showed that the inhibition of the SARS-CoV-2 ORF8 protein can not only boost antiviral immune response but also improve the eradication of SARS-CoV-2 in vivo.
The researchers believe that further studies are essential to investigate the long-term clinical effects of T-ORF8 and the potential development of an anti-ORF8 therapy to treat COVID-19.