In a recent study published in Scientific Reports, researchers identified hepatitis C virus (HCV) and human immunodeficiency virus (HIV) signatures in blood by repurposing existing ribonucleic acid sequencing (RNA-seq) data from former and current smokers with or without chronic obstructive pulmonary disease (COPD).
Background
Viral detection by RNA sequencing analysis has increased the knowledge base of viruses causing human infections. Identifying undiagnosed viral infections by using existing nucleic acid sequencing data could facilitate epidemiological survey-based analysis and aid in the development of diagnostic and therapeutic options for improved population health.
About the study
In the present study, researchers detected HCV and HIV signatures in peripheral blood using repurposed RNA-seq data from smokers with or without COPD.
The team evaluated the associations of viral detection, and of the corresponding infections, with disease outcomes. The analysis was performed using reads that did not map to the human genome. Interferon (IFN) scores were calculated from the RNA-seq data of the Genetic Epidemiology of COPD (COPDGene) study to assess the association between viral detection and host responses.
The COPDGene study comprises more than 10,000 non-Hispanic White and African American individuals who are former or current smokers, recruited at 21 centers in the United States (US) and followed up for five years. Follow-up assessments included spirometry, questionnaires, chest computed tomography (CT) scans, complete blood count (CBC) analysis, and RNA-seq analysis.
The Genome Analysis Toolkit 4 (GATK4) pipeline was used for viral detection. The reference human virome comprised 12,148 genomes from the National Center for Biotechnology Information (NCBI) database, and NCBI taxonomy was used for the viral genome data. Transcriptomic signatures of host responses were identified using gene set variation analysis (GSVA), based on composite IFN scores for the IFN-α and IFN-β pathways. IFI27 gene expression was analyzed for HIV infections.
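The idea of a composite IFN score can be illustrated with a deliberately simplified stand-in. The sketch below is not GSVA (which is a rank-based enrichment method) and is not the study's code: it simply averages per-gene z-scores across a gene set. The gene list, the function names, and the toy expression values are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

# Hypothetical gene set: well-known interferon-stimulated genes, picked only
# for illustration. The study's actual IFN-alpha/IFN-beta gene sets may differ.
IFN_GENES = ["IFI27", "IFI44L", "ISG15", "RSAD2"]


def zscores(values):
    """Standardize one gene's expression values across samples."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A gene with no variation carries no information about sample differences.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def composite_score(expr, genes):
    """Average the z-scores of the listed genes for every sample.

    expr maps a gene name to a list of expression values, one per sample,
    with samples in the same order for every gene.
    """
    present = [g for g in genes if g in expr]
    if not present:
        raise ValueError("none of the requested genes are in the expression table")
    per_gene = [zscores(expr[g]) for g in present]
    n_samples = len(per_gene[0])
    return [mean(z[i] for z in per_gene) for i in range(n_samples)]


# Toy data (made up): three samples, the third with elevated IFN-stimulated genes.
toy_expr = {
    "IFI27": [1.0, 2.0, 9.0],
    "ISG15": [2.0, 2.5, 6.0],
    "ACTB": [10.0, 10.1, 9.9],  # not in the gene set, so it is ignored
}
scores = composite_score(toy_expr, IFN_GENES)
```

In this toy table the third sample receives the highest score, which mirrors the pattern the article describes for individuals with detectable viral RNA. A real analysis would use a dedicated enrichment tool rather than this averaging shortcut.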
Results
Blood RNA-seq data were obtained from 3,984 samples, including 1,601 samples from COPD patients and 1,609 from controls; 25.0% of the individuals were African American, and 33.0% were current smokers. Viral RNA was detected in blood RNA-seq data from former or current smokers with COPD.
IFN scores were higher among individuals with detectable HCV or HIV RNA, and viral detection was associated with reported infections, outcomes, and host responses, supporting the validity of the method. HCV RNA and HIV RNA were detected in 228 and 30 individuals, respectively. In total, 31 viruses, including cytomegalovirus (CMV) and Epstein-Barr virus (EBV), were detected with at least one read mapped to the reference virome in a minimum of two individuals.
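The reporting threshold described above (at least one mapped viral read in at least two individuals) can be sketched as a simple filter. This is a minimal illustration under assumed inputs, not the study's pipeline: the function name, the nested-dictionary layout, and the toy read counts are hypothetical.

```python
from collections import defaultdict


def viruses_passing_filter(read_counts, min_reads=1, min_individuals=2):
    """Return viruses with >= min_reads mapped reads in >= min_individuals samples.

    read_counts maps a sample ID to a dict of {virus name: mapped read count}.
    """
    carriers = defaultdict(set)
    for sample_id, per_virus in read_counts.items():
        for virus, n_reads in per_virus.items():
            if n_reads >= min_reads:
                carriers[virus].add(sample_id)
    return sorted(v for v, samples in carriers.items() if len(samples) >= min_individuals)


# Toy counts (made up): HCV and EBV each appear in two samples,
# CMV has zero reads, and HIV appears in only one sample.
toy_counts = {
    "S1": {"HCV": 12, "EBV": 1},
    "S2": {"HCV": 3, "CMV": 0},
    "S3": {"EBV": 2, "HIV": 5},
}
passing = viruses_passing_filter(toy_counts)
```

With these toy counts only EBV and HCV pass. Requiring a second individual is a basic guard against one-off spurious alignments, although it does not by itself rule out contamination shared across samples.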
Of the 228 individuals in whom HCV RNA was detected, 77 reported hepatic illness (out of 189 individuals with hepatic disease in total), indicating significant viral enrichment and suggesting that the method was highly specific. However, HCV RNA was not detected in the remaining 112 individuals with self-reported hepatic illness, indicating that the method had low sensitivity. HCV RNA levels could be lower among treated individuals.
Individuals with detectable HCV RNA were more likely to be younger, African American, male, and smokers, with fewer comorbid conditions. Among individuals with detectable HCV RNA, IFN scores were greater, with significantly lower scores for documented HCV downregulation signatures and greater scores for HCV upregulation signatures.
Among individuals with RNA-seq data, 105 self-reported HIV infection in the COPDGene questionnaire. Of the 30 individuals in whom HIV RNA was detected, 22 were among the 105 with self-reported HIV infection, indicating significant viral enrichment. HIV RNA was not detected in the other 83 individuals who reported HIV infection, indicating low sensitivity of the method, which could also be due to the use of anti-HIV therapeutics.
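The overlap figures reported above for HCV and HIV follow from simple arithmetic, which the sketch below reproduces from the counts given in the article. The function and key names are hypothetical, and self-report is treated as the comparison group purely for illustration: for HCV the article compares RNA detection with self-reported hepatic illness, not with a confirmed HCV diagnosis.

```python
def overlap_summary(n_rna_positive, n_self_reported, n_both):
    """Summarize agreement between RNA detection and self-report.

    n_rna_positive: individuals with viral RNA detected
    n_self_reported: individuals who self-reported the condition
    n_both: individuals in both groups
    """
    if n_both > min(n_rna_positive, n_self_reported):
        raise ValueError("overlap cannot exceed either group size")
    return {
        "self_reported_without_rna": n_self_reported - n_both,
        "rna_positive_without_report": n_rna_positive - n_both,
        "fraction_of_self_reported_detected": n_both / n_self_reported,
        "fraction_of_rna_positive_reported": n_both / n_rna_positive,
    }


# Counts taken from the article text.
hcv = overlap_summary(n_rna_positive=228, n_self_reported=189, n_both=77)
hiv = overlap_summary(n_rna_positive=30, n_self_reported=105, n_both=22)

for name, summary in (("HCV", hcv), ("HIV", hiv)):
    print(name, {k: round(v, 3) for k, v in summary.items()})
```

The low fraction of self-reported cases with detectable RNA (77 of 189 for hepatic illness, 22 of 105 for HIV) is what the article describes as low sensitivity, while the 112 and 83 undetected individuals match the differences reported in the text.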
Among 66 individuals who reported anti-HIV medication use, 65 had self-reported HIV infection, and HIV RNA was not detected in the one remaining individual on medication. Individuals with detectable HIV RNA tended to be younger African American individuals who were current smokers. Greater IFN scores, significantly lower HIV downregulation scores, and greater IFI27 expression were observed among individuals with detectable HIV RNA, indicating that HIV RNA detection correlated with HIV infection.
Concerning temporal associations, 15 of the 22 individuals with detectable HIV RNA first reported HIV infection after five years of follow-up. HCV RNA was detectable in 20 of the 105 individuals with self-reported HIV infection and in seven (out of 39) individuals with detectable HIV RNA. The elevated rate of HCV and HIV co-occurrence was consistent with the presence of factors that increase the risk of both infections.
Overall, the study findings highlight that peripheral blood RNA-seq data can be reused to detect viral infections and the transcriptomic signatures of host responses to HCV and HIV infections.