In this interview, News-Medical speaks to Professor Dana Crawford about her research efforts during the COVID-19 pandemic and why in-depth sequencing of SARS-CoV-2 is crucial in controlling outbreaks.
Please could you introduce yourself and tell us what provoked your research into the coronavirus disease 2019 (COVID-19) pandemic?
I am a Professor of Population and Quantitative Health Sciences and Associate Director of the Cleveland Institute for Computational Biology at Case Western Reserve University in Cleveland, Ohio. I also hold a secondary appointment in Genetics and Genome Sciences.
I am trained in human genetics and genomics and am a genetic epidemiologist. Like many in our field, I was prompted by early reports of the differences in SARS-CoV-2 infection rates and COVID-19 disease course among those infected to ask what role host-pathogen genetics and genomics play in host susceptibility and disease severity.
Now more than a year into the pandemic, we know that non-genetic factors such as socioeconomic status have a substantial impact on susceptibility when, for example, lower-income workers cannot work remotely or work socially distanced as recommended in the US. The severity of COVID-19 infection, including hospitalization and death, is associated with several demographic variables, most notably older age. While many non-genetic factors are now associated with infection rates and COVID-19 severity, the observed variability cannot be completely explained or predicted by these variables alone.
In human genetics and genomics, the genome-wide association study (GWAS) is the workhorse study design used to identify common genetic changes associated with complex human outcomes like SARS-CoV-2 infection and COVID-19 severity. Early GWAS findings published by investigators based on samples from the early 2020 outbreak in Italy and Spain demonstrated significant associations between genetic changes on chromosome 3 and COVID-19 infection. The ABO blood group has also been associated, and both these findings join a rapidly growing list of genetic associations recently described by the ongoing COVID-19 Host Genetics Initiative GWAS meta-analysis, which now includes human genetic samples from multiple countries.
I am one of many investigators involved in the Million Veteran Program (MVP)’s GWAS for SARS-CoV-2 susceptibility and COVID-19 severity. Collectively, our field is interested in identifying common and rare genetic changes in the human genome associated with the response to this emerging infectious disease to better understand the biology of infection that may lead to novel therapeutics or prevention strategies.
A complete understanding of the pandemic requires both human and viral data. As already described, several studies are ongoing to describe how human genetic variation impacts responses to infection. On the viral side, most efforts have focused on genomic surveillance to identify and track variations in SARS-CoV-2 sequences that may be of interest or concern. At the time of writing this, the Delta variant is a more infectious version of the virus compared with its predecessors, and it is making its way across the world among the unvaccinated.
As the virus continues to circulate and replicate, it mutates. Most mutations are neutral, whereas others, like those that define the Delta variant, increase the fitness of the virus relative to other versions. SARS-CoV-2 sequencing for genomic surveillance helps public health investigators track how the virus, barcoded with each of its mutations, is spreading locally. Genomic surveillance also helps sound the alarm when a viral variant demonstrates differences in infection, severity, or, worse, evasion of the immune system even after the body is primed by vaccines.
While my own research has, to date, been limited to human genetic variation and the host reaction to SARS-CoV-2, I am interested in pathogen variability and host-pathogen interactions that may impact outcomes of interest. This data is very difficult to access in sufficient numbers for meaningful genetic and genomic studies. As we write in the PLoS Genetics viewpoint, variability in SARS-CoV-2 sequencing coupled with the lack of clinical data linkages to this sequencing data makes these studies nearly impossible.
SARS-CoV-2. Image Credit: Kateryna Kon/Shutterstock.com
How are SARS-CoV-2 outbreaks currently monitored?
COVID-19 outbreaks in the United States are, in part, currently monitored much like traditional infectious disease outbreaks from the early 20th century. That is, symptomatic individuals present at a clinic or hospital where, if their signs and symptoms are consistent with COVID-19, they are tested for SARS-CoV-2.
There are now a variety of tests available for SARS-CoV-2. Cleveland-area hospitals use the polymerase chain reaction (PCR) test, and the test is performed on biospecimens collected by nasal swabs. While the PCR test gives information as to whether the SARS-CoV-2 virus is present in the sample or not, it does not give the full sequence of the virus. To do this, the sample must be sequenced.
In the United States, whether or not a sample is sequenced depends on a variety of factors. As we describe in the PLoS Genetics viewpoint, funding and established relationships with genomics facilities can largely determine whether samples are sequenced.
The traditional infectious disease outbreak response relies on symptomatic individuals presenting at clinics. We know now that many SARS-CoV-2 infected individuals can be asymptomatic as well as infectious, silently spreading the virus to others. Infected individuals may also present with mild symptoms that may not require clinical attention.
While individuals are encouraged to be tested for the virus if they know they have come into contact with infected individuals, this recommendation is not enforceable, and many individuals do not know they have been exposed as contact tracing in the United States has also been difficult to enforce (and modernize).
The large numbers of asymptomatic, but infectious, individuals and imperfect contact tracing make genomic surveillance ideal for SARS-CoV-2 tracking. Between spring 2020 and spring 2021, many large public and private groups in the United States established routine testing as a surveillance program, asking university students or employees, for example, to submit weekly nasal swabs or saliva samples regardless of symptoms. These efforts provided essential data for tracking the emergence of outbreaks or increases in case numbers which could then be used to justify and enact certain public health responses (e.g., closures) until the case numbers decreased.
Unfortunately, these biospecimens were not necessarily sequenced, again depending on the resources and the perceived importance of tracking variants as part of the surveillance program. Many of these surveillance programs have now been scaled back or discontinued with the decrease in US case counts and increase in vaccinations.
As of writing this, even with less routine testing, data demonstrates that case counts are on the rise again due to the Delta variant and stagnant vaccination rates. With the onset of colder weather and the fall school season, we might expect a resurgence in interest in routine SARS-CoV-2 surveillance in the US.
What have traditional surveillance methods focused on in controlling disease outbreaks and what are some of the limitations of these methods?
As I mentioned above, the traditional surveillance method for emerging infectious diseases relies on sick individuals presenting at a clinic. Now that the virus has been identified and extensively characterized, regular sampling and testing at the population level is a better approach as the data can be used to predict local and national infection trajectories that inform expected hospitalizations and deaths.
The SARS-CoV-2 PCR-based methods rely on a technique first described in the mid-1980s that has since been developed and adapted for high-throughput population-scale research and diagnostics. There are several limitations associated with the PCR-based method of virus detection, including contamination (as demonstrated early by the Centers for Disease Control and Prevention’s batch of contaminated testing kits) and variable lower limits of detection (i.e., the minimum number of viral copies in the sample required for test detection).
A major limitation for human genetics and genomics research is that the PCR-based test does not provide the sequence. Without this sequence, we will not be able to conduct host-pathogen studies effectively. From a public health standpoint, the sequence is important to identify mutations or variants of concern should the mutations make the virus more infectious or give it an ability to evade the immune system more efficiently.
There have been huge advances in genome sequencing techniques over the past ten years. How could genome sequencing be used to help us to monitor coronavirus outbreaks?
Genome sequencing is essential in this phase of the pandemic. We have large unvaccinated populations in the US and worldwide where the virus is circulating and replicating. Each round of replication is a chance for mutation.
While most mutations are neutral changes, a fraction will be beneficial to the virus and to the detriment of humans. PCR-based methods are based on known viral sequences whereas sequencing is independent of what is already known, making sequencing the ideal method for complete genomic surveillance of this evolving virus.
SARS-CoV-2 Sequencing. Image Credit: vchal/Shutterstock.com
What are the advantages of utilizing genome sequencing to monitor disease outbreaks and their evolution?
A major advantage of sequencing using today’s technology is that it is a rapid, scalable, and relatively cost-effective approach to surveillance. For those of us interested in host-pathogen research, individual-level data is preferred. That is, as a researcher I would want the infected person genotyped genome-wide or sequenced, and the genome of the virus that infected that same person sequenced as well. This granular data affords researchers more options for analysis that will be important in designing the next generation of diagnostics and therapeutics, including new vaccines.
Instead of individual-level sequencing, another great option is bulk sequencing. Many communities have taken up sequencing sewage for SARS-CoV-2 to detect and predict outbreaks in their local areas.
You described the adoption of genomics in surveilling SARS-CoV-2 as ‘slow, difficult and inconsistent’. Why is this and what more can be done?
As we outline in the PLoS Genetics viewpoint, there are several factors likely to be associated with the slow and inconsistent response. Unlike the United Kingdom, a leader in SARS-CoV-2 genomic surveillance beginning early in the pandemic, the major public health agency in the US does not have a long, established relationship with the human genetics and genomics research community.
Also, public health in the US is underfunded. The confluence of these two factors alone left the US unprepared and unable to effectively respond to this new infectious disease associated with substantial asymptomatic airborne spread. These, however, are not the only factors related to the US’s poor response to the pandemic, particularly in the early phases.
The US government and its administration at the time were not supportive of a strong and unified approach, leaving each state and each community struggling to cope. This decentralized approach really highlighted the differences in resources available at the local level, explaining in part why the Seattle area was able to pivot much more quickly to genomic surveillance compared with other locales with less genomic expertise and experience.
The attention now given to the importance of viral variants has prompted increases in funding for genomic surveillance in the US. Worldwide, the challenge now is to enable the same genomic surveillance opportunities in countries or areas without the resources or expertise.
Do you believe that with improved use of genomic sequencing, SARS-CoV-2 outbreaks can be monitored more effectively, which in turn will help us to further understand the real-time evolution of SARS-CoV-2?
Yes, improved use of genomic sequencing and public sharing of this data will improve the monitoring of SARS-CoV-2 outbreaks, providing a better understanding of how the virus is evolving. A more complete understanding of the virus’s impact on the human host, however, will require data linkages between genomic data and clinical and epidemiologic data.
In the US, this data is siloed and not easily linked for research purposes. However, this data is essential to better understand the consequences of SARS-CoV-2 evolution and its influence on COVID-19 disease severity.
Disease Outbreak. Image Credit: ETAJOE/Shutterstock.com
How do you think future disease outbreaks need to be surveilled? How can collaboration efforts be utilized to achieve this?
The emergence of SARS-CoV-2/COVID-19 underscores the need to augment traditional surveillance based on individual clinical presentation with a more proactive approach leveraging genomic surveillance at the population scale.
The current sewage surveillance programs are an excellent example of efficient, population-scale genomic surveillance for infectious diseases, and these efforts can be scaled and mined bioinformatically to detect both known and unknown pathogens. Such efforts, however, will require consistent and constant support, not just support and interest at the height of a pandemic.
What are the next steps in your research?
My basic research interests are to understand how human genetic variation impacts human health. In the context of COVID-19, I am interested in understanding how human genetic variation impacts susceptibility to infection and severity of disease course.
I am a member of the Million Veteran Program (MVP) COVID-19 genome-wide association and phenome-wide association working groups, and as part of these groups, we are conducting studies in participating US Veterans to identify common genetic variation relevant to observed differences in susceptibility and severity of the disease.
Where can readers find more information?
About Professor Dana Crawford
Dana Crawford, Ph.D., is Professor in the Department of Population and Quantitative Health Sciences and Associate Director for Population and Diversity Research in the Cleveland Institute for Computational Biology at Case Western Reserve University (CWRU). She also has a secondary appointment in the Department of Genetics and Genome Sciences.
Dr. Crawford received her Ph.D. in genetics and molecular biology from Emory University in 2000. She then completed post-doctoral training as an Epidemic Intelligence Service Officer at the Centers for Disease Control and Prevention (2000–2002) and as a senior fellow at the University of Washington’s Department of Genome Sciences (2002–2006).
Prior to her current position, Dr. Crawford spent eight years as tenure-track faculty in the Department of Molecular Physiology and Biophysics and as an Investigator in the Center for Human Genetics Research at Vanderbilt University. As a genetic epidemiologist at CWRU, Dr. Crawford has broad research interests that include applying genetic variation data to large-scale epidemiologic and clinical cohorts to better understand human genotype-phenotype associations, with an emphasis on diverse populations.
Dr. Crawford has authored >175 publications in peer-reviewed literature. She is currently a board member of the American Society of Human Genetics (ASHG) and is an elected fellow of the American Association for the Advancement of Science (AAAS). The views and opinions expressed here are Dr. Crawford’s and not necessarily the views and opinions of CWRU, ASHG, AAAS, MVP, or any other organization of which she is a member.