Researchers in Israel have demonstrated that Wikipedia's coverage of the coronavirus disease 2019 (COVID-19) pandemic continued to be based on respected media and academic sources, despite a surge in coverage as the pandemic grew during the first wave.
The team's analysis found that Wikipedia's articles continued to reference trusted media sources and high-quality academic research.
"Our study offers an in-depth analysis of the scientific backbone supporting Wikipedia's COVID-19 articles," writes the team from Tel Aviv University and the Weizmann Institute of Science in Rehovot.
The study also revealed how pre-existing articles on key topics related to COVID-19 created a framework or "scientific infrastructure" that helped provide context and regulate the influx of new information.
"It also sheds light on how Wikipedia successfully fended off disinformation on COVID-19 and may provide insight into how its unique model may be deployed in other contexts," says Jonathan Sobel and colleagues.
A pre-print version of the research paper is available on the bioRxiv server while the article undergoes peer review.
This news article was a review of a preliminary scientific report that had not undergone peer-review at the time of publication. Since its initial publication, the scientific report has now been peer reviewed and accepted for publication in a Scientific Journal. Links to the preliminary and peer-reviewed reports are available in the Sources section at the bottom of this article.
People flocked to Wikipedia in their millions for COVID-19 information
As host to more than 130,000 articles relating to health and medicine, Wikipedia has been a prominent source of COVID-19 information for millions of people across the world since the pandemic began in late December 2019.
Studies into the readership and editorship of health articles have shown that medical professionals are active readers of Wikipedia and comprise approximately half of those involved in editing these articles.
Furthermore, research analyzing the quality and scope of medical content has deemed Wikipedia "a key tool for global public health promotion," say Sobel and colleagues.
"With the WHO labeling the COVID-19 pandemic an 'infodemic,' and disinformation potentially affecting public health, a closer examination of Wikipedia and its references during the pandemic is merited," writes the team.
While some studies have investigated the citations used in Wikipedia articles and others have shown that the platform provides a representative sample of COVID-19 research, the authors say that to their knowledge, no research has yet focused on the role of popular media and academic sources referenced during the pandemic.
What did the current study involve?
Using references as a readout, Sobel and colleagues analyzed which sources informed Wikipedia's growing coverage of COVID-19 during the pandemic's first wave – between January and May 2020.
They found that Wikipedia's coronavirus-related articles mostly referenced high-quality sources, both from general and academic literature.
Articles tended to cite trusted news outlets and academic journals with high impact factors.
Wikipedia COVID-19 Corpus of scientific sources reveals a greater fraction of open-access papers as well as a higher impact in Altmetric score. A) Bar plot of the most trusted academic sources. Top journals are highlighted in green and preprints are represented in red. Bottom right: boxplot of the distribution of Altmetric scores in the Wikipedia COVID-19 corpus - the dump from May 2020, the COVID-19 Corpus and the scientific sources from the Europe PMC COVID-19 search. B) Fraction of open-access sources, C) fraction of preprints from bioRxiv and medRxiv.
Despite a surge in pre-print articles promising cutting-edge findings, Wikipedia tended to make use of peer-reviewed, open-access studies published in high-impact-factor journals over un-reviewed studies uploaded independently on pre-print servers.
However, a temporal analysis of the growth in COVID-19 content and a latency analysis of articles' citations revealed that while high academic standards were generally maintained after the pandemic broke out, there was some compromise on quality.
While the overall number of academic references in certain articles decreased, references to popular media increased.
"Most of Wikipedia's COVID-19 content was supported by references from highly trusted sources - but more from the general media than from academic publications," writes Sobel and colleagues.
This probably reflected efforts to remain up to date, says the team.
Wikipedia COVID-19 corpus article-scientific papers (DOI) network. The network mapping scientific papers cited in more than one article in the Wikipedia COVID-19 corpus was constructed using each DOI connecting at least two Wikipedia articles. This network is composed of 454 edges, 179 DOIs (Blue) and 136 Wikipedia articles (Yellow). A zoom in on the cluster of Wikipedia articles dealing with COVID-19 drug development is depicted with edges in red connecting the DOIs cited directly in the article and edges in blue connecting these DOIs to closely related articles citing the same DOIs.
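The network construction described in the caption above can be illustrated with a short sketch. This is not the authors' code; it only assumes, for illustration, that each Wikipedia article's cited DOIs are already available as a set. The function name, the toy article titles and the placeholder DOIs are all hypothetical. The sketch keeps only DOIs cited by at least two articles and links each of them to the articles that cite it.

```python
from collections import defaultdict


def build_shared_doi_network(article_dois):
    """Build an article-DOI network from DOIs cited by two or more articles.

    article_dois maps an article title to the set of DOIs it cites
    (an assumed input format, not taken from the study).
    Returns (edges, dois, articles), where each edge is (article, doi).
    """
    # Invert the mapping: for every DOI, collect the articles citing it.
    doi_to_articles = defaultdict(set)
    for article, dois in article_dois.items():
        for doi in dois:
            doi_to_articles[doi].add(article)

    # Keep only DOIs that connect at least two Wikipedia articles.
    shared = {
        doi: citing
        for doi, citing in doi_to_articles.items()
        if len(citing) >= 2
    }

    # One edge per (article, shared DOI) pair, sorted for stable output.
    edges = sorted(
        (article, doi) for doi, citing in shared.items() for article in citing
    )
    dois = set(shared)
    articles = {article for article, _ in edges}
    return edges, dois, articles


# Hypothetical toy data: titles and DOIs are placeholders, not real records.
toy_corpus = {
    "Article A": {"10.1000/x1", "10.1000/x2"},
    "Article B": {"10.1000/x1"},
    "Article C": {"10.1000/x2", "10.1000/x3"},
}

edges, dois, articles = build_shared_doi_network(toy_corpus)
print(len(edges), len(dois), len(articles))
```

In the toy data, the DOI cited by only one article is dropped, which mirrors the caption's rule that a DOI must connect at least two Wikipedia articles to appear in the network.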
Contextualizing the science and fending off disinformation
This more detailed analysis also revealed how pre-existing articles played a crucial role in contextualizing the science underlying many popular concepts. This pre-existing content served as a framework or "scientific infrastructure" that helped regulate the influx of new information and place it within Wikipedia's existing network of knowledge.
This infrastructure, which included past articles about key topics related to the virus and information on organizational practices such as strict sourcing policies, played an essential role in fending off disinformation and ensuring high standards were maintained.
The researchers say another key to Wikipedia's success in this respect is its community of editors, which provides centralized oversight mechanisms that can be deployed quickly and efficiently.
"In this case, the existence of the WikiProject Medicine, and the formation of a specific COVID-19 task force in the form of WikiProject COVID-19, helped safeguard quality across large swaths of articles and enforce a relatively unified sourcing policy on articles dealing with both popular and scientific aspects of the virus," explains the team.
What do the authors conclude?
The researchers say the findings outline ways in which Wikipedia managed to fight off disinformation and stay up to date during the first wave of the COVID-19 pandemic.
"With Facebook and other social media giants struggling to implement both technical and human-driven solutions to disinformation from the top down, it seems Wikipedia dual usage of established science and a community of volunteers, provides a possible model for how this can be achieved - a valuable task during an infodemic," concludes Sobel and colleagues.
Article Revisions
- Apr 5 2023 - The preprint preliminary research paper that this article was based upon was accepted for publication in a peer-reviewed Scientific Journal. This article was edited accordingly to include a link to the final peer-reviewed paper, now shown in the sources section.