A study published in the journal PLOS ONE finds that people who misrepresent their identity or respond inattentively and randomly in online surveys can substantially distort the outcomes of survey research by providing false and poor-quality information.
Study: Did people really drink bleach to prevent COVID-19? A guide for protecting survey data against problematic respondents. Image Credit: Song_about_summer/Shutterstock.com
Background
Surveys are among the most common approaches for collecting data in public health, medical research, social science, and political science. However, properly scrutinizing survey data, especially self-reported data, is crucial to ensure high-quality information.
People who are inattentive or mischievous, provide false information, or systematically give affirmative responses to questions are the most likely to undermine survey objectives and alter survey outcomes. Such "problematic respondents" are highly capable of attenuating survey findings.
To mitigate this problem, scientists have developed validity screening methods that either prevent problematic respondents from participating in a survey or identify them so that the data they provide can be excluded from the analysis.
In the current study, scientists have investigated whether problematic respondents can falsify survey findings by dramatically inflating point estimates and creating false associations regarding health-related behaviors.
During the coronavirus disease 2019 (COVID-19) pandemic, a study conducted by the Centers for Disease Control and Prevention (CDC) found that Americans were adopting highly dangerous cleaning practices to combat COVID-19 infection, including ingesting household cleaning products.
In the current study, scientists have specifically investigated to what extent problematic respondents have influenced the findings of the CDC study conducted in May 2020.
Study design
This study collected two data sets using the same survey design, question-wording, online sample provider, and sampling methodology reported by the CDC study.
The first data set was collected in June 2020 from a national sample of 600 respondents. The second data set was collected in July 2020 from a national sample of 688 respondents.
The participants were asked about their cleaning practices and negative health outcomes via survey questionnaires. Validity screening methods were used to identify problematic responses within the collected dataset.
The primary aims of these methods were to assess participants' attentiveness, English language comprehension, and compliance. Participants who passed these quality control measures were then subjected to demographic and open-ended response verification.
Important observations
Based on the quality control analyses, the participants were categorized into two groups, i.e., "problematic respondents" and "non-problematic respondents."
About 77% of respondents in the first dataset were non-problematic, and 23% were problematic. In the second dataset, 67% were non-problematic, and 33% were problematic respondents.
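As a rough illustration of what these shares mean in headcounts, the sketch below converts the reported percentages into approximate respondent counts. The percentages are rounded in the text above, so the resulting counts are estimates and not figures taken from the study; the function name is purely illustrative.

```python
def approx_count(sample_size: int, share_percent: float) -> int:
    """Estimate a headcount from a sample size and a rounded percentage."""
    return round(sample_size * share_percent / 100)


# Sample sizes and problematic shares as reported in this article.
# The derived counts are approximations because the shares are rounded.
samples = {
    "June 2020": (600, 23),
    "July 2020": (688, 33),
}

for label, (n, problematic_pct) in samples.items():
    problematic = approx_count(n, problematic_pct)
    print(f"{label}: ~{problematic} problematic, "
          f"~{n - problematic} non-problematic of {n}")
```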
The analysis of the complete dataset without quality control measures revealed that the self-reported data on cleaning and disinfection practices collected in this study is comparable to that collected by the CDC study.
A comparison of the reports provided by problematic and non-problematic respondents revealed that most of the affirmative responses to dangerous cleaning practices were reported by problematic respondents. Importantly, none of the non-problematic respondents reported any dangerous cleaning behaviors.
Regarding the least common and most dangerous cleaning practices, such as ingesting household cleaners, soapy water, or diluted bleach, the analysis indicated that problematic respondents were 8 to 29 times more likely to report affirmatively than non-problematic respondents.
Regarding moderately dangerous and more common cleaning practices, the analysis revealed that problematic respondents were two to three times more likely than non-problematic respondents to report affirmatively.
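A "times more likely" figure of this kind is usually a ratio of two response rates. The sketch below shows one common way to compute such a ratio, assuming simple counts of affirmative answers in each group. The function name and the counts are hypothetical and are not data from the study, which may have used a different estimation method.

```python
def risk_ratio(yes_a: int, total_a: int, yes_b: int, total_b: int) -> float:
    """Ratio of the affirmative-response rate in group A to that in group B."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("group sizes must be positive")
    rate_b = yes_b / total_b
    if rate_b == 0:
        raise ValueError("ratio is undefined when group B has no affirmative responses")
    return (yes_a / total_a) / rate_b


# Hypothetical counts for illustration only; not data from the study.
ratio = risk_ratio(yes_a=20, total_a=100, yes_b=10, total_b=400)
print(f"Group A is {ratio:.1f} times as likely to answer yes as group B")
```

Note that the ratio cannot be computed when the comparison group gives no affirmative responses at all, which is why the sketch raises an error in that case.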
Open-ended response verification
The open-ended response verification method was applied to respondents who passed the first two quality control measures. This final verification method identified only one non-problematic respondent who reported intentionally ingesting a cleaning product.
Further verification of the respondent's demographic data resulted in identifying several suspicious pieces of information. Open-ended verification of the complete dataset revealed that some respondents misunderstood or misinterpreted the survey questions.
Overall, the final verification analysis found no convincing response indicating that a respondent had ingested dangerous cleaning products to prevent COVID-19 infection. However, cogent and informative reports of less dangerous cleaning practices were identified in the surveys.
Regarding compliance bias, the analysis revealed a moderate but significant association between self-reported ingestion of cleaning products and other implausible behaviors.
Regarding health outcomes, the analysis revealed that the association between dangerous cleaning practices and negative health outcomes was significantly more pronounced among problematic respondents than among non-problematic respondents.
Study significance
The study indicates that reports of high-risk cleaning practices documented during the COVID-19 pandemic were mostly driven by problematic respondents who provided false and inaccurate survey information.
Notably, the study highlights the need for rigorous validation of survey-derived data, as problematic respondents pose a persistent and fundamental challenge to all survey research.