Researchers harness AI and online data from Google and Twitter to track and predict seasonal allergy patterns, offering new insights into allergy timing and regional variations across the U.S.
Study: Internet-based surveillance to track trends in seasonal allergies across the United States.
More than 25% of American adults suffer from seasonal allergies, yet when and where those allergies occur remains poorly characterized. A recent study in PNAS Nexus set out to map these patterns.
Introduction
Allergies cause symptoms such as itchy skin, runny noses, watery eyes, and asthma, and cost the US an estimated $4.5-40 billion annually in healthcare, lost productivity, and reduced quality of life. Because most cases do not require a hospital visit, their true prevalence is hard to gauge.
Current methods for assessing seasonal allergies rely on self-reports or on the assumption that allergy prevalence tracks aeroallergen concentration. However, aeroallergen data are limited in scope and often cover only pollen levels.
Data from Internet platforms such as Twitter, Google, Instagram, Yelp, and Facebook are commonly used to track disease trends. Yet earlier attempts (e.g., Google Flu Trends) fell short, failing to forecast influenza hospitalizations accurately. Still, these tools hold potential and continue to be refined.
About this study
The study introduces a validated, Internet-based method to track seasonal allergies across the US. The researchers used artificial intelligence (AI) and machine learning (ML) to analyze allergy-related Google searches and Twitter posts, assuming allergy symptoms would drive relevant online activity. They hypothesized that these patterns would mirror allergy-related emergency department (ED) visits in high-population California counties, where data would be dense enough for analysis.
Findings: internet data as a proxy for aeroallergen exposure
The results confirmed that "Internet-derived data can act as a proxy for aeroallergen exposure." Allergy-related searches and Twitter posts were strongly correlated with ED visit data, suggesting that an external factor (likely airborne allergens such as mold spores and pollen) drives this relationship.
Short-term correlations in allergy data
Short-term correlations were observed across all three data sources, lending support to the idea that ED visits, searches, and posts are interlinked. However, some population biases may limit predictive reliability.
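To make the idea of a short-term correlation concrete, the sketch below removes the slow seasonal trend from two daily series with a moving average and then correlates what is left. It is only an illustration under stated assumptions, not the study's method: the data are synthetic, and the function names, the 29-day window, and the noise levels are all hypothetical choices made for this example.

```python
import math
import random


def moving_average(series, window):
    """Centered moving average; near the edges it averages the days available."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def short_term_correlation(x, y, window=29):
    """Correlate day-to-day residuals after subtracting the smoothed trend."""
    rx = [a - b for a, b in zip(x, moving_average(x, window))]
    ry = [a - b for a, b in zip(y, moving_average(y, window))]
    return pearson(rx, ry)


# Synthetic data (assumption): both series share a seasonal curve and a
# day-to-day "exposure" driver, plus independent noise of their own.
rng = random.Random(0)
days = 365
exposure = [rng.gauss(0, 1) for _ in range(days)]
seasonal = [10 + 5 * math.sin(2 * math.pi * d / 365) for d in range(days)]
searches = [s + 2.0 * e + rng.gauss(0, 0.5) for s, e in zip(seasonal, exposure)]
ed_visits = [0.3 * s + 0.5 * e + rng.gauss(0, 0.2) for s, e in zip(seasonal, exposure)]

r = short_term_correlation(searches, ed_visits)
print(f"short-term correlation: {r:.2f}")
```

Subtracting the trend first matters because two series that merely share a spring peak would correlate even if their daily fluctuations were unrelated; the residuals isolate the day-to-day co-movement that a shared external driver would produce.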
National-level modeling
Using data from California, the researchers mapped allergy-related online activity across 144 highly populated US counties, tracking fluctuations daily for eight years. Seasonal trends varied by location: most areas peaked in spring (March-May) and had a secondary fall peak (September-October).
Additional allergy seasons were noted in regions like Texas and Florida during winter and summer.
Seasonal allergy timing differed across counties; for example, Northern California’s spring peak occurred earlier than in the Bay Area. Generally, allergy peaks began in the Southeast and moved northward, reaching the Northeast and Upper Midwest last.
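One simple way to compare peak timing between locations is to smooth each county's daily activity and record the day of the year on which the smoothed curve is highest. The sketch below does this for three made-up counties. The county names, peak days, and curve shapes are hypothetical values invented for illustration, each curve has a single peak, and nothing here reproduces the study's model or results.

```python
import math


def smooth(series, window=7):
    """Centered moving average; near the edges it averages the days available."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out


def peak_day(series, window=7):
    """Zero-based day index at which the smoothed series is largest."""
    smoothed = smooth(series, window)
    return max(range(len(smoothed)), key=smoothed.__getitem__)


def synthetic_year(center, width=20.0, days=365):
    """Single bell-shaped bump of activity centered on day `center` (made up)."""
    return [math.exp(-(((d - center) / width) ** 2)) for d in range(days)]


# Hypothetical counties with invented peak days, chosen only for illustration.
counties = {
    "hypothetical_northeast": synthetic_year(140),
    "hypothetical_southeast": synthetic_year(80),
    "hypothetical_midatlantic": synthetic_year(110),
}

peaks = {name: peak_day(series) for name, series in counties.items()}
ordered = sorted(peaks, key=peaks.get)  # earliest peak first

for name in ordered:
    print(f"{name}: peak on day {peaks[name]}")
```

Real activity curves are noisier and often have a second, fall peak, so a practical version would need a longer smoothing window and a separate search within each season.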
Future directions
The researchers suggest integrating land-use and climate data with Internet-derived allergy data to understand specific allergen trends better.
Real-time airborne allergen tracking combined with social media activity could enhance allergy prediction and response.
Conclusions
The study shows that Internet-derived data can complement traditional surveillance in predicting seasonal allergy prevalence.
By providing a fine-grained view of allergy timing and location, this approach can improve allergy predictions, especially as global ecosystem changes alter allergy patterns.