In a recent study published in JAMA Pediatrics, researchers developed and validated an automated classifier for diagnosing acute otitis media (AOM) in children.
Study: Development and Validation of an Automated Classifier to Diagnose Acute Otitis Media in Children.
Background
AOM is the second most common illness in children in the United States (US). Despite this high prevalence, diagnostic accuracy for AOM has remained consistently ≤ 75%.
Methods to improve accuracy and facilitate diagnosis have evolved over the years. Recent efforts to improve diagnostic accuracy have focused on artificial intelligence (AI).
Several studies have leveraged deep learning for training neural networks to detect AOM and other ear-related conditions, albeit with limited clinical application.
About the study
In the present study, researchers developed and validated an AI decision support tool for interpreting tympanic membrane (TM) videos and improving AOM diagnosis.
First, a medical-grade mobile application (app) was designed to capture TM videos; users could adjust brightness and focus to obtain the best image. The app also included voice recognition software, enabling hands-free control through voice commands.
As an optional feature, users could record their impressions of the TM and a presumptive diagnosis. Next, a training library was developed from otoscopic assessments of children presenting for wellness or sickness visits. Children were selected by convenience sampling.
An endoscope or otoscope connected to the smartphone was used to capture videos of children’s TMs. Two otoscopists reviewed the videos and assigned a diagnosis.
The team administered a survey to parents of children whose examinations included use of the AI classifier.
A deep residual (DR)-recurrent neural network (RNN) was trained with TM videos as input and expert-assigned diagnoses as the reference standard. The model output TM features and an AOM diagnosis. Approximately 80% of the videos were used for training and 20% for testing.
The DR-RNN model generated an AOM probability for each video, and AOM was diagnosed if the probability was ≥ 50%. The Youden index, i.e., the difference between the true-positive and false-positive rates, was estimated at different probability thresholds to validate this choice of threshold.
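As a rough illustration (not the authors' code), the Youden index at a candidate threshold can be computed directly from reference labels and model-predicted probabilities; the data below are hypothetical:

```python
import numpy as np

def youden_index(y_true, y_prob, threshold):
    """Youden index J = sensitivity + specificity - 1 (i.e., TPR - FPR)
    at a given probability threshold."""
    y_pred = y_prob >= threshold
    tp = np.sum(y_pred & (y_true == 1))
    fn = np.sum(~y_pred & (y_true == 1))
    tn = np.sum(~y_pred & (y_true == 0))
    fp = np.sum(y_pred & (y_true == 0))
    tpr = tp / (tp + fn)  # sensitivity
    fpr = fp / (fp + tn)  # 1 - specificity
    return tpr - fpr

# Hypothetical labels (1 = AOM per expert) and predicted AOM probabilities
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_prob = np.array([0.9, 0.8, 0.45, 0.3, 0.2, 0.55, 0.1, 0.6])

# Sweep thresholds and pick the one that maximizes the Youden index,
# mirroring how the study validated the 50% cutoff
thresholds = np.arange(0.05, 1.0, 0.05)
best = max(thresholds, key=lambda t: youden_index(y_true, y_prob, t))
```

A threshold whose Youden index is close to the maximum (as the study found for 50% vs. 42%) loses little discriminative performance.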
Additionally, a decision tree (DT) model, which took the DR-RNN-predicted TM features as input, was developed as an alternative to examine whether the results would differ.
The team compared different frame extraction methods: diversity maximization, blurriness minimization, sharpness maximization, equal width sampling, and contrast maximization.
In addition, an image quality classifier was trained and tested to prompt users that the videos captured may be sub-optimal for diagnostic purposes.
The researchers compared the output generated by both models with expert-assigned diagnosis and computed sensitivity, specificity, and positive and negative predictive values.
A receiver operating characteristic (ROC) curve was generated for the DR-RNN model by plotting true-positive against false-positive rates at different probability thresholds. An ROC curve was not plotted for the DT model, as it is not probabilistic.
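An ROC curve of this kind can be traced by sweeping the probability threshold, with the area under it computed by the trapezoidal rule. The sketch below is a generic illustration, not the authors' implementation:

```python
import numpy as np

def roc_points(y_true, y_prob):
    """TPR/FPR pairs swept over descending probability thresholds,
    so the curve runs from (0, 0) to (1, 1)."""
    thresholds = np.r_[1.1, np.sort(np.unique(y_prob))[::-1], -0.1]
    pos = (y_true == 1)
    tprs, fprs = [], []
    for t in thresholds:
        pred = y_prob >= t
        tprs.append(np.mean(pred[pos]))   # true-positive rate
        fprs.append(np.mean(pred[~pos]))  # false-positive rate
    return np.array(fprs), np.array(tprs)

def auc(fprs, tprs):
    """Area under the ROC curve via the trapezoidal rule
    (fprs must be nondecreasing)."""
    return float(np.sum((fprs[1:] - fprs[:-1]) * (tprs[1:] + tprs[:-1]) / 2.0))
```

A perfect classifier yields an area of 1.0 and a random one about 0.5, which puts the study's reported 0.973 in context.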
Findings
Overall, 1,151 videos were selected from 635 children, predominantly younger than three years. Experts assigned 305 videos as AOM and the remainder as not AOM.
Sixty parent questionnaires were completed; responses were favorable, with 80% of parents favoring reuse of the classifier at future visits.
The accuracy of DT and DR-RNN models was almost identical. The sensitivity and specificity of the DR-RNN model were 93.8% and 93.5%, respectively.
The corresponding figures for the DT model were 93.7% and 93.3%, respectively. For the DR-RNN model, the area under the ROC curve was 0.973.
Diversity maximization yielded the most accurate results for frame selection. Clips shorter than two seconds were difficult to classify compared to longer clips. The exclusion of low-resolution clips did not improve model output. The average prediction time was 4.6 seconds.
The maximum Youden index was 0.88, at the 42% threshold, nearly identical to the value (0.876) at the 50% threshold. Among model-generated TM features, TM bulging aligned most closely with the predicted diagnosis.
Bulging was detected in all 230 cases predicted to be AOM. The sensitivity and specificity of the image quality filter were 92.3% and 78.3%, respectively.
Conclusions
In sum, the researchers developed an AI algorithm to classify TM videos as AOM or not AOM. The classifier was more accurate than primary care physicians, pediatricians, and advanced practice clinicians.
As such, it could be used to aid in treatment-related decisions. Overall, the findings suggest that this AI decision support tool could improve AOM diagnostic accuracy in children.
Moreover, TM videos could be used for enhanced otoscopic examination, discussions with colleagues or parents, and documentation in health records.