A study published in JAMA Network Open describes the utility of multi-level machine learning models in estimating the risk of delay between cancer diagnosis and treatment initiation in a large group of cancer patients.
Study: Development of a Multilevel Model to Identify Patients at Risk for Delay in Starting Cancer Treatment.
Background
Cancer patients from low socioeconomic backgrounds and those living in low-resource neighborhoods often experience delays in treatment initiation after diagnosis, which significantly affects clinical outcomes.
The timely implementation of effective treatments can be achieved by identifying patients who are at an increased risk of health disparities. This must be accompanied by improvements in care coordination and patient navigation services; however, these approaches are resource intensive. Thus, a more efficient approach would be to identify patients who are at a greater risk of treatment delays and subsequently target them for timely treatment.
In the current study, scientists evaluate whether machine learning models incorporating clinical and demographic data of cancer patients and neighborhood-level social determinants of health data can be used to identify patients who are at a greater risk for treatment initiation delay.
About the study
The researchers investigated the predictive efficacy of four machine learning models: group least absolute shrinkage and selection operator, Bayesian additive regression trees, gradient boosting, and random forest. Adult patients with breast, lung, colorectal, bladder, or kidney cancer who were diagnosed between 2013 and 2019 and subsequently treated at Fox Chase Cancer Center in Philadelphia were included in the study.
Data on the interval between cancer diagnosis and first treatment, patient health and demographic characteristics (including race, ethnicity, laboratory findings, and comorbidities), and neighborhood-level health variables were incorporated into the machine learning models.
Based on a previous observation that a 60-day delay between diagnosis and treatment initiation can increase cancer mortality, scientists investigated whether these models can predict the likelihood of a treatment delay of more than 60 days after diagnosis.
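To make the outcome definition concrete, the following is a minimal Python sketch of how a binary "delayed" label could be derived from a diagnosis date and a first-treatment date. The function name, the patient identifiers, and the dates are hypothetical illustrations and are not taken from the study; only the 60-day threshold comes from the article.

```python
from datetime import date

DELAY_THRESHOLD_DAYS = 60  # threshold described in the article


def is_delayed(diagnosis: date, first_treatment: date,
               threshold: int = DELAY_THRESHOLD_DAYS) -> bool:
    """Return True when treatment began more than `threshold` days after diagnosis."""
    interval = (first_treatment - diagnosis).days
    if interval < 0:
        raise ValueError("first treatment date precedes diagnosis date")
    return interval > threshold


# Hypothetical patients: (id, diagnosis date, first-treatment date)
patients = [
    ("A", date(2015, 3, 1), date(2015, 3, 31)),   # 30-day interval
    ("B", date(2016, 1, 10), date(2016, 3, 10)),  # exactly 60 days
    ("C", date(2018, 6, 1), date(2018, 8, 15)),   # 75-day interval
]

labels = {pid: is_delayed(dx, tx) for pid, dx, tx in patients}
print(labels)
```

In this sketch, an interval of exactly 60 days is not counted as a delay, since the outcome is described as a delay of more than 60 days.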
Three criteria, namely discrimination, calibration, and interpretability, were used to select the optimal machine learning model for the study analysis. This led to the selection of group least absolute shrinkage and selection operator (LASSO) as the final model.
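As a rough illustration of two of these criteria, the sketch below computes one common discrimination measure (area under the ROC curve, AUC) and one common calibration-sensitive measure (the Brier score) on made-up predictions. The article does not state which specific metrics the study authors used, so both choices are assumptions, and all data shown are hypothetical.

```python
def auc(y_true, y_score):
    """Discrimination: probability that a delayed patient is ranked above
    a non-delayed patient (ties count as half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome
    (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes (1 = delayed more than 60 days) and predicted risks
outcomes = [0, 0, 1, 1, 0, 1]
predicted = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

model_auc = auc(outcomes, predicted)
model_brier = brier_score(outcomes, predicted)
print(f"AUC = {model_auc:.3f}, Brier score = {model_brier:.3f}")
```

The third criterion is not a numeric metric in the same sense, so it is not represented here; comparing candidate models on it would require judgment about how readily each model's predictions can be explained.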
Important observations
A total of 6,409 patients were included in the study, 14% of whom belonged to the most socioeconomically deprived neighborhoods. About 25% of the study cohort experienced a delay of more than 60 days between cancer diagnosis and treatment initiation.
The selected group LASSO model, which incorporated clinical, demographic, and neighborhood-level social determinants of health data, was effective in identifying patients who were at risk of experiencing a delay of more than 60 days between diagnosis and treatment.
The model predicted that patients were less likely to experience a delay if they were diagnosed at the treating center, had the index cancer as their first malignant neoplasm, were Asian or Pacific Islander or White, had private insurance, or had late-stage disease. In contrast, patients with certain comorbidities or increased creatinine levels were more likely to experience a delay. The model showed similar effectiveness in predicting delays for patients diagnosed internally or externally.
Regarding neighborhood-level social determinants, the model predicted that patients living in the most socioeconomically deprived areas were more likely to experience a delay than those living in the least socioeconomically deprived areas. While residing in a neighborhood with a high Hispanic population was identified as a risk factor for treatment delays, patients residing in areas with a high Black population were less likely to experience a delay.
Compared with its performance in the overall population, the model was less effective in predicting delays for Black patients, patients of a race or ethnicity other than non-Hispanic White, and those residing in the most deprived areas.
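A subgroup comparison of this kind can be sketched by computing a performance measure separately within each group. The example below uses AUC as the measure, which is an assumption rather than a detail confirmed by the article; the group names and all records are hypothetical.

```python
from collections import defaultdict


def auc(y_true, y_score):
    """Probability that a delayed patient is ranked above a non-delayed one."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(records):
    """records: iterable of (group, outcome, predicted_risk) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for group, outcome, risk in records:
        grouped[group][0].append(outcome)
        grouped[group][1].append(risk)
    return {g: auc(ys, risks) for g, (ys, risks) in grouped.items()}


# Hypothetical records; outcome 1 = delayed more than 60 days
records = [
    ("group_1", 1, 0.9), ("group_1", 0, 0.2),
    ("group_1", 1, 0.7), ("group_1", 0, 0.3),
    ("group_2", 1, 0.4), ("group_2", 0, 0.5),
    ("group_2", 1, 0.8), ("group_2", 0, 0.3),
]

result = auc_by_group(records)
print(result)
```

A lower value in one group than in another, as with group_2 here, would indicate that the model separates delayed from non-delayed patients less reliably in that group.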
Study significance
Machine learning models that incorporate multi-level data sources can effectively identify cancer patients who are at a greater risk of experiencing treatment delays of more than 60 days after their initial cancer diagnosis.
Although neighborhood-level social determinants of health were included in the study model as contributing variables, these factors had no significant impact on model performance. Furthermore, the model was less effective at predicting delays in vulnerable populations.
Future studies should include a higher proportion of vulnerable populations and more relevant social variables to improve the model performance.