Specialized LLMs can outperform traditional methods in forecasting postoperative risks

Washington University in St. Louis, Mar 4 2025

Millions of Americans undergo surgery each year. After surgery, preventing complications like pneumonia, blood clots and infections can be the difference between a successful recovery and a prolonged, painful hospital stay – or worse. More than 10% of surgical patients experience such complications, which can lead to longer stays in the intensive care unit (ICU), higher mortality rates and increased health care costs. Early identification of at-risk patients is crucial, but predicting these risks accurately remains a challenge.

New advancements in artificial intelligence (AI), particularly large language models (LLMs), now offer a promising solution. A recent study led by Chenyang Lu, the Fullgraf Professor in computer science & engineering in the McKelvey School of Engineering and director of the AI for Health Institute (AIHealth) at Washington University in St. Louis, explores the potential of LLMs to predict postoperative complications by analyzing preoperative assessments and clinical notes. The work, published online Feb. 11 in npj Digital Medicine, shows that specialized LLMs can significantly outperform traditional machine learning methods in forecasting postoperative risks.

"Surgery carries significant risks and costs, yet clinical notes hold a wealth of valuable insights from the surgical team. Our large language model, tailored specifically for surgical notes, enables early and accurate prediction of postoperative complications. By identifying risks proactively, clinicians can intervene sooner, improving patient safety and outcomes."

Chenyang Lu, the Fullgraf Professor in computer science & engineering in the McKelvey School of Engineering and director of the AI for Health Institute (AIHealth) at Washington University in St. Louis

Traditional risk prediction models have primarily relied on structured data, such as lab test results, patient demographics, and surgical details like procedure duration or the surgeon's experience. While this information is undoubtedly valuable, it often lacks the nuance of a patient's unique clinical narrative, which is captured in the detailed text of clinical notes. These notes contain personalized accounts of the patient's medical history, current condition, and other factors that influence the likelihood of complications.

Lu and co-first authors Charles Alba and Bing Xue, both graduate students working with Lu at the time the study was conducted, employed specialized LLMs trained on publicly available medical literature and electronic health records. They then fine-tuned the pretrained model on surgical notes to make better predictions about surgical outcomes. The resulting method – the first of its kind to process surgical notes and use them to make predictions about postoperative outcomes – can go beyond structured data to recognize patterns in the patient's condition that might otherwise be overlooked.

Based on nearly 85,000 surgical notes and associated patient outcomes from an academic medical center in the Midwest collected between 2018 and 2021, the team reported that their model performed far better than traditional methods in predicting complications. For every 100 patients who experienced a postoperative complication, the team's new model correctly predicted 39 more patients who had complications than traditional natural language processing models.
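One way to read the "per 100 patients" figure is as a difference in sensitivity (recall), the share of patients with a complication that a model flags. The short Python sketch below only illustrates that arithmetic. The baseline count of 30 and the function name are assumptions made for the example, not numbers or code from the study.

```python
def recall(true_positives: int, actual_positives: int) -> float:
    """Share of patients with a complication that the model flagged."""
    if actual_positives <= 0:
        raise ValueError("actual_positives must be positive")
    return true_positives / actual_positives


# Hypothetical counts for illustration only; the article does not report them.
patients_with_complication = 100
baseline_caught = 30               # assumed baseline NLP model
llm_caught = baseline_caught + 39  # "39 more" per 100, as reported

baseline_recall = recall(baseline_caught, patients_with_complication)
llm_recall = recall(llm_caught, patients_with_complication)

# Extra patients correctly flagged per 100 who had a complication.
extra_per_100 = round((llm_recall - baseline_recall) * 100)
```

Under this reading, whatever the baseline's actual sensitivity is, the reported gain corresponds to a 39-percentage-point increase.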

Beyond the number of patients who could potentially have surgical complications caught early and mitigated, the study also showcases the power of foundation AI models, which are designed to multitask and can be applied to a wide range of problems.

"Foundation models can be diversified, so they're generally more useful than specialized models. In this case, where lots of complications are possible, the model needs to be versatile enough to predict many different outcomes," said Alba, who is also a graduate student in WashU's Division of Computational & Data Sciences. "We fine-tuned our model for multiple tasks at the same time and found that it predicts complications more accurately than models trained specifically to detect individual complications. This makes sense because complications are often correlated, so a unified foundational model benefits from shared knowledge about different outcomes and doesn't have to be painstakingly tuned for each one."
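The idea of fine-tuning for several complications at once can be sketched as a single shared representation feeding one small prediction head per outcome, trained on the sum of the per-outcome losses. The Python below is a minimal illustration under that assumption, not the authors' implementation. The feature vector, the head weights and the two task names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def multitask_loss(shared_features, heads, labels):
    """Sum of binary cross-entropy losses, one per complication.

    shared_features: vector from a shared encoder (here a plain list of floats).
    heads: maps task name -> (weights, bias) for that task's linear head.
    labels: maps task name -> 0 or 1 (did the complication occur).
    """
    total = 0.0
    for task, (weights, bias) in heads.items():
        logit = sum(w * x for w, x in zip(weights, shared_features)) + bias
        p = sigmoid(logit)
        y = labels[task]
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total


# Hypothetical inputs: one patient's shared features and two untrained heads.
features = [0.2, -1.0, 0.5]
heads = {
    "pneumonia": ([0.0, 0.0, 0.0], 0.0),
    "blood_clot": ([0.0, 0.0, 0.0], 0.0),
}
labels = {"pneumonia": 1, "blood_clot": 0}

loss = multitask_loss(features, heads, labels)
```

Because every head reads the same features, a training signal from one complication also updates the representation used for the others, which is one plausible mechanism for the "shared knowledge" Alba describes.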

"This versatile model has the potential to be deployed across various clinical settings to predict a wide range of complications," said Joanna Abraham, associate professor of anesthesiology at WashU Medicine and a member of the Institute for Informatics (I2) at WashU Medicine. "By identifying risks early, it could become an invaluable tool for clinicians, enabling them to take proactive measures and tailor interventions to improve patient outcomes."

Source:

Washington University in St. Louis

Journal reference:

Alba, C., et al. (2025). The foundational capabilities of large language models in predicting postoperative risks using clinical notes. npj Digital Medicine. doi.org/10.1038/s41746-025-01489-2.
