AI model Virchow outperforms clinical methods in cancer detection

In a recent study published in the journal Nature Medicine, a large team of researchers in the United States presented Virchow, a foundation model for computational pathology, and demonstrated its use in histopathological analysis to detect cancer and predict biomarkers across seven rare and nine common forms of cancer.

Figure: The training dataset, training algorithm and application of Virchow, a foundation model for computational pathology. a, The training data can be described in terms of patients, cases, specimens, blocks or slides. b–d, The slide distribution as a function of cancer status (b), surgery (c) and tissue type (d). e, The dataflow during training: each slide is processed into tiles, which are then cropped into global and local views. f, Schematic of applications of the foundation model, using an aggregator model to predict attributes at the slide level. GI, gastrointestinal. Study: A foundation model for clinical-grade computational pathology and rare cancers detection.

Background

The diagnosis of cancers has traditionally depended on examining hematoxylin and eosin-stained histopathological preparations under a light microscope. Advances in digital technology and computational pathology have increasingly replaced this with whole-slide images that can be examined computationally, making this form of diagnosis part of routine clinical practice.

The use of artificial intelligence (AI) in the diagnosis and characterization of cancers from digitized whole-slide images has also grown substantially. Initial efforts focused on improving workflows, but recent studies have used AI to extract more than diagnostic information from whole-slide images, including therapeutic response and prognosis, reducing the reliance on genomic testing and immunohistochemistry-based methods for cancer characterization.

About the study

In the present study, the researchers presented Virchow, reported to be the largest foundation model for computational pathology developed to date, and demonstrated its use in detecting cancer and predicting biomarkers across a wide range of common and rare cancers.

Foundation models are large-scale neural networks trained on very large datasets using self-supervised learning. These models create data representations known as embeddings, which capture generalizable patterns that can be transferred to settings with limited data and applied to predictive tasks such as determining clinical outcomes, genomic alterations, and therapeutic responses.
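To make the embedding idea concrete, the sketch below runs image tiles through a frozen encoder to obtain fixed-length vectors. The encoder here is a toy stand-in defined for illustration; Virchow's actual architecture, tile size, and embedding dimension differ.

```python
# Minimal sketch: producing tile embeddings with a frozen encoder.
# "load_tile_encoder" and all sizes are illustrative assumptions,
# not Virchow's released architecture or preprocessing.
import torch
import torch.nn as nn

def load_tile_encoder() -> nn.Module:
    # Toy stand-in for a pathology foundation model backbone.
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=7, stride=4),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(32, 256),  # 256-d embeddings; real models are larger
    )

encoder = load_tile_encoder().eval()
tiles = torch.rand(16, 3, 224, 224)  # a batch of H&E tiles in [0, 1]
with torch.no_grad():
    embeddings = encoder(tiles)      # shape: (16, 256)
print(embeddings.shape)
```

Once computed, such embeddings can be stored and reused across many downstream tasks without retraining the encoder.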

An effective foundation model can capture broad-spectrum patterns such as tissue architecture, nuclear morphology, cellular morphology, necrosis, staining patterns, neovascularization, inflammation, and biomarker expression, which can be used to predict various whole-slide image characteristics.

Virchow is named after Rudolf Virchow, a pioneer of modern pathology. The model was trained on a sizable dataset of nearly 1.5 million hematoxylin and eosin-stained whole-slide images from approximately 100,000 patients registered at the Memorial Sloan Kettering Cancer Center (MSKCC). The dataset consists of benign and malignant tissue samples obtained from resections and biopsies of 17 tissue types.

Virchow is a vision transformer comprising 632 million parameters. It was trained using a self-supervised algorithm that draws on local and global crops of tissue tiles; the resulting tile embeddings can then be aggregated into whole-slide representations for predictive tasks.
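This training recipe belongs to the self-distillation family, in which a student network's predictions on local crops are matched to a teacher network's predictions on global crops. The sketch below shows the core of such a loop under toy settings; the encoder, crop counts, temperatures, and the omission of details such as output centering and same-view exclusion are all simplifications for illustration, not the study's configuration.

```python
# Sketch of DINO-style self-distillation with global and local crops.
# Everything here (encoder, sizes, hyperparameters) is a toy stand-in.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_encoder(dim: int = 128) -> nn.Module:
    return nn.Sequential(
        nn.Conv2d(3, 64, 3, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, dim),
    )

student = make_encoder()
teacher = copy.deepcopy(student)           # teacher tracks the student by EMA
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

global_views = [torch.rand(8, 3, 224, 224) for _ in range(2)]
local_views = [torch.rand(8, 3, 96, 96) for _ in range(4)]
tau_s, tau_t, ema = 0.1, 0.04, 0.996       # temperatures and EMA momentum

with torch.no_grad():                      # teacher sees only global views
    targets = [F.softmax(teacher(g) / tau_t, dim=-1) for g in global_views]

loss = 0.0                                 # student sees all views
for view in global_views + local_views:
    log_probs = F.log_softmax(student(view) / tau_s, dim=-1)
    for t in targets:
        loss = loss - (t * log_probs).sum(dim=-1).mean()
loss = loss / (len(global_views + local_views) * len(targets))

opt.zero_grad(); loss.backward(); opt.step()
with torch.no_grad():                      # EMA update of the teacher
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.mul_(ema).add_(ps, alpha=1 - ema)
```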

To highlight the clinical applications and utility of such a large foundation model, the researchers used Virchow embeddings derived from the large dataset of whole-slide images to train a pan-cancer detection model and assessed its performance in detecting common and rare forms of cancer at the specimen level across various tissues.
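This pairing of a frozen tile encoder with a trainable aggregator (panel f of the first figure) is what makes slide-level prediction tractable. The sketch below uses attention-based pooling, a common aggregator design; the class, dimensions, and single-logit head are illustrative assumptions rather than the paper's exact architecture.

```python
# Sketch of an attention-based aggregator over frozen tile embeddings.
# Architecture and dimensions are assumptions for illustration only.
import torch
import torch.nn as nn

class AttentionAggregator(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        # Scores each tile, then pools tiles by their softmax weights.
        self.attn = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.head = nn.Linear(dim, 1)  # cancer-vs-benign logit

    def forward(self, tile_embeddings: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.attn(tile_embeddings), dim=0)  # (n_tiles, 1)
        slide_embedding = (w * tile_embeddings).sum(dim=0)    # weighted mean
        return self.head(slide_embedding)

model = AttentionAggregator()
tiles = torch.rand(1000, 256)   # frozen embeddings for one specimen
prob = torch.sigmoid(model(tiles))
print(f"predicted cancer probability: {prob.item():.3f}")
```

Because only the small aggregator is trained, far fewer labeled specimens are needed than when training a model end to end.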

The study compared the performance of Virchow embeddings against Phikon, UNI, and CTransPath embeddings and evaluated their utility along two lines. The first was the performance of the pan-cancer detection model, trained on Virchow embeddings, on a test set combining MSKCC and external datasets spanning seven rare and nine common cancer types. The second was the effectiveness of Virchow embeddings in biomarker prediction, evaluated on data from lung, bladder, breast, and colon cancers.
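A common way to run such an embedding comparison is to train an identical lightweight classifier on each model's embeddings and compare AUC on held-out data. The sketch below does this with synthetic stand-in embeddings and labels; the names, dimensions, and separability offsets are fabricated for illustration, not the study's data.

```python
# Sketch: comparing two embedding models on a downstream detection task.
# All data here are synthetic placeholders, not the study's datasets.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)

for name, dim, signal in [("embedding_a", 256, 0.5), ("embedding_b", 128, 0.2)]:
    # Stand-in features; in practice these come from each frozen encoder.
    X = rng.normal(size=(500, dim)) + labels[:, None] * signal
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```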

Results

The study showed that Virchow embeddings demonstrated the two-fold value of a foundation model for pathology: generalizability and training-data efficiency. The pan-cancer model trained on Virchow embeddings detected not only the common forms of cancer but also the rarer histological subtypes in the test dataset.

Furthermore, the performance of the pan-cancer model was comparable to that of clinical-grade cancer-detection models and, in cases of some rare cancers, even exceeded that of the clinical models despite having been trained on datasets with fewer tissue-specific labels.

The researchers stated that the model's performance level was especially notable considering that the dataset used to train the pan-cancer model did not undergo the subpopulation enrichment and quality control performed for training commercially and clinically used AI models.

Figure: a,b, Performance as measured by AUC of three clinical products compared to the pan-cancer model trained on Virchow embeddings, on the rare variant (a) and product testing (b) datasets. The pan-cancer detector, trained on Virchow foundation model embeddings, achieves performance similar to clinical-grade products in general and outperforms them on rare cancer variants. c, The pan-cancer detector was trained on fewer labeled specimens than the Prostate, Breast and BLN clinical models, including only a small fraction of the prostate (teal), breast (blue) and BLN (yellow) tissue specimens that these clinical models were respectively trained on. d, A categorization of the pan-cancer model's failure modes and four canonical examples of the primary failure types. In all panels, asterisks indicate pairwise statistical significance (*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; pairwise DeLong's test). Error bars denote two-sided 95% confidence intervals estimated with DeLong's method. C., carcinoma. Inv., invasive.
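The figure reports DeLong confidence intervals and pairwise DeLong tests for AUC. As a rough illustration of how uncertainty around an AUC can be quantified, the sketch below bootstraps a 95% confidence interval instead of using DeLong's analytic method; the data are synthetic.

```python
# Sketch: bootstrap 95% CI for an AUC (the paper itself uses DeLong's method).
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n, aucs = len(y_true), []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample with replacement
        if len(np.unique(y_true[idx])) < 2:
            continue                              # need both classes present
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.quantile(aucs, [alpha / 2, 1 - alpha / 2])

y = np.tile([0, 0, 1, 1, 0, 1, 1, 0, 1, 0], 20)  # synthetic labels
s = np.clip(y * 0.6 + np.random.default_rng(1).normal(0, 0.3, size=y.size), 0, 1)
lo, hi = bootstrap_auc_ci(y, s)
print(f"AUC 95% CI: [{lo:.3f}, {hi:.3f}]")
```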

Conclusions

To conclude, the study showed that a pan-cancer model trained on Virchow embeddings performed comparably to, and often better than, clinical-grade models in detecting common and rare forms of cancer, despite being trained on datasets with fewer tissue-specific labels.

Overall, the findings highlight the significance and utility of foundation models such as Virchow in applications with limited training data, providing a basis for a range of clinical models in cancer pathology.

Journal reference:
  • Vorontsov, E., Bozkurt, A., Casson, A., Shaikovski, G., Zelechowski, M., Severson, K., Zimmermann, E., Hall, J., Tenenholtz, N., Fusi, N., Yang, E., Mathieu, P., van Eck, A., Lee, D., Viret, J., Robert, E., Wang, Y. K., Kunz, J. D., et al. (2024). A foundation model for clinical-grade computational pathology and rare cancers detection. Nature Medicine. DOI: 10.1038/s41591-024-03141-0, https://www.nature.com/articles/s41591-024-03141-0

Written by

Dr. Chinta Sidharthan

Chinta Sidharthan is a writer based in Bangalore, India. Her academic background is in evolutionary biology and genetics, and she has extensive experience in scientific research, teaching, science writing, and herpetology. Chinta holds a Ph.D. in evolutionary biology from the Indian Institute of Science and is passionate about science education, writing, animals, wildlife, and conservation. For her doctoral research, she explored the origins and diversification of blindsnakes in India, as a part of which she did extensive fieldwork in the jungles of southern India. She has received the Canadian Governor General’s bronze medal and Bangalore University gold medal for academic excellence and published her research in high-impact journals.

