Scientists launch public resource to classify cancer subtypes for better diagnosis

A multi-institutional team of scientists has developed a free, publicly accessible resource to aid in classification of patient tumor samples based on distinct molecular features identified by The Cancer Genome Atlas (TCGA) Network.

The resource comprises classifier models that can accelerate the design of cancer subtype-specific test kits for use in clinical trials and cancer diagnosis. This is an important advance because tumors belonging to different subtypes may vary in their response to cancer therapies.

The resource is the first of its kind to bridge the gap between TCGA's immense data library and clinical implementation.

A paper describing the tools was published online today in Cancer Cell.

"TCGA defined molecular subtypes for each major type of cancer. With this resource, we aimed to provide the clinical and scientific communities with the tools to assign a newly diagnosed tumor to one of these established subtypes. Our new resource will be a powerful asset for creating clinical assays based on the diverse molecular variations between cancers."

Peter W. Laird, Ph.D., the Peter and Emajean Cook Endowed Chair in Epigenetics at Van Andel Institute and the study's lead corresponding author

TCGA was a decade-long, National Cancer Institute-led effort to create detailed molecular maps of 33 cancer types. Unlike traditional approaches that define cancers based on the organ or tissue in which they arise, TCGA identified nuanced genomic, epigenomic, proteomic and transcriptomic characteristics that more precisely describe cancer subtypes.

Andrew D. Cherniack, Ph.D., of the Broad Institute of MIT and Harvard and Kyle Ellrott, Ph.D., of the Knight Cancer Institute at Oregon Health & Science University also are corresponding authors of the paper, which represents a collaborative effort between scientists from more than a dozen research organizations.

"Since many TCGA molecular subtypes were generated using hundreds or thousands of features from multiple data types, scientists and physicians have asked us for help subtyping their samples," Cherniack said. "Our resource greatly simplifies this process."

The team created the new resource by leveraging data from 8,791 TCGA cancer samples that represented 26 cancer cohorts and 106 cancer subtypes. They then used existing machine learning tools to develop and test nearly half a million models across six categories (gene expression, DNA methylation, miRNA, copy number, mutation calls and multi-omics) and selected those that performed best for inclusion in the online resource.

In total, the resource contains 737 ready-to-use models, which represent the top models from each of the 26 cancer cohorts, the five training algorithms and six data types.

"A major element of this effort was working to ensure that these models could be deployed by other groups onto new datasets," Ellrott said. "All too often this type of work is difficult to replicate or apply to new samples."

The resource may be accessed at https://github.com/NCICCGPO/gdan-tmp-models.

Co-first authors of the study include Christopher K. Wong of University of California, Santa Cruz; Christina Yau of University of California, San Francisco, and Buck Institute for Research on Aging; Mauro A. A. Castro of the Federal University of Paraná; Jordan E. Lee and Brian J. Karlberg of Oregon Health & Science University; Jasleen K. Grewal of BC Cancer; Vincenzo Lagani of JADBio Gnosis DA and Ilia State University; and Bahar Tercan of the Institute for Systems Biology.

Other authors include Verena Friedl, Vladislav Uzunangelov and Joshua M. Stuart of University of California, Santa Cruz; Toshinori Hinoue of Van Andel Institute; Lindsay Westlake and Xavier Loinaz of the Broad Institute of MIT and Harvard; Ina Felau, Peggy I. Wang, Anab Kemal, Samantha J. Caesar-Johnson and Jean C. Zenklusen of the National Cancer Institute; Ilya Shmulevich of the Institute for Systems Biology; Alexander J. Lazar of the University of Texas MD Anderson Cancer Center; Ioannis Tsamardinos of JADBio Gnosis DA and University of Crete; Katherine A. Hoadley of Lineberger Comprehensive Cancer Center at University of North Carolina at Chapel Hill; The Cancer Genome Atlas Analysis Network; A. Gordon Robertson of BC Cancer; Theo A. Knijnenburg of the Institute for Systems Biology; and Christopher C. Benz of Buck Institute for Research on Aging.

Journal reference:

Ellrott, K., et al. (2025) Classification of non-TCGA cancer samples to TCGA molecular subtypes using compact feature sets. Cancer Cell. doi.org/10.1016/j.ccell.2024.12.002.
