In a recent study published in Cancer Discovery, researchers analyzed data from the latest public release of the American Association for Cancer Research (AACR) Project Genomics Evidence Neoplasia Information Exchange (GENIE).
Background
AACR Project GENIE is an open-source, international pan-cancer repository of real-world genomic/clinical oncology data. Founded in late 2015, the GENIE project has released nine datasets, with the latest 9.1 release encompassing data from over 110,000 tumors from more than 100,000 cancer patients. Lung, colorectal, and breast cancers are each represented by more than 10,000 tumors in the repository.
The broader research community is making use of the project data, with approximately 624 manuscripts citing the registry as of April 2022. Studies using GENIE data fall broadly into three classes: updated prevalence estimates, external validation studies, and hypothesis generation. GENIE data have also been used as a resource to classify somatic variants in clinical labs and to guide the interpretation of cancer genomes.
The study and findings
In the present study, researchers analyzed data from more than 110,000 tumors. The project has grown from over 18,800 samples in the first release to 110,704 in the latest release. In the 9.1 release, more than half (57%) of the specimens were primary tumors, 32% were metastases, and the remaining 11% were local recurrences, hematologic malignancies, or of unknown type. Non-small cell lung cancer, colorectal cancer, breast cancer, glioma, pancreatic cancer, and melanoma were the most common cancer types and together accounted for about 50% of cases.
Around 72% of the cohort were White, 6% were Black, 5% were Asian, fewer than 3% were Native American, Pacific Islander, or of other races, and 14% were of unknown race.
An iterative quality assurance process, in place since the project’s inception and refined with each release, provides feedback to the contributing centers. As of the latest release, it has produced 91 standardized assay definitions and associated quality dashboards.
To demonstrate the utility of GENIE in the clinical trial space, the authors used MatchMiner to match all patients in the GENIE cohort to 34 – 37 sub-studies of the National Cancer Institute–Molecular Analysis for Therapy Choice (NCI-MATCH) trial according to genomic and clinical data. Overall, 26% of the GENIE patients matched at least one sub-study. The overall eligibility rates per sub-study were similar between NCI-MATCH and GENIE, suggesting that GENIE could be used to estimate real-world trial enrolment.
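The matching step can be pictured with a small sketch. It is not MatchMiner's interface or the authors' code: the `Patient` and `SubStudy` classes, the single-gene eligibility rule, and all names such as `GENE_A` and `ARM_1` are hypothetical placeholders chosen for illustration. It only shows how a "matched to at least one sub-study" fraction and a per-sub-study eligibility rate can be derived from the same pass over a cohort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: str
    cancer_type: str
    altered_genes: frozenset  # genes with a qualifying alteration (hypothetical)


@dataclass(frozen=True)
class SubStudy:
    name: str
    required_gene: str  # simplified: one required alteration per sub-study
    excluded_cancer_types: frozenset = frozenset()


def is_eligible(patient: Patient, study: SubStudy) -> bool:
    """Toy rule: the required gene is altered and the cancer type is not excluded."""
    return (
        study.required_gene in patient.altered_genes
        and patient.cancer_type not in study.excluded_cancer_types
    )


def match_rates(patients, studies):
    """Return (fraction matched to >= 1 sub-study, {sub-study: eligibility rate})."""
    if not patients:
        raise ValueError("cohort is empty")
    per_study = {s.name: 0 for s in studies}
    matched_any = 0
    for p in patients:
        hits = [s.name for s in studies if is_eligible(p, s)]
        for name in hits:
            per_study[name] += 1
        if hits:
            matched_any += 1
    n = len(patients)
    return matched_any / n, {name: count / n for name, count in per_study.items()}


# Hypothetical four-patient cohort and two sub-studies.
patients = [
    Patient("P1", "TYPE_X", frozenset({"GENE_A"})),
    Patient("P2", "TYPE_Y", frozenset({"GENE_A", "GENE_B"})),
    Patient("P3", "TYPE_X", frozenset()),
    Patient("P4", "TYPE_Y", frozenset({"GENE_B"})),
]
studies = [
    SubStudy("ARM_1", "GENE_A", frozenset({"TYPE_Y"})),
    SubStudy("ARM_2", "GENE_B"),
]
overall, by_study = match_rates(patients, studies)
print(overall, by_study)
```

Real trial criteria combine many genomic and clinical conditions, so the single-gene rule here stands in for a much richer eligibility expression.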
The authors mapped mutations to variant interpretations from OncoKB, an oncology knowledge base, to compute the frequency of clinically actionable alterations in the current GENIE dataset. The proportion of tumors with level 1 or 2 alterations (corresponding to biomarker-specific or standard-of-care therapies) rose to 17%, more than double the previous estimate from 2017. The frequency of level 3A alterations, which are linked to promising investigational therapies for a particular tumor type, dropped marginally to 4.7%.
Overall, the authors found that 38.3% of the cases harbored at least one therapeutically actionable alteration. The research team also determined the frequency of alterations associated with resistance to targeted therapies. They mapped alterations associated with disease context-specific therapeutic resistance from OncoKB and curated an additional list of alterations with growing evidence of clinical resistance from the Catalogue of Somatic Mutations in Cancer (COSMIC) database. They identified high proportions of resistance alterations in colorectal cancer and gastrointestinal stromal tumors.
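One way to picture the actionability tally is the sketch below. It assumes each tumor is counted once, under the most established level among its annotated alterations; the article does not spell out that rule, so treat it as an assumption. The `annotation` lookup and the `ALT_*` names are hypothetical toy data, not OncoKB content, and no OncoKB API is called.

```python
from collections import Counter

# Therapeutic evidence levels, ordered from most to least established.
LEVEL_ORDER = ["1", "2", "3A", "3B", "4"]
RANK = {level: i for i, level in enumerate(LEVEL_ORDER)}


def highest_level(alterations, annotation):
    """Return the most established level among a tumor's alterations, or None."""
    levels = [annotation[a] for a in alterations if a in annotation]
    return min(levels, key=RANK.__getitem__) if levels else None


def level_frequencies(tumors, annotation):
    """Return ({level: fraction of tumors}, fraction with any actionable alteration)."""
    if not tumors:
        raise ValueError("no tumors given")
    counts = Counter(highest_level(alts, annotation) for alts in tumors.values())
    n = len(tumors)
    by_level = {level: counts.get(level, 0) / n for level in LEVEL_ORDER}
    any_actionable = (n - counts.get(None, 0)) / n
    return by_level, any_actionable


# Hypothetical annotation table and tumors.
annotation = {"ALT_A": "1", "ALT_B": "3A", "ALT_C": "2"}
tumors = {
    "T1": ["ALT_A", "ALT_B"],  # counted under level 1, its highest level
    "T2": ["ALT_B"],
    "T3": ["ALT_X"],           # alteration with no annotation
    "T4": [],
}
by_level, any_actionable = level_frequencies(tumors, annotation)
print(by_level, any_actionable)
```

Counting each tumor only at its highest level keeps the per-level fractions from overlapping, so they sum to the overall actionable fraction.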
Next, the researchers conducted a mutational analysis of rare tumors, defined as those with fewer than 50 samples assigned to a terminal OncoTree classification node or to a set of child nodes sharing one ancestor. This identified 399 unique OncoTree codes spanning 32 tissue types, from 5,522 tumor samples representing 2% of the dataset.
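The selection of sparsely represented tumor types can be sketched as a simple count-and-filter. This is a simplification under stated assumptions: it counts samples per code only and does not group child nodes under a shared ancestor, as the study's definition also allows. The function names and the `CODE_*` labels are hypothetical; only the threshold of 50 comes from the text.

```python
from collections import Counter


def rare_codes(sample_codes, max_samples=50):
    """Return the classification codes assigned to fewer than max_samples samples."""
    counts = Counter(sample_codes.values())
    return {code for code, n in counts.items() if n < max_samples}


def rare_samples(sample_codes, max_samples=50):
    """Return the {sample_id: code} entries whose code is rare."""
    rare = rare_codes(sample_codes, max_samples)
    return {sid: code for sid, code in sample_codes.items() if code in rare}


# Hypothetical cohort: 60 samples of a common code, 3 of a rare one.
sample_codes = {f"S{i}": "CODE_COMMON" for i in range(60)}
sample_codes.update({f"R{i}": "CODE_RARE" for i in range(3)})

selected = rare_samples(sample_codes)
print(sorted(selected))
```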
They applied the 20/20+ algorithm, which identifies tumor suppressor genes and oncogenes. This approach identified around 171 putative driver genes associated with 29 cancer types. They also identified sets of driver mutations unique to subsets of rare tumors.
About 19% of the samples in this analysis had only non-driver mutations or no identified mutations. This suggests that roughly one in five patients might benefit from a more comprehensive approach, such as genome and transcriptome sequencing, which could reveal molecular features of tumors beyond those captured by current panels and fuel novel precision medicine approaches.
Conclusions
In summary, AACR Project GENIE represents a significant resource for linking cancer genotypes with treatment outcomes. The project’s growth has been driven by the increased participation of cancer centers in the United States, the United Kingdom, the Netherlands, Spain, and France. Although the repository predominantly contains data from targeted gene sequencing panels applied to solid tumor specimens, there are plans to expand the current approaches. These include immune profiling strategies, cell-free DNA sequencing, and genome and transcriptome sequencing.