The goal of cancer treatment is to match the right drug to the right target in the right patient. But before such "personalized" drugs can be developed, more knowledge is needed about specific genomic alterations in cancers and their sensitivity to potential therapeutic agents.
Now an academic-industry collaboration is releasing the first results from a new and freely available resource that marries deeply detailed cancer genome data with predictors of drug response, information that could lead to refinements in cancer clinical trials and future treatments. The Cancer Cell Line Encyclopedia (CCLE), authored by scientists at the Broad Institute, Dana-Farber Cancer Institute, the Genomics Institute of the Novartis Research Foundation, and the Novartis Institutes for Biomedical Research, is described in the March 29 issue of the journal Nature. In a proof of principle, the researchers also report that genomic predictors of drug sensitivity revealed three novel candidate biomarkers of response.
"We hope that the Cancer Cell Line Encyclopedia will be a preclinical resource that could guide clinical trials," said Levi A. Garraway, a senior associate member of the Broad Institute, an associate professor at Dana-Farber Cancer Institute and Harvard Medical School, and a co-corresponding author of the paper.
"The CCLE is a public resource that we think will catalyze discoveries throughout the cancer research community," said Todd Golub, director of the Broad's Cancer Program, Charles A. Dana Investigator in Human Cancer Genetics at the Dana-Farber Cancer Institute, and a co-author of the paper. "With this initial effort, we have taken some critical first steps. The challenge now is to greatly expand the number of compounds tested across the panel of cell lines."
The CCLE integrates gene expression, chromosomal copy number, and massively parallel sequencing data from almost 1,000 human cancer cell lines together with pharmacological profiles for 24 anticancer drugs across roughly half of these cell lines. The scale of the project allows greater depth of genetic characterization and pharmacological annotation than previously possible with fewer cell lines. A separate effort by scientists at Massachusetts General Hospital and the Sanger Institute appears in the same issue of Nature.
To accomplish such a feat, the team of scientists relied on the genetics, computational biology, and drug-screening capabilities at the Broad, Dana-Farber, and Novartis. They chose 947 of the nearly 1,200 commercially available cancer cell lines to reflect the genomic diversity of human cancers.
"One of the strengths of the CCLE lies in the number of cell lines it surveys," said Nicolas Stransky, a computational biologist in the Cancer Program at the Broad and a co-first author of the paper. "We can focus on rare cancer subtypes and still have sufficient statistical power for analyses."
Cancer cell lines are malignant cells that have been removed from tumor tissue and cultured in the laboratory. Under controlled conditions, they can grow indefinitely. This near-immortality is an advantage for performing repeated experiments, but it can be a pitfall if the cells differ markedly from tumors because they lack their typical surroundings. However, with relatively few exceptions, the CCLE cell lines proved to be representative genetic proxies for primary tumor subsets across multiple cancer types.
Correlating the more than 50,000 genetic and molecular features that emerged from these cell lines created a computational challenge that the scientists met by adapting algorithms to the biological data. They tested the resulting predictive models against genetic alterations known to predict sensitivity to cancer drugs, and confirmed the value of their systematic approach. Then they applied the predictive modeling methodology to genetic subtypes of cancer known to pose challenges for current treatment modalities.
For example, a variety of cancers have mutations in the NRAS gene, which activates signaling pathways important in tumor growth. Some NRAS-mutant cancers, including a subset of melanomas, may prove vulnerable to drugs that block a protein also involved in signaling, called MEK. The scope of the CCLE enabled the investigators to study approximately 40 cancer cell lines with this mutation to see if they could predict sensitivity to MEK inhibitor drugs, some of which are being studied in clinical trials.
One of the genetic features that rose to the top of their analysis was expression of the aryl hydrocarbon receptor (AHR) gene in cell lines that were highly sensitive to MEK inhibitors. This suggested that high levels of AHR may indicate higher sensitivity to MEK inhibitor drugs. Additional experiments suggested that some of these same cell lines might also depend on AHR activity, and that MEK inhibitors might simultaneously intercept AHR function in some instances.
Armed with this kind of knowledge from the CCLE, researchers may have a much clearer idea of which tumors are most likely to respond to particular drugs before using them in clinical trials, the scientists say. Patients could therefore be selected for such studies based on how likely they are to respond, given the genetic and molecular makeup of their cancers.
"Knowing that kind of information very early might help to improve the success rate of drug development, compared to a genetically 'agnostic' approach that includes any patient with advanced cancer without knowledge of a genetic profile," said Garraway.
The scientists also found new predictors of sensitivity to existing chemotherapy drugs in other cancer cell lines. Elevated levels of SLFN11 expression predicted sensitivity to topoisomerase inhibitors. Another analysis indicated that multiple myeloma may respond to IGF1 receptor inhibitors. Formal clinical studies will be required to learn whether these findings hold true in patients.
"We can ask questions not only about emerging targeted therapies, but also about standard chemotherapy drugs," Garraway said. "There may be ways to identify patients who are more likely to respond to conventional chemotherapy versus those who might not. The predicted 'non-responders' may be better off trying a different regimen."
There are more volumes to be written in this encyclopedia.
"From a computational biology perspective, it's a clean, complex data set that allows many more analyses," Stransky said. "We are only scratching the surface of what can be done."
In the CCLE's next phase, the team will add analyses based on deeper sequencing, profiles of metabolic activity, and epigenetic modifications (changes in chromatin organization).
"This is really the tip of the iceberg," Garraway said. "With these predictive modeling algorithms and with data sets of this size, their study could become an entire discipline in its own right."