Researchers have probed deep into the cell's genome, beyond the basic genetic code, to begin learning the "grammar" that helps determine whether or not a gene gets switched on to make the protein it encodes.
Their discovery -- that the ordering of specific DNA sequences in key regions of the genome affects the activity of genes -- might advance efforts to use gene and cell-based therapies to treat disease, said UCSF molecular biologist Nadav Ahituv, PhD, senior scientist on the study. The findings were published online in the journal Nature Genetics on July 28 and will appear in the September print edition.
In gene therapy, which is still experimental, specific genes are delivered to cells to make proteins that improve cellular physiology and fight disease. The new findings offer a way to activate these genes in specific tissues.
"Our work suggests a framework for the design of synthetic, tissue-specific DNA that could be used to control gene activation," said Ahituv, an associate professor in the UCSF School of Pharmacy.
An individual's genes are essentially the same in every cell. However, different combinations of genes are either silent or actively making protein in different cells. These patterns of gene activation make the lips differ from the liver, for instance, and determine whether the liver is functioning normally or not.
In their new study, Ahituv and colleagues made significant progress in understanding the integration of information and decision-making that goes on within the DNA regions that guide this gene activation.
The researchers determined that key bits of DNA, called "enhancers," which serve as a type of gene regulator, do not operate in an all-or-nothing manner to control whether or not genes are active. Instead, the researchers found that changes in the arrangements of specific DNA sequences within these enhancers result in changes in levels of gene activity, similar to the way changing the syntax of a sentence affects its meaning.
Enhancers, when bound by proteins called transcription factors, play a necessary role in activating specific genes that may be quite a distance away within the cell's chromosomes. The arrangement of DNA sequences in the enhancers determines the likelihood that matching transcription factors found in specific cell types will attach and cause the activation of genes, the scientists discovered.
The findings point to a strategy for designing DNA enhancers that might optimally guide gene activity in specific tissues targeted for gene therapy. Similar strategies might be used to help guide the development of cell therapies from stem cells for use in regenerative medicine to replace damaged tissue, according to Ahituv.
Like more than 98 percent of DNA in the human genome, enhancers lie outside genes, and are referred to as "non-coding." Mutations in enhancers already have been implicated in human limb malformations, deafness, skeletal abnormalities, other birth defects and cancer, Ahituv said. Additional enhancer mutations may prove to be responsible for many of the associations between DNA variations and diseases that have been identified in genome-spanning studies comparing people with and without specific diseases, he said.
Working with mice and with human liver-cancer cells grown in the lab, the researchers relied on a powerful new lab technique to perform what they describe as a "massively parallel experiment" exploring the roles that specific combinations of enhancers play in guiding gene activation.
They designed a diverse library of nearly 5,000 enhancers, consisting of transcription-factor binding sites from 12 known liver-specific transcription factors, and placed each into a DNA package that could be injected into a mouse's tail, move into its liver, and potentially be activated by transcription factors in the mouse's liver cells. With this technique they were able to measure the ability of each enhancer to interact with liver transcription factors to turn on genes.
A technology developed recently in the laboratory of co-author Jay Shendure, PhD, from the University of Washington, allowed the research team to rapidly obtain a unique read-out — like a genetic bar code — each time one of the enhancers was involved in gene activation.
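The idea of a bar-code read-out can be illustrated with a minimal sketch. It is not the researchers' actual analysis pipeline: the barcode sequences, enhancer names and reads below are hypothetical, and the code only assumes that each enhancer is tagged with a unique short sequence that shows up in the sequencing data whenever that enhancer drives gene activation.

```python
from collections import Counter

# Hypothetical mapping from a barcode sequence to the enhancer it tags.
# These sequences and names are illustrative, not taken from the study.
BARCODE_TO_ENHANCER = {
    "ACGTAC": "enhancer_1",
    "TTGCAA": "enhancer_2",
    "GGATCC": "enhancer_3",
}


def count_enhancer_activity(reads):
    """Tally how many sequenced reads carry each enhancer's barcode."""
    counts = Counter()
    for barcode in reads:
        enhancer = BARCODE_TO_ENHANCER.get(barcode)
        if enhancer is not None:  # skip barcodes that match no known enhancer
            counts[enhancer] += 1
    return counts


# Hypothetical reads: each entry is the barcode observed in one sequenced molecule.
reads = ["ACGTAC", "TTGCAA", "ACGTAC", "NNNNNN", "ACGTAC"]
activity = count_enhancer_activity(reads)
print(activity.most_common())
```

In this toy example, an enhancer whose barcode appears more often is read as having been more active, which is how thousands of enhancers could in principle be compared in a single experiment.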
Leila Taher, PhD, and Ivan Ovcharenko, PhD, of the National Center for Biotechnology Information, part of the National Library of Medicine, also contributed to the study by developing algorithms used to design the synthetic enhancers and to analyze the large amounts of data gathered.
The genetic code was cracked a half-century ago. It specifies how DNA's four nucleotide building blocks, the alphabet letters A, C, T, and G, encode protein. As cellular machinery reads through a gene's long DNA sequences, sequential three-letter combinations of these nucleotides, called codons, specify which amino acids will in turn be linked together to make the gene-encoded protein.
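The three-letter reading of a gene can be shown with a short sketch. The codon table here is deliberately partial, listing only five entries of the standard genetic code, and the input sequence is a made-up example rather than a real gene.

```python
# A few entries from the standard genetic code, written as DNA codons.
# The table is intentionally incomplete; it is here only for illustration.
CODON_TABLE = {
    "ATG": "Met",
    "GCC": "Ala",
    "AAA": "Lys",
    "TGG": "Trp",
    "TAA": "Stop",
}


def translate(dna):
    """Read a DNA coding sequence three letters at a time and list the amino acids."""
    protein = []
    usable_length = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE.get(codon, "?")  # "?" marks codons missing from the table
        if amino_acid == "Stop":
            break  # a stop codon ends the protein
        protein.append(amino_acid)
    return protein


# Made-up sequence: ATG GCC AAA TGG TAA
print(translate("ATGGCCAAATGGTAA"))
```

The enhancer "grammar" described in the study works differently: there is no comparably simple lookup table for it, which is part of why it has taken longer to decipher.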
But molecular biologists have been slower to unravel the mysteries of development, which unfolds through cell division and maturation driven by different patterns of gene activation, and slower to understand the role of DNA outside of genes.
Additional authors of the Nature Genetics study include postdoctoral fellows Robin Smith, PhD, and Fumitaka Inoue, PhD, and graduate student Mee Kim from UCSF; and Rupali Patwardhan, a graduate student from the University of Washington. The research was funded by the National Institutes of Health, including major funding from the National Human Genome Research Institute, and by the UCSF Liver Center.