A new Cornell University-led study finds that the reference genome for a widely researched worm, on which countless studies are based, is flawed. Now, a fresh genome sequence will set the record straight and improve the accuracy of future research.
When scientists study the genetics of an organism, they start with a standard genome sequenced from a single strain that serves as a baseline. It's like a chess board in a chess game: every board is fundamentally the same.
One model organism that scientists use in research is a worm called Caenorhabditis elegans. The worm - the first multicellular eukaryote (animal, plant or fungus) to have its genome sequenced - is easy to grow and has simple biology with no bones, heart or circulatory system. At the same time, it shares many genes and molecular pathways with humans, making it a go-to model for studying gene function, drug treatments, aging and human diseases such as cancer and diabetes.
Genetic studies of C. elegans were based on a single strain, called N2, which researchers have ordered for decades from the C. elegans stock center at the University of Minnesota. Though researchers tried to uphold a common standard, individual labs grew N2 strains on their own, and over time the strains drifted apart genetically.
"Over the last decade, with more advanced genetic experiments using high levels of DNA sequencing, scientists were alarmed to discover that there is no longer a single laboratory strain that everyone was using. Over 40 years there have arisen many different N2 strains; we can't rely on any one of them to do experiments."
Erich Schwarz, assistant research professor in the Department of Molecular Biology and Genetics
Schwarz is a senior author of a new study published in Genome Research that describes a single genetically clean strain, called VC2010, in which every individual is truly identical. Schwarz and colleagues from the University of Tokyo, Stanford University, the University of British Columbia and the University of Minnesota used cutting-edge techniques to sequence VC2010's genome and create a new standard.
As part of the study, the researchers compared VC2010 to the original N2 genome. They expected a near-perfect match, but got a surprise. "Along with the 100 million nucleotides we expected to see, we discovered an extra 2 million nucleotides, an extra two percent of the genome," Schwarz said, adding that the extra sequence was hidden in the original, likely because of the limitations of older technology.
Schwarz added that similar issues are likely occurring in the standard genomes of other organisms, including humans. "It shows us that having the true complete DNA of an animal is not as easy as we thought it was," he said.
Other labs have begun using modern sequencing tools to reassess other genomes, which has implications for synthetic biology, where scientists are creating life - such as bacteria - from scratch. "Having a really good DNA sequence is an important baseline," Schwarz said.