Computational biologists at Carnegie Mellon University have developed an analytical technique to detect the multiple genetic variations that contribute to complex disease syndromes such as diabetes, asthma and cancer, which are characterized by multiple clinical and molecular traits.
Rather than searching one at a time for genetic alterations that cause a particular symptom or trait, as in most conventional approaches, the Carnegie Mellon scientists use a statistical method that enables them to uncover genome variations underlying an entire regulatory network of genes or traits that are responsible for complex diseases.
Professor Eric P. Xing and postdoctoral scientist Seyoung Kim report today in PLoS Genetics, an online journal of the Public Library of Science, that their graph-guided fused lasso (GFlasso) method showed increased power in detecting gene variants associated with complex symptoms compared with other methods. In one test, GFlasso successfully detected a gene variant already implicated in severe asthma and identified two additional variants that had not previously been associated with the condition. More study of the two variants will be necessary to confirm the association, Xing and Kim said.
"We know that some of the most common and most serious diseases that plague humans are caused not by a single genetic mutation, but by a combination of many genetic and environmental factors," said Xing, an associate professor of machine learning, language technologies and computer science. "Complicating the situation is that most complex diseases have a large number of clinical traits such as various symptoms, body metrics and family history, and that genome-wide gene expression profiling can identify tens of thousands of molecular traits associated with the disease."
Typically, many of these traits are correlated. For example, high blood pressure and high body weight might share some common genetic causes. If every gene variation is tested against every trait one pair at a time, as in classical methods, the number of tests is enormous and information about the shared genetic causes of correlated traits goes unused, resulting in a loss of statistical power, Xing said. "So it's unlikely we can unravel the root causes of diseases such as cancer, diabetes and asthma one gene and one trait at a time," he said. "Rather, we need tools such as GFlasso so we can look for associations between networks of genes and clinical traits."
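To give a rough sense of the scale involved, the sketch below counts the tests in a one-pair-at-a-time scan and applies a Bonferroni correction, one standard way of adjusting for multiple testing. The figure of 500,000 variants is a hypothetical illustration, not a number from the study; the 50 traits echo the trait count cited for severe asthma. The function name is likewise made up for this example.

```python
def pairwise_test_burden(n_variants, n_traits, alpha=0.05):
    """Count variant-trait tests and the Bonferroni per-test threshold.

    Assumes every variant is tested against every trait independently,
    which is the one-pair-at-a-time setup described in the article.
    """
    n_tests = n_variants * n_traits
    per_test_threshold = alpha / n_tests  # Bonferroni correction
    return n_tests, per_test_threshold


# Hypothetical scan: 500,000 variants (assumed) against 50 traits.
n_tests, threshold = pairwise_test_burden(500_000, 50)
print(f"{n_tests:,} tests, per-test significance threshold {threshold:.1e}")
```

With so strict a threshold, a variant whose effect on any single trait is modest is unlikely to reach significance, which is the loss of power Xing describes.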
Severe asthma, for instance, is characterized by more than 50 clinical traits, some related to environment or activity levels, some to symptoms such as wheezing and tightness of the chest, and others to lung physiology. Some of these traits are highly correlated with each other, Xing and Kim noted in the PLoS Genetics article, which suggests a common genetic basis. Their technique takes advantage of these tightly correlated traits by analyzing them jointly. This approach also helps detect genetic variations that might otherwise be missed because they have relatively subtle effects on any individual trait but are important because they contribute to a number of correlated traits.
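One way to picture the joint analysis is as a regression of all traits on all variants with two penalties: a lasso term that keeps most effects at zero, and a fusion term that pulls a variant's effects on correlated traits toward each other. The sketch below only evaluates such an objective; it is an illustration under stated assumptions, not the authors' implementation, and it does not fit the model. The function names, the correlation cutoff of 0.5, the edge weight |r| and the simulated data are all assumptions made for this example.

```python
import numpy as np


def trait_graph(Y, threshold=0.5):
    """Link pairs of traits whose absolute correlation exceeds a cutoff.

    Y has shape (individuals, traits). Returns a list of (m, l, r) edges.
    The cutoff is an assumed tuning choice.
    """
    R = np.corrcoef(Y, rowvar=False)
    K = R.shape[0]
    return [(m, l, R[m, l])
            for m in range(K) for l in range(m + 1, K)
            if abs(R[m, l]) > threshold]


def gflasso_objective(X, Y, B, edges, lam, gamma):
    """Squared error + lasso penalty + graph-guided fusion penalty.

    X: (individuals, variants), Y: (individuals, traits),
    B: (variants, traits) matrix of effect sizes.
    """
    loss = np.sum((Y - X @ B) ** 2)
    sparsity = lam * np.sum(np.abs(B))
    # For each linked trait pair, penalize differences in a variant's
    # effects, flipping the sign for negatively correlated traits.
    fusion = gamma * sum(
        abs(r) * np.sum(np.abs(B[:, m] - np.sign(r) * B[:, l]))
        for m, l, r in edges
    )
    return loss + sparsity + fusion


# Simulated data (assumed): variant 4 affects traits 0 and 1; trait 2 is noise.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 20)).astype(float)
Y = rng.normal(scale=0.5, size=(200, 3))
Y[:, 0] += X[:, 4]
Y[:, 1] += X[:, 4]

edges = trait_graph(Y)
B_shared = np.zeros((20, 3))
B_shared[4, :2] = 1.0   # effect on both correlated traits
B_single = np.zeros((20, 3))
B_single[4, 0] = 1.0    # effect on only one of them

print("edges:", [(m, l, round(float(r), 2)) for m, l, r in edges])
print("shared:", gflasso_objective(X, Y, B_shared, edges, lam=1.0, gamma=1.0))
print("single:", gflasso_objective(X, Y, B_single, edges, lam=1.0, gamma=1.0))
```

In this setup the fusion term is zero when a variant has the same effect on two linked traits and grows as the effects diverge, so evidence spread thinly across correlated traits is pooled rather than judged one trait at a time.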
"This approach will provide a more comprehensive genetic and molecular view of complex diseases," Xing said, "so we can identify the genes that underlie disease processes, understand the role of genes in determining the severity of disease and develop improved methods for diagnosing disease."
Xing, a member of the Ray and Stephanie Lane Center for Computational Biology at Carnegie Mellon, is working with colleagues at the University of Pittsburgh School of Medicine and Harvard Medical School to use GFlasso to study severe asthma as part of an ongoing study sponsored by the National Institutes of Health.