New bioinformatics technique for systematically analyzing key regions in DNA that help control gene activity

Scientists at Lawrence Livermore National Laboratory (LLNL) and the Linnaeus Centre for Bioinformatics (LCB) at Uppsala University in Sweden have developed a new bioinformatics technique for systematically analyzing key regions in DNA that help control gene activity. The cooperative efforts were headed by Krzysztof Fidelis in the United States and by Jan Komorowski in Sweden.

Understanding the complex regulatory mechanisms that tell genes when to switch on and off is one of the toughest challenges facing researchers attempting to discover how life works. "Binding sites," or areas of DNA that interact with the proteins that help control gene expression, can lie a long distance along the DNA strand from the genes they influence. Recent research has also shown that gene expression can be controlled by several regulatory proteins working together at a combination of different binding sites.

(Regulatory proteins are known as "transcription factors"; transcription is the first step in the process by which the genetic information in DNA is decoded by the cell to manufacture proteins, the building blocks of life.)

"It's difficult to experimentally observe how transcription factors bind to DNA at a distance from a gene, or how regulation happens," said Fidelis, a computational biologist in Livermore's Biosciences Directorate. "But you can identify their binding sites in a promoter or regulatory region – there are usually a few of these for each gene. We wanted to see if we could somehow deduce how many transcription factors at a time, or combinations of factors, are coming together physically and how these combinations regulate genes."

"To accomplish this," Komorowski said, "we used a machine learning technique called rough sets to mathematically model general rules that could associate known binding sites and gene expression in yeast, which is one of the most widely studied organisms." From the analysis of gene activity under a variety of environmental conditions, the teams were able to develop a set of rules for predicting the location of binding site combinations based on limited binding site and gene expression data.

"We found that the same transcription factors, in slightly different combinations, could be responsible for the regulation of different genes," said Torgeir R. Hvidsten of the LCB. "Thus we now know that binding sites can be combined to allow a large number of expression outcomes using relatively few transcription factors."

Others collaborating in the project were Jerzy Tiuryn of the Faculty of Mathematics, Informatics, and Mechanics at Warsaw University in Poland; Bartosz Wilczynski of the Institute of Mathematics, Polish Academy of Sciences, and LLNL; and Andriy Kryshtafovych of LLNL. A report on the joint work appears in the June issue of the journal Genome Research.

The rough sets technique was developed by Zdzislaw Pawlak in Poland in the 1980s and is particularly well suited to building models from incomplete and uncertain data. It has been used in applications ranging from medical and financial data analysis to voice recognition and image processing. Applied to gene regulation, the approach was able to predict the location of regulatory sites for about one-third of the genes in the yeast genome – a success rate as good as or better than other current techniques.
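For readers unfamiliar with rough sets, the core idea can be illustrated with a minimal Python sketch. It is not the researchers' actual method or data: the gene names, the factors TF1 and TF2, and the expression labels are all invented for illustration. Genes with identical binding-site patterns are treated as indistinguishable, and the set of "up"-regulated genes is then bracketed by a lower approximation (genes whose pattern always goes with "up") and an upper approximation (genes whose pattern sometimes does).

```python
from collections import defaultdict

def approximations(rows, condition_attrs, decision_attr, target):
    """Return the rough-set (lower, upper) approximations of the objects
    whose decision attribute equals `target`."""
    # Objects with identical condition values are indiscernible.
    classes = defaultdict(set)
    for name, attrs in rows.items():
        key = tuple(attrs[a] for a in condition_attrs)
        classes[key].add(name)

    target_set = {n for n, attrs in rows.items() if attrs[decision_attr] == target}

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target_set:   # whole class is in the target: certain
            lower |= members
        if members & target_set:    # class overlaps the target: possible
            upper |= members
    return lower, upper

# Hypothetical toy table: 1 means a binding site for that factor is present.
genes = {
    "geneA": {"TF1": 1, "TF2": 1, "expr": "up"},
    "geneB": {"TF1": 1, "TF2": 1, "expr": "up"},
    "geneC": {"TF1": 1, "TF2": 0, "expr": "up"},
    "geneD": {"TF1": 1, "TF2": 0, "expr": "down"},
    "geneE": {"TF1": 0, "TF2": 1, "expr": "down"},
}

lower, upper = approximations(genes, ["TF1", "TF2"], "expr", "up")
print(sorted(lower))  # genes certainly "up" given their binding sites
print(sorted(upper))  # genes possibly "up" given their binding sites
```

In this toy table, the pattern "TF1 and TF2 sites both present" supports a certain rule for "up", while "TF1 site only" is ambiguous because geneC and geneD share it but differ in expression. That gap between the two approximations is how rough sets represent incomplete or conflicting data.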

"The next step is to test this approach on different organisms, including microbes and vertebrates," Fidelis said. The growing number of organisms whose genomes have been sequenced has generated a wealth of DNA sequence information that could provide the raw material for analysis.

http://www.llnl.gov/
