Dozens of new genes identified in X chromosome

By intensely and systematically comparing the human X chromosome to genetic information from chimpanzees, rats and mice, a team of scientists from the United States and India has uncovered dozens of new genes, many of which are located in regions of the chromosome already tied to disease.

Regions of the X chromosome, one of the two sex chromosomes (Y is the other), have been linked to mental retardation and numerous other disorders, but finding the particular genetic abnormalities involved has been difficult.

The team's accomplishment, described in the April issue of Nature Genetics, should speed research into diseases associated with the X chromosome and encourage similar analyses of other chromosomes.

"To our knowledge, this is the first time critical analysis of an entire chromosome has been done by a group that wasn't involved in determining the chromosome's genetic sequence," says study leader Akhilesh Pandey, M.D., Ph.D., an assistant professor in the McKusick-Nathans Institute of Genetic Medicine at Johns Hopkins and chief scientific adviser to the Institute of Bioinformatics (IOB) in Bangalore, India, where the analyses took place. "We didn't start small. We wanted to prove that complete annotation can be done, and done in a way that lets you find new and unexpected things."

For 18 months, 26 Indian scientists pored over the publicly available sequence of the X chromosome (information generated by the Wellcome Trust Sanger Institute in England and others) to identify genes and other important parts of its DNA.

But unlike other efforts, the team didn't just "mine the data" by using computers to search for known patterns in the genetic sequence. Instead, Pandey decided they would look for similarities between the human X chromosome's protein-encoding instructions and corresponding regions in the mouse. Regions that were identical or nearly so were then examined carefully by IOB biologists.

"We didn't want to start out by saying that genes had to look a certain way," says Pandey. "So our only initial assumption was that if a genetic region is important and codes for a protein, the sequence will be conserved at the protein level. Thus, even if the genetic sequence is different here and there, the protein sequence could still be the same."

Essentially, the researchers took advantage of the redundancy inherent in the genetic code. DNA's four building blocks -- A, T, C and G -- are read in sets of three, and each three-block set "codes" for just one of the 20 possible protein building blocks, or amino acids. Because there are more three-block sets than amino acids, several different sets can code for the same amino acid. For example, the DNA sequences TTGAGGAGC and CTACGATCA are quite different, but both specify the same three amino acids -- leucine, arginine and serine, in that order.
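The example in the preceding paragraph can be checked with a few lines of Python. This is only an illustrative sketch, not the team's software: it assumes the standard genetic code, and the names CODON_TABLE, translate and identity are made up for the illustration.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order.
# Each letter is a one-letter amino acid code; "*" marks a stop signal.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read the DNA three letters at a time and return the amino acid string."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def identity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


first = "TTGAGGAGC"
second = "CTACGATCA"

print(translate(first))    # LRS (leucine, arginine, serine)
print(translate(second))   # LRS
print(f"DNA identity:     {identity(first, second):.2f}")                        # 0.22
print(f"protein identity: {identity(translate(first), translate(second)):.2f}")  # 1.00
```

The two DNA strings agree at only two of nine positions, yet their translations are identical, which is the kind of signal a comparison at the protein level can pick up.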

"Instead of telling the computer what to look for, we let nature tell the computer what was important," says Pandey. "When you align the protein-encoding instructions of the human and mouse, the genes jump out at you."
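As a rough illustration of what comparing sequences at the protein level can involve, the sketch below translates two short DNA strings in each of the three forward reading frames and reports the pair of frames whose translations agree best. The article does not describe the team's actual pipeline, so everything here is an assumption: the toy sequences, the function names, and the simplification of using ungapped comparison on one strand only, where real tools would also handle gaps and the reverse strand.

```python
from itertools import product

# Standard genetic code in T, C, A, G order; "*" marks a stop signal.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate dna starting at offset `frame` (0, 1 or 2)."""
    seq = dna[frame:]
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def protein_identity(p, q):
    """Matching positions divided by the longer length (no gaps allowed)."""
    longest = max(len(p), len(q))
    if longest == 0:
        return 0.0
    return sum(x == y for x, y in zip(p, q)) / longest


def best_frame_match(seq_a, seq_b):
    """Try all 3 x 3 forward-frame pairs and return the best-scoring one."""
    candidates = []
    for frame_a in range(3):
        for frame_b in range(3):
            prot_a = translate(seq_a, frame_a)
            prot_b = translate(seq_b, frame_b)
            score = protein_identity(prot_a, prot_b)
            candidates.append((score, frame_a, frame_b, prot_a, prot_b))
    return max(candidates, key=lambda c: c[0])


# Hypothetical toy inputs, not real human or mouse sequence.
human_like = "GTTGAGGAGC"
mouse_like = "CTACGATCA"

score, frame_a, frame_b, prot_a, prot_b = best_frame_match(human_like, mouse_like)
print(score, frame_a, frame_b, prot_a, prot_b)  # 1.0 1 0 LRS LRS
```

No assumption about what a gene should look like goes into the search; the only signal is that two stretches of DNA translate to the same protein fragment in some reading frame.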

In the regions that were the same between species, the scientists found 43 new "gene structures" that encode proteins. Some of the newly identified genes sit in regions long tied to X-linked mental retardation syndromes, which appear only in boys, or other disorders. Quite remarkably, Pandey says, almost half of the new genes don't look like any previously known genes, nor do they look like each other.

"These would not be found any other way, because no one knew to look for them," he says. "No one had ever identified any aspect of their sequences as being important."

The IOB scientists and the U.S. members of the team experimentally investigated a few of the new genes to confirm the comparative approach's validity. Their results, as well as data created by other scientists since the U.S.-India team started working, confirm the existence of some of the newly identified genes. The team's work also showed that some so-called pseudogenes on the X chromosome are actually expressed, or transcribed, which contradicts the widespread idea that they are functionless.

"We're really trying to show that complete annotation of chromosomes can be done, and that doing it this way means you can find things you don't expect to find," says Pandey. "It's long, painstaking work, but it's worth it."

Pandey hopes that researchers will take the initiative to annotate sequenced genetic information and validate regions used in their work.
