Jul 4 2007
Large-scale undertakings such as the Human Genome Project have produced massive amounts of data.
To make sense of it all, powerful mathematical and statistical algorithms were developed, resulting in the interdisciplinary field called bioinformatics. By probing genome sequence data with in silico tools, biologists can infer the functions of specific genes, evaluate the evolutionary relatedness among species, and identify DNA variants that are predictive of disease. This valuable information can both complement and drive the more traditional laboratory-based experimental studies. This month's release of Cold Spring Harbor Protocols highlights two articles that discuss the principles and applications of cutting-edge bioinformatics software programs; both are freely available online (www.cshprotocols.org).
The first freely available article (http://www.cshprotocols.org/cgi/content/full/2007/14/pdb.top17) provides guidance on how to use the BLAST (Basic Local Alignment Search Tool) program. BLAST is the tool most commonly used to search large databases for DNA and protein sequences that are similar to a query sequence. The article, intended to be a user's guide to BLAST, includes a general overview of its algorithmic basis as well as descriptions of the various BLAST programs and their appropriate applications. It will be useful to a wide range of biologists seeking to better understand BLAST and apply it to their systems.
The second freely available article describes a computational pipeline that has been optimized to identify DNA variants called SNPs (single-nucleotide polymorphisms) in sequence data from corn. The sequences, generated with cutting-edge technology from 454 Life Sciences, are aligned and anchored to the corn genome using BLAST and cross_match (another computational tool). Then, a third program called PolyBayes is used to search for SNPs in the aligned sequences. The article describing this SNP-discovery pipeline is available at http://www.cshprotocols.org/cgi/content/full/2007/14/pdb.prot4786.
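To make the idea of searching aligned sequences for SNPs concrete, the following is a minimal illustrative sketch in Python. It is not PolyBayes and not the pipeline described in the article: it simply counts mismatches column by column in reads that are assumed to be already aligned to a reference, whereas PolyBayes uses a Bayesian statistical model. The function name candidate_snps, the thresholds min_depth and min_alt_fraction, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def candidate_snps(reference, aligned_reads, min_depth=3, min_alt_fraction=0.3):
    """Naively flag alignment columns where reads disagree with the reference.

    Assumes every read is already aligned to the reference, column by column,
    with '-' marking a gap or an uncovered position. Thresholds are arbitrary.
    Returns a list of (position, ref_base, alt_base, alt_count, depth).
    """
    candidates = []
    for pos, ref_base in enumerate(reference):
        # Bases observed at this column, ignoring gaps and uncovered positions.
        column = [
            read[pos]
            for read in aligned_reads
            if pos < len(read) and read[pos] != "-"
        ]
        depth = len(column)
        if depth < min_depth:
            continue  # too few reads to say anything about this position

        # Count only the bases that differ from the reference.
        alt_counts = Counter(base for base in column if base != ref_base)
        if not alt_counts:
            continue

        alt_base, alt_count = alt_counts.most_common(1)[0]
        if alt_count / depth >= min_alt_fraction:
            candidates.append((pos, ref_base, alt_base, alt_count, depth))
    return candidates


# Toy data (hypothetical): two of four reads carry an A where the reference has T.
reference = "ACGTACGT"
reads = [
    "ACGTACGT",
    "ACGAACGT",
    "ACGAACGT",
    "--GTACGT",
]
print(candidate_snps(reference, reads))
```

A simple count like this cannot distinguish true polymorphisms from sequencing errors or misaligned paralogous sequences, which is one reason a real pipeline relies on a dedicated statistical tool for this step.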