DNAnexus, Inc., a leading provider of cloud-based solutions for analyzing, visualizing, and managing next-generation sequencing data, today announced the availability of a comprehensive set of new informatics tools that enable life science researchers to efficiently analyze and manage large-scale genomic variation datasets in a cloud-based workflow. DNAnexus will discuss results from a study using this solution at the 12th annual Advances in Genome Biology and Technology (AGBT) meeting in Marco Island, Florida.
Resequencing for variant detection is a key application enabled by low-cost next-generation sequencing. With an estimated 5 million variants in the human genome, analyzing these complex datasets requires significant computational infrastructure and staff time to manually validate and correlate findings. To streamline this process, researchers need an efficient way to narrow these data down to a manageable size for analysis, ideally to fewer than a few hundred variants.
By leveraging a unique cloud-based approach, DNAnexus provides a comprehensive variation identification workflow that joins a scalable computational infrastructure with an integrated suite of sophisticated filtering and analysis technologies. Together, they simplify the variation analysis workflow and enable life scientists to better manage and interrogate these complex data from any web browser.
"Datasets associated with genomic variation are often huge and cumbersome, making it difficult to identify the variations that have a real and measurable impact and help explain, for example, a disease or drug response," said Andreas Sundquist, Co-Founder and President of DNAnexus. "As with all our solutions, the variation workflow provides researchers and sequencing centers instant access to a virtually limitless computing infrastructure as well as a suite of sophisticated visualization tools that allow them to quickly home in on the most relevant variants. Whether they are working with a specific gene, a coding region, or entire chromosomes, the result is less time spent analyzing the data, quicker insights and faster decision making."
The complete DNAnexus Variation Identification workflow combines a new population allele frequency analysis application with a nucleotide-level variation analysis to enable the rapid identification of alleles and their frequencies across different populations. A flexible query tool provides further filtering capabilities to streamline the identification of biologically interesting and relevant variants. "Gene Info" pages provide an overview of each gene as well as integrated links to third-party data sources for further investigation, validation, and disease impact analysis.
To further support the validation of variant calls and aid researchers in understanding disease impact, the following third-party data sources have been integrated into the DNAnexus Variation Identification workflow: AmiGO, BIOBASE Knowledge Library, Catalogue of Somatic Mutations in Cancer (COSMIC), dbSNP, Entrez Gene, GeneCards®, Ingenuity's Knowledge Base and IPA®, Kyoto Encyclopedia of Genes and Genomes (KEGG), NextBio, Online Mendelian Inheritance in Man (OMIM®), PharmGKB, and PubMed.
During the poster session at AGBT on Thursday, February 3rd, DNAnexus will discuss the Variation Identification solution in the context of results generated from a population analysis study of next-generation sequencing data derived from the 1000 Genomes Project. The full poster can be viewed at: https://dnanexus.com/news/Poster_1000Genomes_CHB_CHD_Jan2011.pdf.