A team at the Keck School of Medicine of USC, in collaboration with The Michael J. Fox Foundation for Parkinson's Research (MJFF), has just added a crucial new element--RNA sequencing data--to its robust study of Parkinson's disease.
RNA sequencing involves analyzing blood samples to understand how genetic scripts are expressed in various biological processes, including disease states. By looking at changes in RNA over time in patients who develop Parkinson's, researchers hope to identify some of the earliest hallmarks of the disease that may appear even before symptoms arise.
The massive new dataset is part of the Parkinson's Progression Markers Initiative (PPMI), the MJFF's signature, big-data study of Parkinson's disease that is collecting clinical, biological and imaging data from 1,400 individuals over at least five years. The RNA sequencing data were analyzed within USC's Institute of Translational Genomics and are being stored and shared by the state-of-the-art data center at the Laboratory of Neuro Imaging, a part of USC's Mark and Mary Stevens Neuroimaging and Informatics Institute.
"The incredibly rich clinical, imaging and biological data collected for the initiative have already helped researchers better understand and characterize Parkinson's disease. The newly added transcriptomic data adds an additional layer of diversity, enabling the identification of relationships between variables that would otherwise be impossible."
Arthur W. Toga, director of USC's Mark and Mary Stevens Neuroimaging and Informatics Institute
The effort is part of an important shift--both in Parkinson's research and more broadly in the study of neurological diseases--toward the direct coupling of data analysis tools with data archives. Until recently, researchers who wished to access existing datasets were required to download massive files and painstakingly organize, harmonize and analyze raw data.
Now, the team has begun co-locating visualization, analytic and data storage technologies so that researchers can instantly conduct preliminary analyses online without downloading any files, while those who wish to perform more in-depth analyses may still access raw data.
PPMI researchers adhere to strict protocols for obtaining and storing data. This robust approach renders the data highly reliable, reducing inconsistencies or potential sources of bias.
"PPMI has built the most robust Parkinson's dataset to date, collecting clinical, imaging and biological information from volunteers over at least five years to better understand disease onset and progression," said Todd Sherer, CEO of the MJFF. "The PPMI RNA Sequencing Project significantly increases the study's value and moves us closer to its goals to better define, measure and treat Parkinson's disease."
David W. Craig, co-director of the Institute of Translational Genomics at the Keck School, and Kendall Van Keuren-Jensen, of the Translational Genomics Research Institute, an affiliate of City of Hope, hope the new dataset can help researchers better understand the disease's progression and even develop targeted therapies.
"A neurological or neurodegenerative disease may exist for 10 to 15 years before we see changes in the brain through imaging," said Craig, whose team is focused on translating genomics technology from bench to bedside. "If we find an indicator in the blood, that can lead to earlier diagnosis and more effective therapeutic interventions."
USC's Mark and Mary Stevens Neuroimaging and Informatics Institute hosts the vast collection of data--the RNA sequencing project alone amounts to 108 terabytes, the equivalent of 47,520,000,000 single-spaced typed pages--along with the open-access portal that combines storage, analytic and visualization functionalities.
The institute, which currently stores data from nearly 50,000 subjects in 125 different studies, has also pioneered secure methods for processing, sharing and visualizing large datasets. In addition to hosting PPMI data, the institute's archive is home to the Alzheimer's Disease Neuroimaging Initiative, a global collaboration to define the progression of Alzheimer's disease.
"Now that these data have been collected, our goal is to ensure that they continue to fuel new discoveries," Toga says. "That's what drives our institute to keep finding new and innovative ways to connect researchers with large and increasingly diverse datasets."