To discover new treatments for genetic disorders, scientists need a thorough knowledge of prior literature to determine the best gene/protein targets and the most promising drugs to test. However, biomedical literature is growing at an explosive rate and often contains conflicting information, making it increasingly time-consuming for researchers to conduct a complete and thorough review.
To address this challenge, Cole Deisseroth, a graduate student in the M.D./Ph.D. program mentored by Drs. Huda Zoghbi and Zhandong Liu at the Jan and Dan Duncan Neurological Research Institute (Duncan NRI) at Texas Children's Hospital and Baylor College of Medicine, led a study to develop a natural language processing (NLP) tool called PARsing ModifiErS via Article aNnotations (PARMESAN). The tool searches for up-to-date information, assembles it into a central knowledge base, and predicts drugs that are likely to correct specific protein imbalances. A description of the tool and its capabilities was published recently in the American Journal of Human Genetics.
"PARMESAN offers a wonderful opportunity for scientists to speed up the pace of their research and thus, accelerate drug discovery and development," said Dr. Huda Zoghbi, a Howard Hughes Medical Institute investigator who is also the founding director of the Duncan NRI and a distinguished service professor at Baylor College of Medicine.
This artificial intelligence (AI)-powered tool scans public biomedical literature databases (PubMed and PubMed Central) to identify and rank descriptions of gene-gene and drug-gene regulatory relationships. What sets PARMESAN apart is its ability to leverage this curated information to predict undiscovered relationships.
"The unique feature of PARMESAN is that it not only identifies existing gene-gene or drug-gene interactions based on the available literature but also predicts putative novel drug-gene relationships by assigning an evidence-based score to each prediction," said Dr. Zhandong Liu, Chief of Computation Sciences at Texas Children's Hospital and associate professor at Baylor College of Medicine.
PARMESAN's AI algorithms analyze studies that describe the contributions of various players involved in a multistep genetic pathway. Then it assigns a weighted numerical score to each reported interaction. Interactions that are consistently and frequently reported in the literature receive higher scores, whereas interactions that are either weakly supported or appear to be contradicted between different studies are assigned lower scores.
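The scoring idea described above can be illustrated with a small Python sketch. This is not PARMESAN's actual formula, which is not given here; the function name `consensus_score`, the "up"/"down" labels, and the choice of multiplying agreement by a logarithm of the report count are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def consensus_score(reports):
    """Toy evidence score for one reported interaction (hypothetical).

    `reports` is a list of directions claimed by individual articles,
    each either "up" (upregulates) or "down" (downregulates).
    """
    counts = Counter(reports)
    total = counts["up"] + counts["down"]
    if total == 0:
        return 0.0
    # Agreement is 1.0 when every article agrees and 0.0 on an even split.
    agreement = abs(counts["up"] - counts["down"]) / total
    # log1p rewards frequently reported interactions with diminishing returns.
    return agreement * math.log1p(total)


# Illustrative inputs only; these are not real literature counts.
well_supported = consensus_score(["up"] * 10)
weakly_supported = consensus_score(["up"] * 2)
contradicted = consensus_score(["up"] * 6 + ["down"] * 4)
```

Under these assumptions, an interaction reported consistently by many articles outranks one reported only a few times, and both outrank an interaction on which the articles disagree.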
PARMESAN currently provides predictions for more than 18,000 target genes, and benchmarking studies have suggested that the highest-scoring predictions are over 95% accurate.
"By pinpointing the most promising gene and drug interactions, this tool will allow researchers to identify the most promising drugs at a faster rate and with greater accuracy," Cole Deisseroth said.