The development of a machine-learned scoring function of human preference in the context of early drug discovery campaigns

By Pooja Toshniwal Paharia. Reviewed by Danielle Ellis, B.Sc. Nov 1 2023

In a recent study published in Nature Communications, researchers developed a machine-learned scoring function for early drug discovery campaigns that could be used for compound prioritization, motif rationalization, and biased drug design.

*Study: Extracting medicinal chemistry intuition via preference machine learning. Image Credit: Krisana Antharith/Shutterstock.com*

In drug development campaigns, lead optimization is a time-consuming process in which several chemists work together to reach targeted molecular property profiles. Over time, chemists build experience in tasks such as compound prioritization, which lets them make more efficient judgments. Researchers have explored rule-based techniques and basic cheminformatics desirability rankings, but capturing the subtleties of this intuition has proven difficult. Medicinal chemistry, like any human enterprise, is also subject to subjective biases.

About the study

In the present study, researchers investigated the feasibility of turning medicinal chemists' knowledge into machine-learning models for lead optimization and other drug discovery pipeline choices.

By studying pairs of compounds, the researchers created a machine-learning model that could learn from the preferences of 35 medicinal chemists. The study used a pairwise learning-to-rank design: participants were shown two molecules at a time and given a simple prompt to select the one they preferred.
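To make the pairwise learning-to-rank idea concrete, here is a minimal sketch of a Bradley-Terry-style preference model. It is not the authors' implementation: the linear score, the two toy features, the training settings, and the convention that a higher score means "preferred" are all assumptions chosen for illustration.

```python
import math
import random

def score(weights, features):
    """Latent score: here simply a linear function of a feature vector."""
    return sum(w * x for w, x in zip(weights, features))

def train_pairwise(pairs, n_features, epochs=200, lr=0.1):
    """Fit weights so that P(a preferred over b) = sigmoid(score(a) - score(b)).

    pairs: list of (features_a, features_b, label), label is 1 if a was chosen.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for fa, fb, label in pairs:
            diff = score(weights, fa) - score(weights, fb)
            p = 1.0 / (1.0 + math.exp(-diff))
            # Stochastic gradient step on the logistic (cross-entropy) loss.
            for i in range(n_features):
                weights[i] += lr * (label - p) * (fa[i] - fb[i])
    return weights

# Hypothetical toy data: the "annotator" always picks the molecule
# with the larger value of feature 0; feature 1 is irrelevant noise.
random.seed(0)
pairs = []
for _ in range(200):
    fa = [random.random(), random.random()]
    fb = [random.random(), random.random()]
    pairs.append((fa, fb, 1 if fa[0] > fb[0] else 0))

weights = train_pairwise(pairs, n_features=2)
```

Because only score differences enter the loss, the model learns a single per-molecule score from purely relative judgments, which is what lets the score later be applied to molecules that were never shown in a pair.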

The study ran in several rounds, including two preliminary rounds with 220 molecular pairs and a production run with nearly 5,000 responses. Inter-rater agreement (i.e., the degree to which one chemist's selections match those of their peers) was tested using 200 distinct molecular pairs, serving as an intuitive indication of whether a machine-learning model could learn a signal from the data.
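One common way to quantify agreement between two annotators is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The article does not say which statistic the study used, so the sketch below is only an illustration, and the two lists of choices are made-up data.

```python
def cohen_kappa(choices_a, choices_b):
    """Chance-corrected agreement between two annotators on the same items."""
    n = len(choices_a)
    observed = sum(a == b for a, b in zip(choices_a, choices_b)) / n
    labels = set(choices_a) | set(choices_b)
    # Agreement expected if both annotators chose independently
    # at their own observed label frequencies.
    expected = sum(
        (choices_a.count(label) / n) * (choices_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Hypothetical choices ("L" = left molecule, "R" = right molecule) on 8 pairs.
chemist_1 = ["L", "R", "L", "L", "R", "R", "L", "R"]
chemist_2 = ["L", "R", "R", "L", "R", "L", "L", "R"]
kappa = cohen_kappa(chemist_1, chemist_2)
```

In this toy case the chemists agree on 6 of 8 pairs (0.75) against a chance level of 0.5, giving a kappa of 0.5, a value usually read as moderate agreement.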

Furthermore, the researchers investigated whether a molecule's position on the screen (left or right) during annotation biased selection. The model was trained on compounds retrieved from the ChEMBL database, selected on molecular weight (200 to 1,000 g/mol) and drug-likeness (QED), with up to two rule-of-five violations permitted.
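A position-bias check of this kind can be framed as a binomial test: if screen position has no influence, the left molecule should be chosen about half the time. The sketch below is one possible way to do this, not the study's procedure, and the counts are hypothetical.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under the null success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-12)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))

# Hypothetical counts: the left-hand molecule was picked in 520 of 1,000 pairs.
left_picks, total = 520, 1000
p_value = binomial_two_sided_p(left_picks, total)
```

With these made-up counts the p-value is roughly 0.2, so a 52/48 split over 1,000 pairs would not on its own be evidence of a left or right bias.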

The compounds were standardized by removing salts, normalizing tautomers, and neutralizing charged atoms before being used in the preference learning problem. For the second preliminary round and the subsequent production rounds, the Novartis Institutes for BioMedical Research (NIBR) substructure filters were applied, resulting in a pool of 1,831,052 molecules. Fragment analysis across compounds was used to rationalize what the model had learned.

After each labeled batch of 1,000 data points, the model's predictive performance was evaluated using the area under the receiver operating characteristic curve (AUROC) with randomized fivefold cross-validation.
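The evaluation described above can be sketched in a few lines. The AUROC is computed here from its rank interpretation (the probability that a positive example is scored above a negative one), and the folds are assigned at random. The `fit` and `predict` callables, the scalar toy features, and the noise level are hypothetical placeholders rather than the study's model.

```python
import random

def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def five_fold_indices(n, seed=0):
    """Shuffle the indices 0..n-1 and deal them into five folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::5] for i in range(5)]

def cross_validated_auroc(pairs, fit, predict, seed=0):
    """Mean held-out AUROC over five random folds."""
    results = []
    for held_out in five_fold_indices(len(pairs), seed):
        held = set(held_out)
        train = [p for i, p in enumerate(pairs) if i not in held]
        test = [pairs[i] for i in held_out]
        model = fit(train)
        labels = [label for _, _, label in test]
        scores = [predict(model, a, b) for a, b, _ in test]
        results.append(auroc(labels, scores))
    return sum(results) / len(results)

# Hypothetical toy data: each molecule is one number, and the noisy
# "annotator" tends to choose the molecule with the larger value.
rng = random.Random(1)
pairs = []
for _ in range(200):
    a, b = rng.random(), rng.random()
    pairs.append((a, b, 1 if a + rng.gauss(0, 0.2) > b else 0))

# Placeholder model: nothing is fitted, the prediction is just a - b.
mean_auc = cross_validated_auroc(
    pairs,
    fit=lambda train: None,
    predict=lambda model, a, b: a - b,
)
```

An AUROC of 0.5 corresponds to random guessing and 1.0 to perfect ranking of the held-out pairs, which is the scale on which the reported values should be read.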

A strategy similar to the one published in the original QED study was used to assess whether the learned scores could help deprioritize undesired compounds. The researchers also generated 500 molecules by maximizing and by minimizing the learned scoring function, using a pre-trained SMILES-based long short-term memory (LSTM) generative model and a hill-climbing optimization approach. The overall approach aims to overcome the cognitive bias limitations of prior research and make machine learning models more useful in the pharmaceutical industry.
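In generative modeling, hill climbing usually means repeating three steps: sample candidates from the generator, keep the best-scoring ones, and update the generator on those. The toy sketch below follows that loop but replaces the LSTM with a simple token-frequency model over a three-letter alphabet, and uses a made-up scoring function, so it illustrates the optimization loop only and not the study's setup.

```python
import random

ALPHABET = "CNO"  # stand-in token set, not real SMILES handling

def sample(probs, length, rng):
    """Draw one string with independent tokens (toy stand-in for an LSTM)."""
    return "".join(rng.choices(ALPHABET, weights=probs, k=length))

def refit(strings, smoothing=1.0):
    """Re-estimate token probabilities from the retained strings."""
    counts = [smoothing + sum(s.count(ch) for s in strings) for ch in ALPHABET]
    total = sum(counts)
    return [c / total for c in counts]

def hill_climb(score_fn, rounds=10, batch=100, keep=10, length=8, seed=0):
    """Sample, keep the top scorers, update the generator, repeat."""
    rng = random.Random(seed)
    probs = [1 / len(ALPHABET)] * len(ALPHABET)
    for _ in range(rounds):
        candidates = [sample(probs, length, rng) for _ in range(batch)]
        top = sorted(candidates, key=score_fn, reverse=True)[:keep]
        probs = refit(top)
    return probs

def toy_score(s):
    """Hypothetical scoring function: count of 'N' tokens."""
    return s.count("N")

final_probs = hill_climb(toy_score)
```

To minimize a score instead of maximizing it, the same loop can be run with the negated scoring function, for example `hill_climb(lambda s: -toy_score(s))`.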

Results

The data revealed moderate agreement between the chemists' choices in the early rounds. Cross-validation showed that performance in correctly classifying pairs improved steadily as more data became available, with AUROC values of 0.6 and 0.74 at the 1,000 and 5,000 available pair thresholds, respectively.

The study used implicit scoring to build a novel strategy for predicting drug-likeness in drug design. The technique was more accurate than the commonly used QED measure, created from internal comments over years of experience.

The model was able to learn medicinal chemists' preferences, and its scores were related to properties such as drug-likeness, fingerprint density, and the proportion of allylic oxidation sites. QED was the most strongly correlated descriptor, followed by fingerprint density, allylic oxidation sites, atomic contributions to van der Waals surface area, and Hall-Kier kappa values.
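A ranking of descriptors like the one above can be produced by correlating each descriptor with the learned scores and sorting by the absolute correlation. The sketch below uses Pearson correlation, which is an assumption since the article does not name the statistic, and the descriptor names and values are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical learned scores for five molecules and two made-up descriptors.
learned_scores = [0.1, 0.4, 0.5, 0.9, 1.2]
descriptors = {
    "descriptor_a": [1.0, 2.0, 2.5, 4.0, 5.0],  # closely tracks the score
    "descriptor_b": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related
}

# Rank descriptors by the strength (absolute value) of their correlation.
ranked = sorted(
    descriptors,
    key=lambda name: abs(pearson(learned_scores, descriptors[name])),
    reverse=True,
)
```

Sorting on the absolute value keeps strongly negative associations near the top of the list as well, which matters when a descriptor is informative but inversely related to the score.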

Across the different fingerprint density descriptors, the learned score was associated with feature-rich compounds, indicating that the chemists favored molecules with a greater number of features.

However, there was a weak positive correlation with the score measure, suggesting that the proposed score preferred synthetically simpler molecules. The SMR_VSA3 descriptor, which measures molecular surface area binned by Wildman-Crippen molar refractivity (MR) values, was modestly negatively correlated, showing that chemists favored compounds with neutral nitrogen atoms.

For the Food and Drug Administration (FDA)-approved drug and GDB collections, the filtering method yielded 732 and 8,616 compounds, respectively. The distribution of learned scores clearly separated the GDB set from the sets that better represent drug-like space (i.e., DrugBank FDA-approved drugs and ChEMBL).

In contrast, QED scores did not clearly distinguish the three sets. Common medicinal chemistry motifs such as pyrazines, pyrimidines, sulfones, imidazoles, oxadiazoles, phenyls, and bicyclic heteroaromatics were among the best-ranked. Maximizing the scoring function produced compounds with long flexible chains, conjugated double bonds, unusual groups, reactive moieties, or many alcohols and carboxylates.

Minimizing the scoring function, on the other hand, produced compounds with a good mixture of aliphatic sp3 carbons and aromatic rings, suitably sized fragments, and functional groups characteristic of drug-like compounds. The high quality of the generated compounds suggested that the scoring function is highly relevant for de novo drug design.

Conclusion

Overall, the study findings showed that the latent-score machine-learning model could capture medicinal chemists' knowledge, delivering additional information on in silico ligand-based properties and fragment definitions. This method could be used in routine cheminformatics tasks such as deprioritizing undesirable molecules that rule-based techniques miss, or biasing molecular design.

Journal reference:

Oh-Hyeon Choung, Riccardo Vianello, Marwin Segler, Nikolaus Stiefl, and José Jiménez-Luna. Extracting medicinal chemistry intuition via preference machine learning. Nature Communications 14:6651 (2023). doi: https://doi.org/10.1038/s41467-023-42242-1

Written by

Pooja Toshniwal Paharia

Pooja Toshniwal Paharia is an oral and maxillofacial physician and radiologist based in Pune, India. Her academic background is in Oral Medicine and Radiology. She has extensive experience in research and evidence-based clinical-radiological diagnosis and management of oral lesions and conditions and associated maxillofacial disorders.
