AI in drug discovery: Overcoming the scoring barrier

19 Mar 2025

How advances in machine learning and AI are shaping the future of drug development by solving the scoring challenge.

by Dr. Neil Taylor

AI in drug discovery has revolutionised the development of new treatments, making the process faster, smarter, and more efficient. As the technology evolves, improving the accuracy of drug candidate scoring is the next step towards even greater breakthroughs. Despite the incredible advances in Structure-Based Drug Discovery (SBDD) over the past five years, with vast increases in available protein structure data, the methods used to score this data have not kept pace. The result? Scoring functions have seen no comparable breakthrough.

With over 30 years in the drug discovery business, I’ve tracked the potential of machine learning and AI in drug development. The advent of technologies like DeepMind’s AlphaFold2 has been revolutionary, but one critical piece remains elusive: reliable and accurate scoring methods. The challenge lies in the complexity of drug discovery using AI, particularly when it comes to assessing molecular interactions in a vast chemical space.

The challenges in AI drug discovery scoring

AI-enabled drug discovery and development promise to revolutionise how we identify promising drug candidates, but there are several obstacles to overcome. The key issues in scoring, which continue to hinder progress in AI-driven drug discovery, include:

  1. The accuracy-speed tradeoff: More accurate physics-based methods, like quantum mechanics, are computationally expensive. Meanwhile, faster empirical scoring functions may sacrifice accuracy, often missing crucial interactions. Balancing these trade-offs remains one of the key hurdles in drug discovery machine learning applications.
  2. The protein environment: Traditional scoring functions tend to treat proteins as relatively rigid structures. In reality, proteins are dynamic systems with conformational flexibility. Ignoring this flexibility can lead to missed interactions or false positives, especially when protein movement plays a critical role in binding. This is one area where AI drug discovery companies are working on improvements.
  3. Water molecules: Water molecules play an essential role in molecular recognition, yet they are often oversimplified in scoring functions. Explicit water molecules are costly to simulate, while implicit solvent models may miss critical water-mediated interactions, particularly in binding pockets where water networks are essential. AI drug development companies are investigating ways to improve water modelling in scoring systems.
  4. Entropy: Most scoring functions focus on enthalpic contributions to binding while neglecting entropic effects, such as conformational flexibility and water displacement. AI drug discovery companies are beginning to explore ways to better model entropy and its influence on binding, which could enhance the reliability of scoring functions.
  5. Parameterisation bias: Scoring functions are often trained on specific datasets of protein-ligand complexes, leading to bias in their predictions. This can result in scoring methods that perform well on systems similar to their training data but fail to generalise to novel chemical scaffolds or unexpected binding modes.
  6. Context-dependence: The energetic consequences of molecular interactions can vary depending on the environment. Unfortunately, most scoring functions rely on fixed parameters that do not account for this context-sensitivity. This limitation is something AI-enabled drug discovery companies are actively working to address.
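
To make the fixed-parameter and entropy points concrete, here is a minimal sketch of the kind of fast empirical scoring function described above: a weighted sum of interaction counts, with a crude rotatable-bond penalty standing in for entropy. The `Pose` class, the term names, and every weight are illustrative assumptions, not values from any published scoring function.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """Hypothetical summary of one docked protein-ligand pose."""
    hydrogen_bonds: int        # protein-ligand hydrogen bonds
    hydrophobic_contacts: int  # apolar atom pairs in contact
    rotatable_bonds: int       # ligand rotatable bonds frozen on binding


# Illustrative weights only (arbitrary units, more negative = better).
# They are fixed: a hydrogen bond scores the same whether it is buried
# or solvent-exposed, which is the context-dependence problem.
WEIGHTS = {
    "hydrogen_bonds": -1.0,
    "hydrophobic_contacts": -0.25,
    "rotatable_bonds": 0.5,  # crude stand-in for conformational entropy loss
}


def empirical_score(pose: Pose) -> float:
    """Return the weighted sum of the pose's interaction terms."""
    return sum(weight * getattr(pose, term) for term, weight in WEIGHTS.items())


def rank_poses(poses: dict[str, Pose]) -> list[tuple[str, float]]:
    """Rank named poses from best (lowest score) to worst."""
    scored = [(name, empirical_score(pose)) for name, pose in poses.items()]
    return sorted(scored, key=lambda item: item[1])
```

A function like this is cheap enough to run over millions of poses, but it has no term for explicit water and it reduces entropy to a single count, which is exactly where the limitations listed above come from.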

Advancing AI in drug discovery: The path forward

As we look to move past these challenges, it’s clear that a radical shift in thinking is required. The immense datasets now available through AI-driven drug discovery – spanning millions of data points – must be sifted through efficiently to identify the most promising drug candidates. Ranking hits based on machine learning predictions and affinity scoring is more crucial than ever.

Here are some key questions we need to address in the next phase of AI in drug discovery and development:

  1. The role of local networks: How can a network-based approach to protein-ligand binding enhance the accuracy of scoring methods while maintaining computational efficiency? Could this help overcome current limitations in drug discovery machine learning techniques?
  2. The importance of water molecules: Have we underestimated the role that water plays in optimising protein-ligand binding? AI drug discovery companies must explore new ways to model water interactions more precisely.
  3. Modelling entropy: How can we accurately capture the effects of entropy in drug discovery AI models? Understanding the importance of entropy will be key in refining scoring functions.
  4. Identifying edge case patterns: Are there patterns that can explain edge cases in binding interactions, and can these be integrated into AI drug development systems to improve prediction accuracy?
  5. Resolution of AI-predicted models: Are the AI-predicted structural models used for drug discovery of sufficient resolution to run reliable scoring functions, especially when critical factors such as water molecules are missing?
  6. Balancing learning and overfitting: How can we strike the right balance between learning from previous data and avoiding overfitting in machine learning models for drug discovery?
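
One common way to probe the learning-versus-overfitting question is to hold out whole chemical scaffolds rather than random complexes, so that the test set contains chemistry the model has not seen. The sketch below assumes each record already carries a scaffold label (computing scaffolds is out of scope here). The function name, the record layout, and the 20% default are hypothetical choices for illustration.

```python
from __future__ import annotations

from collections import defaultdict


def scaffold_split(
    records: list[tuple[str, str]], test_fraction: float = 0.2
) -> tuple[list[str], list[str]]:
    """Split (compound_id, scaffold) records so no scaffold spans both sets."""
    # Group compound ids by their scaffold label.
    groups: dict[str, list[str]] = defaultdict(list)
    for compound_id, scaffold in records:
        groups[scaffold].append(compound_id)

    # Largest scaffold groups first; ties broken by label for determinism.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    # Target training size, computed with round() to avoid float edge cases.
    train_cutoff = len(records) - round(test_fraction * len(records))

    train: list[str] = []
    test: list[str] = []
    for _scaffold, ids in ordered:
        # A whole group goes to train if it fits, otherwise to test.
        if len(train) + len(ids) <= train_cutoff:
            train.extend(ids)
        else:
            test.extend(ids)
    return train, test
```

If a scoring model performs well on a random split but poorly on a split like this one, that gap is a sign it has memorised familiar scaffolds rather than learned transferable binding physics.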

A new era for drug discovery: AI and deep learning

The last few years have brought significant advances in our understanding of protein-ligand binding, largely driven by AI-enabled drug discovery technologies and deep learning. These advancements present an exciting opportunity to create new methods in drug discovery. Scoring protein-ligand complexes is a challenging task, but it is one that will ultimately determine the success of AI in drug discovery. We are at an exciting crossroads in this field, and I look forward to discussing the strategies that can lead us to the scoring “holy grail.”
