DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking

4 Oct 2022 · Gabriele Corso, Hannes Stärk, Bowen Jing, Regina Barzilay, Tommi Jaakkola

Predicting the binding structure of a small molecule ligand to a protein -- a task known as molecular docking -- is critical to drug design. Recent deep learning methods that treat docking as a regression problem have decreased runtime compared to traditional search-based methods but have yet to offer substantial improvements in accuracy. We instead frame molecular docking as a generative modeling problem and develop DiffDock, a diffusion generative model over the non-Euclidean manifold of ligand poses. To do so, we map this manifold to the product space of the degrees of freedom (translational, rotational, and torsional) involved in docking and develop an efficient diffusion process on this space. Empirically, DiffDock obtains a 38% top-1 success rate (RMSD < 2 Å) on PDBBind, significantly outperforming the previous state-of-the-art of traditional docking (23%) and deep learning (20%) methods. Moreover, while previous methods are not able to dock on computationally folded structures (maximum accuracy 10.4%), DiffDock maintains significantly higher precision (21.7%). Finally, DiffDock has fast inference times and provides confidence estimates with high selective accuracy.
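
To make the "degrees of freedom" in the abstract concrete, the sketch below applies a rigid-body update (one translation plus one rotation about the ligand centroid) to a set of atom coordinates. It is only an illustration under stated assumptions: the function names `rotation_matrix` and `apply_rigid_update` are hypothetical, rotating about the unweighted centroid is an assumption, torsional updates around rotatable bonds are omitted, and none of this is taken from the DiffDock code.

```python
import numpy as np


def rotation_matrix(rotvec):
    """Rodrigues' formula: rotation vector (axis * angle in radians) -> 3x3 matrix."""
    rotvec = np.asarray(rotvec, dtype=float)
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        # No rotation requested.
        return np.eye(3)
    k = rotvec / angle
    # Skew-symmetric cross-product matrix of the unit axis.
    K = np.array([
        [0.0, -k[2], k[1]],
        [k[2], 0.0, -k[0]],
        [-k[1], k[0], 0.0],
    ])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def apply_rigid_update(coords, translation, rotvec):
    """Rotate ligand atoms about their centroid, then translate them.

    coords: (n_atoms, 3) array of positions (hypothetical input format).
    """
    coords = np.asarray(coords, dtype=float)
    center = coords.mean(axis=0)
    R = rotation_matrix(rotvec)
    rotated = (coords - center) @ R.T + center
    return rotated + np.asarray(translation, dtype=float)


if __name__ == "__main__":
    ligand = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
    moved = apply_rigid_update(ligand, [0.0, 0.0, 1.0], [0.0, 0.0, np.pi / 2])
    print(moved)
```

A rigid update like this accounts for six degrees of freedom (three translational, three rotational); each rotatable bond in the ligand would add one torsional degree of freedom on top of that.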

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Blind Docking | PDBbind | EQUIBIND+GNINA | Top-1 RMSD (Med.) | 4.9 | #1 |
| Blind Docking | PDBbind | P2RANK+SMINA | Top-1 RMSD (%<2) | 20.4 | #1 |
| Blind Docking | PDBbind | GNINA | Top-1 RMSD (Med.) | 7.7 | #2 |
| Blind Docking | PDBBind | DIFFDOCK (40) | Top-1 RMSD (%<2) | 38.2±1.0 | #1 |
| Blind Docking | PDBBind | DIFFDOCK (40) | Top-1 RMSD (Med.) | 3.30±0.11 | #1 |
| Blind Docking | PDBBind | QVINAW | Top-1 RMSD (%<2) | 20.9 | #7 |
| Blind Docking | PDBBind | GNINA | Top-1 RMSD (%<2) | 22.9 | #5 |
| Blind Docking | PDBBind | GNINA | Top-1 RMSD (Med.) | 7.7 | #9 |
| Blind Docking | PDBBind | SMINA | Top-1 RMSD (%<2) | 18.7 | #9 |
| Blind Docking | PDBBind | SMINA | Top-1 RMSD (Med.) | 7.1 | #8 |
| Blind Docking | PDBBind | GLIDE | Top-1 RMSD (%<2) | 21.8 | #6 |
| Blind Docking | PDBBind | GLIDE | Top-1 RMSD (Med.) | 9.3 | #10 |
| Blind Docking | PDBBind | TANKBIND | Top-1 RMSD (Med.) | 4.0 | #3 |
| Blind Docking | PDBBind | P2RANK+SMINA | Top-1 RMSD (%<2) | 20.4 | #8 |
| Blind Docking | PDBBind | P2RANK+SMINA | Top-1 RMSD (Med.) | 6.9 | #7 |
| Blind Docking | PDBBind | P2RANK+GNINA | Top-1 RMSD (Med.) | 5.5 | #4 |
| Blind Docking | PDBBind | EQUIBIND+GNINA | Top-1 RMSD (%<2) | 28.8 | #3 |
| Blind Docking | PDBBind | DIFFDOCK (10) | Top-1 RMSD (%<2) | 35.0±1.4 | #2 |
| Blind Docking | PDBBind | DIFFDOCK (10) | Top-1 RMSD (Med.) | 3.56±0.05 | #2 |
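
The two metric names in the results above, Top-1 RMSD (%<2) and Top-1 RMSD (Med.), can be read as the percentage of test complexes whose top-ranked pose is within 2 Å RMSD of the reference pose, and the median of those RMSD values. The sketch below shows one plausible way to compute both. It assumes predicted and reference atoms are already in matching order, and it does not apply the symmetry correction that docking benchmarks commonly use; the function names `ligand_rmsd` and `summarize_top1` are hypothetical, and the numbers in the usage example are made up for illustration, not taken from the paper.

```python
import numpy as np


def ligand_rmsd(pred, ref):
    """Plain RMSD between two (n_atoms, 3) coordinate arrays with matching atom order."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if pred.shape != ref.shape:
        raise ValueError("pred and ref must have the same shape")
    squared_dist = ((pred - ref) ** 2).sum(axis=1)
    return float(np.sqrt(squared_dist.mean()))


def summarize_top1(rmsds, threshold=2.0):
    """Return the success rate (% of RMSDs below threshold) and the median RMSD."""
    rmsds = np.asarray(rmsds, dtype=float)
    return {
        "pct_below": float(100.0 * np.mean(rmsds < threshold)),
        "median": float(np.median(rmsds)),
    }


if __name__ == "__main__":
    # Illustrative top-1 RMSD values (in angstroms) for four hypothetical complexes.
    example_rmsds = [1.0, 3.0, 1.5, 5.0]
    print(summarize_top1(example_rmsds))
```

Reporting both numbers is useful because the success rate only counts poses under the threshold, while the median also reflects how far off the typical pose is.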

Methods