Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Virtual Screening with Gnina 1.0

Version 1 : Received: 14 November 2021 / Approved: 18 November 2021 / Online: 18 November 2021 (13:59:09 CET)

A peer-reviewed article of this Preprint also exists.

Sunseri, J.; Koes, D.R. Virtual Screening with Gnina 1.0. Molecules 2021, 26, 7369. Sunseri, J.; Koes, D.R. Virtual Screening with Gnina 1.0. Molecules 2021, 26, 7369.

Journal reference: Molecules 2021, 26, 7369
DOI: 10.3390/molecules26237369

Abstract

Virtual screening - predicting which compounds within a specified compound library bind to a target molecule, typically a protein - is a fundamental task in the field of drug discovery. Doing virtual screening well provides tangible practical benefits, including reduced drug development costs, faster time to therapeutic viability, and fewer unforeseen side effects. As with most applied computational tasks, the algorithms currently used to perform virtual screening feature inherent tradeoffs between speed and accuracy. Furthermore, even theoretically rigorous, computationally intensive methods may fail to account for important effects relevant to whether a given compound will ultimately be usable as a drug. Here we investigate the virtual screening performance of the recently released Gnina molecular docking software, which uses deep convolutional networks to score protein-ligand structures. We find, on average, that Gnina outperforms conventional empirical scoring. The default scoring in Gnina outperforms the empirical AutoDock Vina scoring function on 89 of the 117 targets of the DUD-E and LIT-PCBA virtual screening benchmarks with a median 1% early enrichment factor that is more than twice that of Vina. However, we also find that issues of bias linger in these sets, even when not used directly to train models, and this bias obfuscates to what extent machine learning models are achieving their performance through a sophisticated interpretation of molecular interactions versus fitting to non-informative simplistic property distributions.

Keywords

Machine Learning; Deep Learning; Molecular Modeling; Virtual Screening; Drug Discovery

Subject

CHEMISTRY, Medicinal Chemistry

Comments (0)

We encourage comments and feedback from a broad range of readers. See criteria for comments and our diversity statement.

Leave a public comment
Send a private comment to the author(s)
Views 0
Downloads 0
Comments 0
Metrics 0


×
Alerts
Notify me about updates to this article or when a peer-reviewed version is published.
We use cookies on our website to ensure you get the best experience.
Read more about our cookies here.