Preprint Article Version 1 This version is not peer-reviewed

Computational Models Using Multiple Machine Learning Algorithms for Predicting Drug Hepatotoxicity with the DILIrank Dataset

Version 1 : Received: 12 February 2020 / Approved: 14 February 2020 / Online: 14 February 2020 (02:24:04 CET)

A peer-reviewed article of this Preprint also exists.

Ancuceanu, R.; Hovanet, M.V.; Anghel, A.I.; Furtunescu, F.; Neagu, M.; Constantin, C.; Dinu, M. Computational Models Using Multiple Machine Learning Algorithms for Predicting Drug Hepatotoxicity with the DILIrank Dataset. Int. J. Mol. Sci. 2020, 21, 2114.

Journal reference: Int. J. Mol. Sci. 2020, 21, 2114
DOI: 10.3390/ijms21062114

Abstract

Drug-induced liver injury (DILI) remains a challenge in the safety profile of both authorized and candidate drugs. Predicting hepatotoxicity from the chemical structure of a substance is a goal worth pursuing, and it is consistent with the current tendency to replace non-clinical tests with in vitro or in silico alternatives. In 2016, a group of FDA researchers published an improved annotated list of drugs ranked by DILI risk, DILIrank, described as “the largest reference drug list ranked by the risk for developing drug-induced liver injury in humans”. This paper is one of the few attempts to predict liver toxicity using the DILIrank dataset. Molecular descriptors were computed with the Dragon 7.0 software, and a variety of feature selection and machine learning algorithms were implemented in the R computing environment. Nested (double) cross-validation was used to externally validate the selected models. In total, 78 models with reasonable performance were selected and stacked through several approaches, including the building of multiple meta-models. The performance of the stacked models was slightly superior to that of other published models. The models were applied in a virtual screening exercise on over 100,000 compounds from the ZINC database, and about 20% of them were predicted to be non-hepatotoxic.
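
The nested (double) cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' R pipeline: the toy one-dimensional data, the k-nearest-neighbour classifier, the hyperparameter grid, and every function name below are hypothetical and chosen only to keep the example self-contained. The point it shows is that the inner loop selects a hyperparameter using outer-training data only, so each outer test fold gives a performance estimate untouched by model selection.

```python
import random

def make_toy_data(n=60, seed=1):
    """Hypothetical 1-D dataset: label is 1 when the feature exceeds 0.5."""
    rng = random.Random(seed)
    return [(x, 1 if x > 0.5 else 0) for x in (rng.random() for _ in range(n))]

def k_folds(indices, k):
    """Split a list of indices into k disjoint, roughly equal folds."""
    return [indices[i::k] for i in range(k)]

def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(train, key=lambda row: abs(row[0] - query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    """Fraction of test points classified correctly by k-NN fitted on train."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def rows(data, idx):
    """Select the rows of data at the given indices."""
    return [data[j] for j in idx]

def others(folds, skip):
    """Concatenate every fold except the one at position skip."""
    return [j for f, fold in enumerate(folds) if f != skip for j in fold]

def nested_cv(data, outer_k=5, inner_k=3, grid=(1, 3, 5), seed=0):
    """Return one held-out accuracy per outer fold."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    outer = k_folds(idx, outer_k)
    outer_scores = []
    for i, test_idx in enumerate(outer):
        train_idx = others(outer, i)
        inner = k_folds(train_idx, inner_k)

        def inner_score(k):
            # Mean validation accuracy across inner folds for one grid value.
            scores = [
                accuracy(rows(data, others(inner, m)), rows(data, val_idx), k)
                for m, val_idx in enumerate(inner)
            ]
            return sum(scores) / len(scores)

        # Model selection sees only the outer-training portion.
        best_k = max(grid, key=inner_score)
        # The outer test fold is used once, for the final estimate.
        outer_scores.append(
            accuracy(rows(data, train_idx), rows(data, test_idx), best_k)
        )
    return outer_scores

if __name__ == "__main__":
    print(nested_cv(make_toy_data()))
```

Averaging the returned outer-fold scores gives the externally validated estimate; in a real study the classifier, descriptors, and grid would be replaced by the actual models and features.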

Subject Areas

DILIrank; DILI; drug hepatotoxicity; QSAR; nested cross-validation; virtual screening; in silico
