Submitted: 09 June 2023
Posted: 12 June 2023
Abstract
Keywords:
Introduction
Exploration of Chemoinformatics
Fundamentals of Chemoinformatics
Data Mining and Chemical Databases
Chemical Data Representation
Molecular Descriptors
- 0D Descriptors: These are constitutional or count descriptors: scalar values that describe the number of atoms, bonds, or functional groups in the molecule, e.g., molecular weight.
- 1D Descriptors: These descriptors capture molecular properties in one dimension, along a linear sequence or chain of atoms, e.g., structural fragments or fingerprints.
- 2D Descriptors: These descriptors provide information about the molecular structure and properties within a 2D plane, e.g., topological polar surface area (TPSA) and graph invariants.
- 3D Descriptors: These descriptors describe molecular properties in 3D space, considering the spatial arrangement of atoms, e.g., autocorrelation descriptors, substituent constants, surface:volume descriptors, quantum-chemical descriptors, 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors, and size, steric, surface, and volume descriptors.
- 4D Descriptors: These descriptors encompass properties that change over time or involve spatiotemporal aspects, e.g., drug dissolution rate and descriptors derived from Volsurf, GRID, or CoMFA methods.
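
As a rough illustration of the 0D case above, the following sketch derives atom counts and molecular weight from a molecular formula. The helper names `atom_counts` and `molecular_weight` and the small `ATOMIC_MASS` table are assumptions made for this example, not part of any cheminformatics library, and the parser only handles simple formulas without brackets or charges.

```python
import re
from collections import Counter

# Rounded standard atomic weights for a few common elements.
# Illustrative subset only; extend the table for other elements.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def atom_counts(formula):
    """Count atoms per element in a simple formula such as 'C2H6O'.

    Hypothetical helper: no support for brackets, hydrates, or charges.
    """
    counts = Counter()
    # Each match is (element symbol, optional multiplicity).
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def molecular_weight(formula):
    """Sum atomic masses weighted by atom counts (a 0D descriptor)."""
    return sum(ATOMIC_MASS[s] * n for s, n in atom_counts(formula).items())


ethanol = "C2H6O"
print(dict(atom_counts(ethanol)))           # per-element atom counts
print(round(molecular_weight(ethanol), 3))  # molecular weight in g/mol
```

Both values depend only on composition, not on connectivity or geometry, which is what places them in the 0D class. In practice a toolkit such as RDKit would normally be used instead of a hand-written parser.
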
| Descriptor Dimension | Description | Examples |
|---|---|---|
| 0D | Number of atoms, bonds, and functional groups in the molecule | Molecular weight, LogP (partition coefficient) |
| 1D | Molecular properties in a linear manner | Molecular formula, SMILES (Simplified Molecular Input Line Entry System) strings |
| 2D | Molecular structure and properties within a 2D plane | Topological polar surface area (TPSA), molecular fingerprints (e.g., Morgan fingerprint), constitutional descriptors (e.g., number of atoms, bonds, and rings) |
| 3D | Spatial properties of a molecule | Molecular shape descriptors (e.g., volume, surface area), pharmacophore features |
| 4D | Electrostatic potential descriptors with spatiotemporal aspects | Molecular dynamics descriptors, radius of gyration (Rg), solvent accessible surface area (SASA), time-dependent properties (e.g., dynamic polar surface area (dPSA), time-dependent dipole moment) |
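
To make the 4D row more concrete, the sketch below computes the mass-weighted radius of gyration, $R_g = \sqrt{\sum_i m_i \lVert \mathbf{r}_i - \mathbf{r}_{\mathrm{com}} \rVert^2 / \sum_i m_i}$, for each frame of a trajectory and then averages it over time. The function name `radius_of_gyration` and the two-frame toy trajectory are assumptions made for illustration. Real trajectories would come from a molecular dynamics package.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame.

    coords: list of (x, y, z) tuples; masses: list of atomic masses.
    """
    total = sum(masses)
    # Centre of mass, one component at a time.
    com = [sum(m * r[k] for m, r in zip(masses, coords)) / total
           for k in range(3)]
    # Mass-weighted sum of squared distances from the centre of mass.
    weighted_sq = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


# Toy trajectory (hypothetical data): two equal-mass atoms moving apart.
frames = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
]
masses = [1.0, 1.0]

rg_series = [radius_of_gyration(frame, masses) for frame in frames]
mean_rg = sum(rg_series) / len(rg_series)
print(rg_series, mean_rg)
```

The per-frame series is the time-dependent part. A summary statistic such as the mean or standard deviation over frames is one way to turn it into a fixed-length descriptor for a model.
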
Structure-Activity Relationship (SAR) Analysis
QSAR
QSAR Modeling
Molecular Encoding
Feature Selection
Model Training
Machine Learning-based QSAR Modeling
Regression Analysis
K-nearest Neighbor
Support Vector Machine
Convolutional Neural Networks, Recurrent Neural Networks, Deep Neural Networks, and Ensemble methods
Validation of ML-QSAR Models
Interpretability and Explainability of ML-QSAR Models
Conclusions
Funding
Conflict of Interest
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
