Preprint
Review

Unlocking the Future of Drug Development: Generative AI, Digital Twins, and Beyond

This version is not peer-reviewed.

Submitted: 15 March 2024

Posted: 15 March 2024


A peer-reviewed article of this preprint also exists.

Abstract
This article delves into the intersection of generative AI and digital twins within drug discovery, exploring their synergistic potential to revolutionize pharmaceutical research and development. Through various instances and examples, we illuminate how generative AI algorithms, capable of simulating vast chemical spaces and predicting molecular properties, are increasingly integrated with digital twins of biological systems to expedite drug discovery. By harnessing the power of computational models and machine learning, researchers can design novel compounds tailored to specific targets, optimize drug candidates, and simulate their behavior within virtual biological environments. This paradigm shift offers unprecedented opportunities for accelerating drug development, reducing costs, and, ultimately, improving patient outcomes. As we navigate this rapidly evolving landscape, collaboration between interdisciplinary teams and continued innovation will be paramount in realizing the promise of generative AI and digital twins in advancing drug discovery.
Keywords: generative AI; drug development; digital twins; prospective analysis
Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

1. Introduction

Decades ago, Alan Turing, often hailed as the visionary behind modern computer science, stirred the imagination of generations with his bold inquiry into the essence of machine cognition. In his seminal work "Computing Machinery and Intelligence," he introduced what is now known as the Turing Test, sparking a lasting debate on whether machines can think and convincingly emulate human intelligence [2]. His question, "Can machines think?" eventually led to the birth of artificial intelligence (AI). The term AI was coined in 1956 by John McCarthy during the Dartmouth Summer Research Project on Artificial Intelligence [4]. In the late 1950s, two key milestones followed: Arthur Samuel created the first self-learning program for checkers, marking the introduction of machine learning (ML), and Frank Rosenblatt developed the perceptron, the earliest form of a trainable neural network [5]. One of the earliest instances of functional generative AI was the ELIZA chatbot, developed by Joseph Weizenbaum in the mid-1960s [6]. These milestones laid the foundation for the evolution of AI and its applications in various fields, including medicine and drug discovery.
In recent years, generative AI, a subset of AI, has undergone significant advancements, transforming various domains by generating realistic content. Generative AI refers to a category of models designed to generate new content that is similar to, but not the same as, the data they were trained on [7,8,9]. Unlike traditional AI systems, which are often task-specific and deterministic, generative AI systems can produce novel outputs by learning the underlying patterns and structures of the training data. The field initially relied on models such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) and witnessed a breakthrough with the introduction of Generative Adversarial Networks (GANs) by Ian Goodfellow and colleagues in 2014. GANs have been implemented in cardiology and used for detecting pneumonia, COVID-19, etc., as discussed later in this review [10,11,12,13,14,15]. Other models, such as recurrent neural networks (RNNs) and transformers, have also contributed to the progress of generative AI, enabling tasks in natural language processing (NLP), computer vision (CV), and digital twins (DTs). These models are being increasingly applied to improve healthcare and medicine [16,17,18,19,20,21,22,23,24,25,26]. Such advancements have enhanced content generation capabilities and found applications specifically in drug discovery, where models are employed to design novel molecules with desired properties and to accelerate drug development and clinical trials. These applications are discussed throughout this review.
Identifying and prioritizing chemical compounds for drug development poses significant challenges, as determining which compounds are most promising for treating specific diseases requires extensive laboratory screening and testing. Generative AI streamlines this process by leveraging advanced chemistry models to analyze millions of known chemical compounds based on their structure and functionality. By overlaying this data with existing results from tested molecules, generative AI accelerates the screening process and aids in identifying compounds with the highest potential for successful treatment [27,28]. Research from the Tufts Center for the Study of Drug Development indicates that bringing a single drug to market typically requires ten years and $1.4 billion, with about 80 percent of expenses attributed to clinical development [29]. This phase involves rigorous testing of a medication's safety and efficacy in human subjects and is characterized by lengthy timelines and strict regulatory requirements. Generative AI can address these challenges by increasing efficiency across clinical development. It has been estimated to cut costs by up to 50 percent by streamlining trial processes and automating document drafting, to shorten trial timelines by over 12 months, and to raise net present value by at least 20 percent through improved health authority interactions, quality control, and signal management. The McKinsey Global Institute estimates that generative AI could yield $60 billion to $110 billion annually for the pharmaceutical and medical-product sectors [30]. This potential economic value stems from its ability to enhance productivity by expediting compound identification, accelerating drug development and approval, and refining marketing strategies.

2. Drug Discovery and Generative AI

2.1. Generative Adversarial Networks (GANs)

GANs consist of two neural networks, a generator and a discriminator, which are trained simultaneously in a minimax game. The generator creates synthetic data samples, while the discriminator distinguishes between real and synthetic samples. GANs are increasingly recognized as powerful tools in the realm of drug discovery [7,31,32,33]. They offer innovative approaches to exploring chemical space, refining known compounds, and crafting entirely new molecules. These networks find utility across various drug design and discovery stages, from de novo molecule generation to dimensionality reduction and the design of peptides and proteins. GANs have proven pivotal in generating new molecules with specific attributes, aiding the development of viable drugs and shortening the drug discovery timeline [34,35,36,37,38,39].
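For reference, the minimax game between generator and discriminator is usually written as the value function from the original GAN formulation. The symbols below (G, D, p_data, p_z) are the standard ones from that formulation rather than notation used elsewhere in this review; in a molecular setting, x would stand for an encoded molecule and z for a random noise vector.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator D is trained to increase V by scoring real samples high and generated samples low, while the generator G is trained to decrease V by producing samples that D scores as real.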
Recent research emphasizes the benefits of GANs in drug discovery, showcasing their ability to uncover novel molecules and to mitigate challenges like mode collapse by encouraging exploration beyond existing data. Specialized GAN architectures like MedGAN leverage graph convolutional networks to efficiently design new molecules, addressing the growing need for new medications and enhancing the overall drug discovery process [33,40,41]. GANs are also applied in de novo peptide and protein design, contributing significantly to the exploration of new bioactive compounds [42]. Furthermore, a study demonstrated quantum advantages in small-molecule drug discovery when each component of the GAN was replaced with a variational quantum circuit (VQC). Improvements in the physicochemical properties of the generated molecules and in performance were observed, with far fewer learnable parameters in the GAN's generator than in the classical approach [43,44,45].

2.2. Variational Autoencoders (VAEs)

VAEs are probabilistic generative models that learn a latent representation of data by encoding input samples into a lower-dimensional latent space (encoder) and decoding them back into the original space (decoder). The encoder and decoder are the only components of this neural network structure, and they are trained in conjunction with each other using the reparameterization technique. The objective function of the Variational Autoencoder (VAE) is defined in equation 1, where, in an autoencoder setting, Q(z|X) and P(X|z) are estimated by an encoder and a decoder, respectively.
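$$\mathcal{L}(X) = \mathbb{E}_{z \sim Q(z|X)}\left[\log P(X|z)\right] - D_{\mathrm{KL}}\left[Q(z|X)\,\|\,P(z)\right] \qquad (1)$$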
Equation 1. Variational Autoencoder basic function [3].
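The reparameterization technique mentioned above can be illustrated with a small standard-library Python sketch. It assumes a diagonal Gaussian encoder Q(z|X) that outputs a mean and a log-variance per latent dimension, and a standard normal prior P(z); these are common choices, not details stated in this review. The names `reparameterize`, `kl_to_standard_normal`, `mu`, and `log_var` are illustrative only.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), element-wise.

    Writing the sample this way keeps the randomness in eps, so gradients
    can flow through mu and log_var during training.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL[N(mu, sigma^2) || N(0, 1)], summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for one molecule in a 2-D latent space.
rng = random.Random(0)
mu = [0.5, -1.0]
log_var = [0.0, -2.0]

z = reparameterize(mu, log_var, rng)      # latent sample fed to the decoder
kl = kl_to_standard_normal(mu, log_var)   # latent (regularization) loss term
```

In a full model, the KL term would be combined with a reconstruction loss computed by the decoder from `z`.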
Within drug discovery, VAEs have been used to create a chemical latent space reflecting the structural diversity of compound libraries [46,47,48]. This aids in exploring a wider chemical space and facilitates the generation of novel compounds. For example, the development of the Natural Product Compound Variational Autoencoder (NP-VAE) has enabled the handling of intricate datasets and large molecular structures, showing consistent performance as a generative model across multiple metrics. By employing a reconstruction loss and a latent loss, these models optimize reconstruction quality while exploring the latent space effectively [47,49,50]. Additionally, variants of VAEs have been used to accurately simulate morphology and gene expression readouts induced by specific compounds, allowing for the prediction of cell states affected by compounds with known polypharmacology. This inference of cell state based on drug mechanisms could assist researchers in the future by facilitating the development and identification of targeted therapeutics and the classification of off-target effects [51].

2.3. Transformer-Based Models

Transformer-based models originate in natural language processing (NLP), where they are used to comprehend the structure and context of language. They are trained on extensive datasets to capture relationships within sequential data such as words and sentences. Three recently proposed approaches to drug discovery are presented here. The first, drugAI, integrates the Encoder–Decoder Transformer architecture with Reinforcement Learning via a Monte Carlo Tree Search to streamline the drug discovery process [52]. This method is designed to generate valid small molecules with drug-like characteristics and robust binding affinities toward their targets. In the second approach, the authors focused on target-specific de novo drug design, treating it as a translation problem between the amino acid ‘language’ and the simplified molecular input line entry system (SMILES) representation [53]. Employing the Transformer neural network architecture, known for its proficiency in sequence transduction tasks, this method captures long-range dependencies within sequences. It generates structurally diverse compounds with realistic properties within the plausible drug-like range. Finally, TransDTI is introduced by its authors as a multiclass classification and regression workflow utilizing transformer-based language models to categorize interactions between drug–target pairs as active, inactive, or intermediate. Trained on large-scale drug–target interaction datasets, these models are reported to outperform baseline methods and existing approaches in predicting novel drug–target interactions from sequence data [54]. Another such model, DTSyn, extracts interactions between chemicals and cell lines, depicting potential drug action mechanisms. Through the integration of attention mechanisms and pre-trained gene embeddings, DTSyn demonstrates enhanced interpretability. Consequently, this model is valuable for prioritizing synergistic drug combinations based on chemical and cell line gene expression profiles. Other transformer-based models are similarly proving instrumental in accelerating drug discovery by automating tasks like retrosynthesis, generating novel molecules with desired properties, and facilitating the exploration of chemical space for the development of new drugs [55,56,57,58,59].
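The attention mechanism that lets these models capture long-range dependencies can be sketched in a few lines of standard-library Python. This is a minimal illustration of scaled dot-product self-attention only, without learned projections, multiple heads, or positional encodings; the two-dimensional token embeddings for the SMILES string "CCO" are made up for the example and do not come from any of the cited models.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights


# Hypothetical embeddings for the tokens C, C, O of the SMILES string "CCO".
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Self-attention: every token attends to every token, whatever their distance.
outputs, weights = attention(embeddings, embeddings, embeddings)
```

Each row of `weights` is a probability distribution over all tokens in the sequence, which is what allows a token to draw on information from distant positions in a single step.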
Figure 1. Transformer model architecture [1].

3. Restricted Boltzmann Machines (RBMs)

RBMs are energy-based generative models. They consist of a visible and a hidden layer with symmetric connections between them, and they are trained using contrastive divergence or other learning algorithms. RBMs have emerged as promising tools in drug discovery, specifically in forecasting drug-disease associations and drug-target interactions. In drug repositioning tasks on drug-disease association networks, RBMs have displayed better prediction performance than alternative methods. Moreover, augmenting the RBM model with momentum during weight updates has further improved prediction performance, positioning it as a potent tool for future drug repositioning endeavors [60,61,62].
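To make the training procedure concrete, the following standard-library Python sketch performs single-step contrastive divergence (CD-1) on a tiny binary RBM. The data are invented binary profiles (for example, which of four targets a compound interacts with), the layer sizes and learning rate are arbitrary, and the momentum term mentioned above is omitted; none of this reproduces the cited studies.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for each hidden unit j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """P(v_i = 1 | h) for each visible unit i."""
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def sample(probs, rng):
    return [1 if rng.random() < p else 0 for p in probs]


def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step: positive phase on data, negative phase on a reconstruction."""
    ph0 = hidden_probs(v0, W, c)
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b), rng)
    ph1 = hidden_probs(v1, W, c)
    for i in range(len(v0)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for i in range(len(b)):
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])


# Hypothetical binary interaction profiles (4 visible units, 2 hidden units).
rng = random.Random(0)
data = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
W = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(4)]
b = [0.0] * 4
c = [0.0] * 2

for _ in range(50):                      # a few passes over the toy data
    for v in data:
        cd1_update(v, W, b, c, lr=0.1, rng=rng)
```

After training, `visible_probs(sample(hidden_probs(v, W, c), rng), W, b)` gives reconstruction probabilities for a profile `v`; in a prediction setting, high probabilities on entries that were 0 in the input would be read as candidate new associations.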
In the domain of drug-target interactions, RBMs have been deployed to integrate multiple interaction types and forecast unknown drug-target relationships or modes of action. By formulating the prediction problem as a two-layer graphical model using RBMs, researchers have captured latent features within drug-target interaction networks, resulting in high precision-recall curve values. This methodology, which surpasses other prediction techniques by incorporating various interaction types, holds practical significance in predicting drug-target interactions and advancing drug repositioning efforts [63,64,65]. RBMs are thus proving valuable in drug discovery by offering solutions for forecasting drug-disease associations and drug-target interactions and for facilitating computational drug repositioning. However, further research is still needed on leveraging machine learning approaches to analyze intricate datasets, predict new relationships within biological systems, and shape the landscape of drug discovery.

4. Generative Graph Neural Networks (GNNs)

GNNs are a class of neural networks designed to operate on graph-structured data. Generative GNNs are capable of generating new graph structures or new nodes and edges within a graph. GNNs are reshaping drug discovery by facilitating the creation of novel molecules with specific properties and streamlining the drug design process. Graph neural network modules are used to construct sequential molecular graph generators like MG2N2 [66,67]. Such generators incrementally add nodes and connections to graphs, simplifying training procedures and improving interpretability. By maximizing information input at each generative step, these models effectively generalize molecular patterns learned during training without succumbing to overfitting, demonstrating competitive performance in generating molecular structures [41]. Applications of GNNs in drug discovery are rapidly expanding, particularly in conditional de novo drug design. GNNs excel at processing graph-structured data and have played a pivotal role in predicting drug-target interactions and designing new candidate molecules efficiently [68,69]. The fusion of GNNs with other deep learning techniques is advancing graph generation for molecular structures, offering promising applications in drug discovery by optimizing resource utilization and improving the efficiency of generating new bioactive molecules. One such example is MM-GANN-DDI, a Multimodal Graph-Agnostic Neural Network designed to accurately forecast drug-drug interaction events [70,71,72,73]. GNNs drive innovation in drug discovery by enabling the systematic generation of novel molecules, predicting drug-target interactions, and advancing computational methods for de novo drug design. Their effectiveness in processing graph data and generating diverse chemical structures tailored for therapeutic purposes underscores their significant contribution to drug discovery.
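As a rough illustration of how a GNN operates on a molecular graph, the sketch below runs one round of neighbourhood sum aggregation, the unweighted core of a message-passing layer, on ethanol (SMILES "CCO"). It uses only the Python standard library. The one-hot atom features and the function name `message_passing_step` are assumptions made for the example; a real GNN would additionally apply learned weight matrices and non-linearities, and a generative model such as those cited would use the resulting node states to decide which node or bond to add next.

```python
def message_passing_step(node_features, edges):
    """Return updated features: own feature plus the sum of neighbour features."""
    updated = [list(f) for f in node_features]
    dim = len(node_features[0])
    for a, b in edges:                      # undirected bonds
        for d in range(dim):
            updated[a][d] += node_features[b][d]
            updated[b][d] += node_features[a][d]
    return updated


# Ethanol heavy-atom graph: C0 - C1 - O2, with one-hot features [is_C, is_O].
atoms = [[1, 0], [1, 0], [0, 1]]
bonds = [(0, 1), (1, 2)]

after_one_round = message_passing_step(atoms, bonds)
# after_one_round == [[2, 0], [2, 1], [1, 1]]
```

After one round, the two carbons are no longer indistinguishable: the middle carbon's state records that it is bonded to an oxygen, which is the kind of local chemical context later layers build on.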

5. Language Models (LMs)

LMs are a class of generative models that learn the probability distribution of sequences of words or tokens in a language. LMs play a crucial role in drug discovery, providing solutions to expedite the molecule discovery cycle, enhance de novo drug design, predict properties, and optimize chemical reactions [74]. In particular, transformer-based architectures have shown remarkable capabilities in comprehending and generating human-like text, extending their application into scientific domains such as protein folding, target identification, and small molecule design [75]. Within molecular discovery, chemical LMs contribute significantly to accelerating the identification of new compounds for drug development, predicting properties, and optimizing chemical reactions [76]. These models operate on small molecules, proteins, or polymers, demonstrating promising results in early-stage drug discovery by effectively utilizing machine learning techniques to comprehend and generate scientific text [77]. Furthermore, AI-powered LMs have transformed natural language processing (NLP) in drug discovery and development [78]. For instance, an automatic biomedical named entity recognition (BioNER) method finds hidden relationships among chemicals, genes, targets, and diseases in text-based documents [79]. Taken together, LMs hold the potential to transform treatment development by assisting in target identification, clinical design, regulatory decision-making, and pharmacovigilance, and even by aiding the development of new treatments for diseases like COVID-19 through drug repurposing initiatives.
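The idea of learning a probability distribution over token sequences can be shown with a deliberately tiny example: a character-level bigram model over SMILES strings, written in standard-library Python. This is a far simpler model than the transformer-based chemical LMs cited above, the three-molecule corpus is invented, and character-level tokenization only works here because none of the strings contain multi-character atoms such as Cl.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Estimate P(next token | previous token) from raw counts."""
    counts = defaultdict(Counter)
    for seq in sequences:
        tokens = ["^"] + list(seq) + ["$"]      # start and end markers
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, ctr in counts.items():
        total = sum(ctr.values())
        model[prev] = {tok: n / total for tok, n in ctr.items()}
    return model


def sequence_probability(model, seq):
    """Probability the model assigns to a complete sequence."""
    tokens = ["^"] + list(seq) + ["$"]
    p = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        p *= model.get(prev, {}).get(nxt, 0.0)
    return p


# Toy corpus: ethanol, ethylamine, methanol.
corpus = ["CCO", "CCN", "CO"]
model = train_bigram(corpus)

p_seen = sequence_probability(model, "CCO")    # 0.16
p_unseen = sequence_probability(model, "CN")   # 0.2, methylamine, not in corpus
```

Even this crude model assigns non-zero probability to a molecule it never saw ("CN"), which is the basic mechanism by which sequence models propose new candidates; sampling token by token from `model` would generate such strings.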
LMs can streamline patient recruitment in clinical trials by automating tasks through advanced information retrieval and prioritization mechanisms. These models learn medical terms and their synonyms to extract valuable information from clinical documents, aiding in patient stratification based on disease subtypes. They also synthesize eligibility criteria into standardized contextual queries, improving clinical trial-matching processes. By leveraging cross-model learning infrastructure, these LMs encode enrolment criteria and patient electronic health records (EHRs) for enhanced matching inference, outperforming rule-based strategies. Furthermore, they can be integrated with emerging data sources such as genomics and imaging to advance precision medicine. Additionally, AI-powered LMs facilitate higher patient enrolment rates and improved site identification by considering factors such as prior site experience, connections with health non-profits, patient retention data, and cost-effectiveness. This supports balanced clinical decision-making, which is key to the success of the designed drugs [80,81,82,83].

6. Multimodal Models

Multimodal (MM) generative AI models can simultaneously process various types of data and are essential in drug discovery and therapeutic design, utilizing a combination of deep learning techniques. Deep generative models using multimodal data exhibit advantages over unimodal counterparts due to the complementary insights offered by multimodal data [84,85,86]. Successful drug discovery hinges on leveraging diverse data modalities that offer complementary perspectives, aiding in the triangulation of evidence for discovery. While current studies primarily focus on molecular structural data, they underutilize other data modalities, such as drug-target interactions, drug-disease knowledge, and relevant gene expression post-drug treatment. Addressing this challenge involves exploring solutions like "modality alignment," connecting all modalities through an intermediary modality, typically molecular structures, and "modality fusion," where all modalities are directly mapped to a common latent space [87,88,89,90]. This hybrid data model captures diverse information during drug design, including chemical properties, drug-target interactions, drug-disease knowledge, and disease-relevant gene expression [91]. Additionally, a multimodal generative model considers various components of the drug discovery pipeline to enhance the likelihood of success in clinical trials. By integrating structured and unstructured knowledge, frameworks like KEDD achieve a deeper understanding of biomolecules, outperforming state-of-the-art models in various predictions related to drug-target interactions, drug properties, drug-drug interactions, and protein-protein interactions [92]. Qualitative analysis reveals the promising potential of such frameworks in real-world applications, accelerating drug discovery by incorporating biomolecular expertise from multimodal knowledge.
MM models are also used to cope with the impractical size of chemical space. In one approach, generative adversarial networks are proposed to generate diverse three-dimensional ligand shapes complementary to the binding pocket. These shapes can be decoded into SMILES strings, enabling structure-based de novo drug design [93]. Evaluation shows enrichment compared to random sampling from the initial chemical space of ZINC drug-like compounds, supporting the method's effectiveness in virtual screening. Moreover, multimodal imaging, which integrates several imaging techniques, provides extensive anatomical, functional, and molecular information, accelerating drug discovery and development processes. These imaging technologies aid in understanding disease mechanisms, identifying new pharmacological targets, and assessing potential drug candidates and treatment responses. Implementing multimodal radiomics via targeted and untargeted methods further enhances the utility of imaging technologies in drug discovery and development. Targeted approaches involve imaging specific drug molecules or targets, while untargeted approaches analyze a wide range of molecules to discover drug metabolites, effects on endogenous molecules, and disease-related changes. These imaging techniques also unveil anatomical, structural, metabolomic, lipidomic, and proteomic alterations in response to drug treatments at tissue and organ levels, advancing drug design and delivery [94,95,96].

7. Drug Discovery and Digital Twins

Digital twins (DTs) are virtual replicas or digital representations of physical objects, processes, systems, or entities. They are created using data collected from sensors, IoT devices, and other sources, and they mimic the behavior and characteristics of their real-world counterparts in a virtual environment. DTs are increasingly utilized in drug discovery to simulate drug behavior, predict efficacy, and streamline drug development processes. These digital counterparts give researchers a deeper understanding of how drugs interact with the body, enabling them to anticipate potential side effects and tailor dosages more effectively. Leveraging generative AI, digital twins can model systems ranging from individual cells to entire human bodies, thereby enhancing comprehension of diseases, facilitating biomarker discovery, and expediting drug development [21,23,24]. Applications of DTs in the pharmaceutical industry include modeling cells to expedite drug discovery, forecasting patient responses to reduce the need for placebo control arms in clinical trials, and facilitating personalized medicine by simulating organs, genomes, and patients. Furthermore, digital twins can improve drug delivery by fine-tuning drug release rates, dosages, and nanoparticle delivery efficiency [20,22,97]. With their ability to offer personalized treatment options, optimize drug delivery mechanisms, and predict drug toxicity, digital twins hold significant promise for improving the efficiency, effectiveness, and safety of drug discovery and clinical trials while reducing costs and time-to-market.
When the SARS-CoV-2 pandemic emerged in 2019, researchers quickly adapted epidemiological computer models for decision support in public health responses. However, existing tools could not predict individual COVID-19 patient outcomes. Patient-specific digital twins, akin to software replicas of engineered products, could integrate physiology, immunology, and real-time clinical data for predictive simulations. These digital twins, powered by AI, offer a promising tool against future pandemics, blending mechanistic knowledge with observational data [24,98].
DTs have also been proposed for use as avatars: individual simulations that match clinical criteria within a predefined margin of accuracy and can be compared to real subjects. Avatars are particularly useful when an adequate population model is not feasible. Research focuses on generating avatars using pharmacometric models and exploring their properties to assess their impact on drug development stages. These avatars offer nuanced insights into a model's ability to simulate data similar to observations at both population and individual levels [99,100,101]. Additionally, they can serve as diagnostic tools, alternatives to simulations with insurance, and measures of model fit. In another instance, DTs are applied to time-series single-cell RNA sequencing (scRNA-seq) data in inflammatory diseases, revealing complex multi-directional gene expression networks. This complexity complicates the prioritization of upstream regulators (URs) crucial for understanding disease mechanisms and identifying potential drug targets. To address this, a quantitative approach that prioritizes URs based on their predicted effects on downstream target cells has been developed and has proved effective in various inflammatory diseases [25,102]. DTs are employed in high-throughput drug discovery (HDT) to enhance efficiency and reduce costs. HDT technology virtually represents organs, organ systems, and whole patients, informing target selection, drug delivery, and clinical trial design. DTs enable granular modeling of biological processes, facilitating target discovery and allowing exploration of multiple targets for specific disease states.
Additionally, DTs replicate in vivo conditions in drug delivery to optimize solid-dosage drug parameters, decreasing costs and increasing manufacturing speed. Moreover, DTs partially virtualize control arms in clinical trials, reducing the number of physical patients needed and accelerating trial timelines, thus saving costs and expediting drug development [103,104,105].
DTs are also increasingly recognized for their potential to transform various aspects of healthcare, particularly in clinical settings and drug development. These virtual replicas enable the generation of complete and realistic clinical patient trajectories, addressing the pressing need to expedite drug development processes. With only one out of ten compounds entering clinical trials achieving regulatory approval, the efficiency of phase 1 clinical trials becomes paramount. These trials aim to ascertain the efficacy and safety of compounds based on patient data, yet around 80% of them face delays due to patient enrolment issues. DTs offer a solution by augmenting clinical trials with patient replicas, accelerating timelines and enhancing quality. By leveraging DT-generated data, long patient recruitment processes can be shortened, particularly in rare conditions or oncology trials where DTs simulate comparator arms, enabling earlier efficacy assessments. Ultimately, DTs increase statistical power through simulated data, expediting clinical decision-making processes [22]. Expanding beyond their traditional application in manufacturing, DTs hold promise as integrative systems that incorporate information from diverse scientific and clinical sources to represent complex biological networks. A notable example is the development of a digital twin of the liver, integrating knowledge gleaned from studying various liver functions, diseases, and drug effects. Based on a mathematical framework of ordinary differential equations, this twin reproduces normal liver function, disease progression, and treatment impacts. Moreover, coupling the twin with experimental measurements provides valuable insights into drug-induced liver injury. This approach, applicable to other organs and biological systems, offers a generalizable strategy to enhance drug development efficiency and safety across diverse therapeutic areas [106].
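To give a sense of what an ordinary-differential-equation framework looks like in code, the sketch below integrates a single first-order elimination equation, dC/dt = -k_elim * C, with the forward Euler method in standard-library Python. It is a toy stand-in and not the liver twin described above: the model, the parameter values, and the names `simulate_elimination`, `c0`, and `k_elim` are all assumptions chosen for illustration. In a twin setting, a parameter such as `k_elim` would be fitted to an individual patient's measurements.

```python
import math


def simulate_elimination(c0, k_elim, t_end, dt):
    """Forward-Euler integration of dC/dt = -k_elim * C from t = 0 to t_end."""
    steps = int(round(t_end / dt))
    c = c0
    trajectory = [c]
    for _ in range(steps):
        c += dt * (-k_elim * c)       # Euler update with the ODE right-hand side
        trajectory.append(c)
    return trajectory


# Hypothetical patient-specific parameters (arbitrary units).
trajectory = simulate_elimination(c0=10.0, k_elim=0.2, t_end=10.0, dt=0.01)

# Analytic solution C(t) = c0 * exp(-k_elim * t), used as a sanity check.
exact_final = 10.0 * math.exp(-0.2 * 10.0)
numerical_error = abs(trajectory[-1] - exact_final)
```

Realistic organ-level twins couple many such equations and typically rely on adaptive solvers rather than fixed-step Euler, but the workflow is the same: set patient-specific parameters, integrate forward in time, and compare the simulated trajectory with measurements.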

8. Conclusion

The advancement of generative AI has significantly transformed the landscape of drug discovery, small molecule design, and clinical trials. With various model types tailored to different tasks, such as molecular generation, property optimization, and target identification, generative AI offers unprecedented efficiency and precision in drug development processes. Furthermore, integrating digital twins has revolutionized drug testing and development by providing virtual representations of patients, allowing for more accurate predictions of drug responses and potential side effects. Looking ahead, future research in this domain could explore enhanced synergies between generative AI and digital twins, potentially paving the way for personalized medicine on a scale previously unimaginable. Additionally, there is scope for deeper exploration into ethical considerations, regulatory frameworks, and the democratization of these technologies to ensure equitable access and responsible implementation in healthcare systems worldwide. As we continue to harness the power of generative AI and digital twins, the possibilities for innovation in pharmaceutical research and development are boundless, promising a future of improved patient outcomes and transformative medical discoveries.

Funding

None.

Conflicts of Interest

No conflict of interest is declared by any author.

Roles

SKN: concept, structure; ZM: research, writing; MM: research, review, and writing.

References

  1. Vaswani, A., et al., Attention Is All You Need. arXiv, 2017: p. 11.
  2. Turing, A.M., Computing Machinery and Intelligence. Mind, 1950. 59: p. 433-460.
  3. Lim, J., et al., Molecular generative model based on conditional variational autoencoder for de novo molecular design. J Cheminform, 2018. 10(1): p. 31. [CrossRef]
  4. Artificial Intelligence Coined at Dartmouth. Available from: https://home.dartmouth.edu/about/artificial-intelligence-ai-coined-dartmouth.
  5. Wiederhold, G. and J. McCarthy, Arthur Samuel: Pioneer in Machine Learning. 1992.
  6. Natale, S., The ELIZA Effect: Joseph Weizenbaum and the Emergence of Chatbots. Deceitful Media: Artificial Intelligence and Social Life after the Turing Test, 2021.
  7. Bian, Y., et al., Deep Convolutional Generative Adversarial Network (dcGAN) Models for Screening and Design of Small Molecules Targeting Cannabinoid Receptors. Mol Pharm, 2019. 16(11): p. 4451-4460. [CrossRef]
  8. Parrot, M., et al., Integrating synthetic accessibility with AI-based generative drug design. J Cheminform, 2023. 15(1): p. 83. [CrossRef]
  9. Smith, L.B. and H. Karmazyn-Raz, Episodes of experience and generative intelligence. Trends Cogn Sci, 2022. 26(12): p. 1064-1065. [CrossRef]
  10. Liu, D., et al., Powering Hidden Markov Model by Neural Network based Generative Models. 2019.
  11. Cao, Y., et al., A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT. 2023.
  12. Eirola, E. and A. Lendasse, Gaussian Mixture Models for Time Series Modelling, Forecasting, and Interpolation. 2013.
  13. Jeong, J.J., et al., Systematic Review of Generative Adversarial Networks (GANs) for Medical Image Classification and Segmentation. J Digit Imaging, 2022. 35(2): p. 137-152. [CrossRef]
  14. Skandarani, Y., et al., Generative Adversarial Networks in Cardiology. Can J Cardiol, 2022. 38(2): p. 196-203. [CrossRef]
  15. Motamed, S., P. Rogalla, and F. Khalvati, Data augmentation using Generative Adversarial Networks (GANs) for GAN-based detection of Pneumonia and COVID-19 in chest X-ray images. Inform Med Unlocked, 2021. 27: p. 100779.
  16. Goulas, A., F. Damicelli, and C.C. Hilgetag, Bio-instantiated recurrent neural networks: Integrating neurobiology-based network topology in artificial networks. Neural Netw, 2021. 142: p. 608-618. [CrossRef]
  17. Hossain, E., et al., Natural Language Processing in Electronic Health Records in relation to healthcare decision-making: A systematic review. Comput Biol Med, 2023. 155: p. 106649. [CrossRef]
  18. Nath, S., et al., New meaning for NLP: the trials and tribulations of natural language processing with GPT-3 in ophthalmology. Br J Ophthalmol, 2022. 106(7): p. 889-892. [CrossRef]
  19. Jungmann, F., S. Kuhn, and B. Kampgen, [Basics and applications of Natural Language Processing (NLP) in radiology]. Radiologe, 2018. 58(8): p. 764-768.
  20. An, G. and C. Cockrell, Drug Development Digital Twins for Drug Discovery, Testing and Repurposing: A Schema for Requirements and Development. Front Syst Biol, 2022. 2. [CrossRef]
  21. Bjornsson, B., et al., Digital twins to personalize medicine. Genome Med, 2019. 12(1): p. 4. [CrossRef]
  22. Bordukova, M., et al., Generative artificial intelligence empowers digital twins in drug discovery and clinical trials. Expert Opin Drug Discov, 2024. 19(1): p. 33-42. [CrossRef]
  23. Croatti, A., et al., On the Integration of Agents and Digital Twins in Healthcare. J Med Syst, 2020. 44(9): p. 161. [CrossRef]
  24. Laubenbacher, R., J.P. Sluka, and J.A. Glazier, Using digital twins in viral infection. Science, 2021. 371(6534): p. 1105-1106. [CrossRef]
  25. Li, X., et al., A dynamic single cell-based framework for digital twins to prioritize disease genes and drug targets. Genome Med, 2022. 14(1): p. 48. [CrossRef]
  26. Ward, T.M., et al., Computer vision in surgery. Surgery, 2021. 169(5): p. 1253-1256. [CrossRef]
  27. Kather, J.N., et al., Medical domain knowledge in domain-agnostic generative AI. NPJ Digit Med, 2022. 5(1): p. 90. [CrossRef]
  28. Xiao, Z., et al., Generative Artificial Intelligence GPT-4 Accelerates Knowledge Mining and Machine Learning for Synthetic Biology. ACS Synth Biol, 2023. 12(10): p. 2973-2982. [CrossRef]
  29. .
  30. (MGI), M.G.I. Generative AI in the pharmaceutical industry: Moving from hype to reality. 2024; Available from: https://www.mckinsey.com/industries/life-sciences/our-insights/generative-ai-in-the-pharmaceutical-industry-moving-from-hype-to-reality#/.
  31. Wenzel, M., Generative Adversarial Networks and Other Generative Models, in Machine Learning for Brain Disorders, O. Colliot, Editor. 2023: New York, NY. p. 139-92.
  32. Li, C., et al., Triple Generative Adversarial Networks. IEEE Trans Pattern Anal Mach Intell, 2022. 44(12): p. 9629-9640. [CrossRef]
  33. Zhong, G., et al., Generative adversarial networks with decoder-encoder output noises. Neural Netw, 2020. 127: p. 19-28. [CrossRef]
  34. Blanchard, A.E., C. Stanley, and D. Bhowmik, Using GANs with adaptive training data to search for new molecules. J Cheminform, 2021. 13(1): p. 14. [CrossRef]
  35. Yu, H., K. Li, and J. Shi, DGANDDI: Double Generative Adversarial Networks for Drug-Drug Interaction Prediction. IEEE/ACM Trans Comput Biol Bioinform, 2023. 20(3): p. 1854-1863. [CrossRef]
  36. Hussain, S., et al., High-content image generation for drug discovery using generative adversarial networks. Neural Netw, 2020. 132: p. 353-363. [CrossRef]
  37. Bian, Y. and X.Q. Xie, Generative chemistry: drug discovery with deep learning generative models. J Mol Model, 2021. 27(3): p. 71. [CrossRef]
  38. Tong, X., et al., Generative Models for De Novo Drug Design. J Med Chem, 2021. 64(19): p. 14011-14027. [CrossRef]
  39. Lin, E., C.H. Lin, and H.Y. Lane, De Novo Peptide and Protein Design Using Generative Adversarial Networks: An Update. J Chem Inf Model, 2022. 62(4): p. 761-774. [CrossRef]
  40. Macedo, B., I. Ribeiro Vaz, and T. Taveira Gomes, MedGAN: optimized generative adversarial network with graph convolutional networks for novel molecule design. Sci Rep, 2024. 14(1): p. 1212. [CrossRef]
  41. Zhang, Z., P. Cui, and W. Zhu, Deep Learning on Graphs: A Survey. IEEE Trans Knowl Data Eng, 2022. 34(1): p. 249-270.
  42. Lin, E., C.H. Lin, and H.Y. Lane, Relevant Applications of Generative Adversarial Networks in Drug Design and Discovery: Molecular De Novo Design, Dimensionality Reduction, and De Novo Peptide and Protein Design. Molecules, 2020. 25(14).
  43. Kao, P.Y., et al., Exploring the Advantages of Quantum Generative Adversarial Networks in Generative Chemistry. J Chem Inf Model, 2023. 63(11): p. 3307-3318. [CrossRef]
  44. Niu, M.Y., et al., Entangling Quantum Generative Adversarial Networks. Phys Rev Lett, 2022. 128(22): p. 220505. [CrossRef]
  45. Tian, J., et al., Recent Advances for Quantum Neural Networks in Generative Learning. IEEE Trans Pattern Anal Mach Intell, 2023. 45(10): p. 12321-12340. [CrossRef]
  46. Marino, J., Predictive Coding, Variational Autoencoders, and Biological Connections. Neural Comput, 2021. 34(1): p. 1-44. [CrossRef]
  47. Zhang, Y., et al., Drug-protein interaction prediction via variational autoencoders and attention mechanisms. Front Genet, 2022. 13: p. 1032779. [CrossRef]
  48. Li, T., X.M. Zhao, and L. Li, Co-VAE: Drug-Target Binding Affinity Prediction by Co-Regularized Variational Autoencoders. IEEE Trans Pattern Anal Mach Intell, 2022. 44(12): p. 8861-8873. [CrossRef]
  49. Huang, Z., S. Chen, and L. Yu, Predicting new drug indications based on double variational autoencoders. Comput Biol Med, 2023. 164: p. 107261. [CrossRef]
  50. Ochiai, T., et al., Variational autoencoder-based chemical latent space for large molecular structures with 3D complexity. Commun Chem, 2023. 6(1): p. 249. [CrossRef]
  51. Chow, Y.L., et al., Predicting drug polypharmacology from cell morphology readouts using variational autoencoder latent space arithmetic. PLoS Comput Biol, 2022. 18(2): p. e1009888.
  52. Ang, D., C. Rakovski, and H.S. Atamian, De Novo Drug Design Using Transformer-Based Machine Translation and Reinforcement Learning of an Adaptive Monte Carlo Tree Search. Pharmaceuticals (Basel), 2024. 17(2). [CrossRef]
  53. Grechishnikova, D., Transformer neural network for protein-specific de novo drug generation as a machine translation problem. Sci Rep, 2021. 11(1): p. 321. [CrossRef]
  54. Kalakoti, Y., S. Yadav, and D. Sundar, TransDTI: Transformer-Based Language Models for Estimating DTIs and Building a Drug Recommendation Workflow. ACS Omega, 2022. 7(3): p. 2706-2717. [CrossRef]
  55. Shiju, A. and Z. He, Classifying Drug Ratings Using User Reviews with Transformer-Based Language Models. IEEE Int Conf Healthc Inform, 2022. 2022: p. 163-169.
  56. Zhang, S., et al., Applications of transformer-based language models in bioinformatics: a survey. Bioinform Adv, 2023. 3(1): p. vbad001. [CrossRef]
  57. Jiang, L., et al., DeepTTA: a transformer-based model for predicting cancer drug response. Brief Bioinform, 2022. 23(3). [CrossRef]
  58. Hu, J., et al., DTSyn: a dual-transformer-based neural network to predict synergistic drug combinations. Brief Bioinform, 2022. 23(5). [CrossRef]
  59. Mao, J., et al., Transformer-Based Molecular Generative Model for Antiviral Drug Design. J Chem Inf Model, 2023. [CrossRef]
  60. Hugo Larochelle, P., Classification using discriminative restricted Boltzmann machines.
  61. Max Welling, G.E.H., A New Learning Algorithm for Mean Field Boltzmann Machines.
  62. Max Welling, G.E.H., Restricted Boltzmann machines for collaborative filtering. 2002.
  63. Wang, Y. and J. Zeng, Predicting drug-target interactions using restricted Boltzmann machines. Bioinformatics, 2013. 29(13): p. i126-34. [CrossRef]
  64. Qian, Y., et al., Identification of drug-side effect association via restricted Boltzmann machines with penalized term. Brief Bioinform, 2022. 23(6). [CrossRef]
  65. Cheng, X., et al., Neighborhood-based inference and restricted Boltzmann machine for microbe and drug associations prediction. PeerJ, 2022. 10: p. e13848. [CrossRef]
  66. Bongini, P., et al., A Deep Learning Approach to the Prediction of Drug Side-Effects on Molecular Graphs. IEEE/ACM Trans Comput Biol Bioinform, 2023. 20(6): p. 3681-3690. [CrossRef]
  67. Wu, Z., et al., A Comprehensive Survey on Graph Neural Networks. IEEE Trans Neural Netw Learn Syst, 2021. 32(1): p. 4-24.
  68. Carlo Abate, S.D., Andrea Cavalli, Graph neural networks for conditional de novo drug design. Wiley Computational Molecular Science, 2023.
  69. Jie Zhou, G.C., Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, Maosong Sun, Graph neural networks: A review of methods and applications. 2020. [CrossRef]
  70. Xiong, J., et al., Graph neural networks for automated de novo drug design. Drug Discov Today, 2021. 26(6): p. 1382-1393. [CrossRef]
  71. Feng, J., Y. Liang, and T. Yu, MM-GANN-DDI: Multimodal Graph-Agnostic Neural Networks for Predicting Drug-Drug Interaction Events. Comput Biol Med, 2023. 166: p. 107492. [CrossRef]
  72. D'Souza, S., P. Kv, and S. Balaji, Training recurrent neural networks as generative neural networks for molecular structures: how does it impact drug discovery? Expert Opin Drug Discov, 2022. 17(10): p. 1071-1079.
  73. Xia, X., et al., Graph-based generative models for de Novo drug design. Drug Discov Today Technol, 2019. 32-33: p. 45-53. [CrossRef]
  74. Moret, M., et al., Leveraging molecular structure and bioactivity with chemical language models for de novo drug design. Nat Commun, 2023. 14(1): p. 114. [CrossRef]
  75. Schenone, M., et al., Target identification and mechanism of action in chemical biology and drug discovery. Nat Chem Biol, 2013. 9(4): p. 232-40. [CrossRef]
  76. Nikita Janakarajan, T.E., Sarath Swaminathan, Teodoro Laino, Jannis Born, Language models in molecular discovery. 2023.
  77. Bajorath, J., Chemical language models for molecular design. Mol Inform, 2024. 43(1): p. e202300288. [CrossRef]
  78. Liu, Z., et al., AI-based language models powering drug discovery and development. Drug Discov Today, 2021. 26(11): p. 2593-2607. [CrossRef]
  79. Giorgi, J.M. and G.D. Bader, Towards reliable named entity recognition in the biomedical domain. Bioinformatics, 2020. 36(1): p. 280-286. [CrossRef]
  80. Blanco, A., et al., Boosting ICD multi-label classification of health records with contextual embeddings and label-granularity. Comput Methods Programs Biomed, 2020. 188: p. 105264. [CrossRef]
  81. Dias, R. and A. Torkamani, Artificial intelligence in clinical and genomic diagnostics. Genome Med, 2019. 11(1): p. 70. [CrossRef]
  82. Hall, J.L., et al., Merging Electronic Health Record Data and Genomics for Cardiovascular Research: A Science Advisory From the American Heart Association. Circ Cardiovasc Genet, 2016. 9(2): p. 193-202.
  83. Harrer, S., et al., Artificial Intelligence for Clinical Trial Design. Trends Pharmacol Sci, 2019. 40(8): p. 577-591. [CrossRef]
  84. Xiangxiang Zeng, F.W., Yuan Luo, Seung-gu Kang, Jian Tang, Felice C. Lightstone, Evandro F. Fang, Wendy Cornell, Ruth Nussinov, Feixiong Cheng, Deep generative molecular design reshapes drug discovery. 2022. [CrossRef]
  85. Y. Luo, A.E., N. Palmer, P. Avillach, A. Levy-Moonshine, P. Szolovits, I.S. Kohane, A multidimensional precision medicine approach identifies an autism subtype characterized by dyslipidemia. Nat Med, 2020. 26: p. 1375-1379. [CrossRef]
  86. Shengchao Liu, H.W., Weiyang Liu, Joan Lasenby, Hongyu Guo, Jian Tang, Pre-training Molecular Graph Representation with 3D Geometry. 2021.
  87. M. Manica, A.O., J. Born, V. Subramanian, J. Sáez-Rodríguez, M. Rodríguez Martínez, Toward explainable anticancer compound sensitivity prediction via multimodal attention-based convolutional encoders. Mol Pharm, 2019: p. 4797-4806. [CrossRef]
  88. Wengong Jin, K.Y., Regina Barzilay, Tommi Jaakkola, Learning Multimodal Graph-to-Graph Translation for Molecular Optimization. 2018.
  89. Baltrusaitis, T., C. Ahuja, and L.P. Morency, Multimodal Machine Learning: A Survey and Taxonomy. IEEE Trans Pattern Anal Mach Intell, 2019. 41(2): p. 423-443. [CrossRef]
  90. Baltrusaitis, T., C. Ahuja, and L.P. Morency, Multimodal Machine Learning: A Survey and Taxonomy. IEEE Trans Pattern Anal Mach Intell, 2019. 41(2): p. 423-443. [CrossRef]
  91. Steyaert, S., et al., Multimodal data fusion for cancer biomarker discovery with deep learning. Nat Mach Intell, 2023: p. 351-362. [CrossRef]
  92. Yizhen Luo, X.Y.L., Kai Yang, Kui Huang, Massimo Hong, Jiahuan Zhang, Yushuai Wu, Zaiqing Nie, Toward Unified AI Drug Discovery with Multimodal Knowledge. 2024.
  93. Skalic, M., et al., From Target to Drug: Generative Modeling for the Multimodal Structure-Based Ligand Design. Mol Pharm, 2019. 16(10): p. 4282-4291. [CrossRef]
  94. Buchberger, A.R., et al., Mass Spectrometry Imaging: A Review of Emerging Advancements and Future Insights. Anal Chem, 2018. 90(1): p. 240-265. [CrossRef]
  95. Loscher, W., Single-Target Versus Multi-Target Drugs Versus Combinations of Drugs With Multiple Targets: Preclinical and Clinical Evidence for the Treatment or Prevention of Epilepsy. Front Pharmacol, 2021. 12: p. 730257. [CrossRef]
  96. Marecek, R., et al., Automated fusion of multimodal imaging data for identifying epileptogenic lesions in patients with inconclusive magnetic resonance imaging. Hum Brain Mapp, 2021. 42(9): p. 2921-2930. [CrossRef]
  97. Laubenbacher, R., et al., Building digital twins of the human immune system: toward a roadmap. NPJ Digit Med, 2022. 5(1): p. 64. [CrossRef]
  98. Cockrell, C. and G. An, Utilizing the Heterogeneity of Clinical Data for Model Refinement and Rule Discovery Through the Application of Genetic Algorithms to Calibrate a High-Dimensional Agent-Based Model of Systemic Inflammation. Front Physiol, 2021. 12: p. 662845. [CrossRef]
  99. Polasek, T.M. and A. Rostami-Hodjegan, Virtual Twins: Understanding the Data Required for Model-Informed Precision Dosing. Clin Pharmacol Ther, 2020. 107(4): p. 742-745. [CrossRef]
  100. Patel, N., et al., Real Patient and its Virtual Twin: Application of Quantitative Systems Toxicology Modelling in the Cardiac Safety Assessment of Citalopram. AAPS J, 2017. 20(1): p. 6. [CrossRef]
  101. Chasseloup, E., A.C. Hooker, and M.O. Karlsson, Generation and application of avatars in pharmacometric modelling. J Pharmacokinet Pharmacodyn, 2023. 50(5): p. 411-423. [CrossRef]
  102. Shalek, A.K. and M. Benson, Single-cell analyses to tailor treatments. Sci Transl Med, 2017. 9(408). [CrossRef]
  103. Venkatesh, K.P., G. Brito, and M.N. Kamel Boulos, Health Digital Twins in Life Science and Health Care Innovation. Annu Rev Pharmacol Toxicol, 2024. 64: p. 159-170. [CrossRef]
  104. Fogel, D.B., Factors associated with clinical trials that fail and opportunities for improving the likelihood of success: A review. Contemp Clin Trials Commun, 2018. 11: p. 156-164. [CrossRef]
  105. Schutt, M., et al., Development of a digital twin of a tablet that mimics a real solid dosage form: Differences in the dissolution profile in conventional mini-USP II and a biorelevant colon model. Eur J Pharm Sci, 2022. 179: p. 106310. [CrossRef]
  106. Subramanian, K., Digital Twin for Drug Discovery and Development—The Virtual Liver. 2020. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.