Computational Advancements in Drug Repurposing for Cancer Combination Therapy Prediction

Victoria L. Flanary; Jennifer L. Fisher; Elizabeth J. Wilk; Timothy C. Howton; Brittany N. Lasseigne

doi:10.20944/preprints202305.1637.v1

Submitted: 22 May 2023

Posted: 23 May 2023

Abstract

As cancer remains resistant to several modes of treatment, novel therapeutics are still under active investigation to overcome treatment inefficacy. Given the high attrition rate of de novo drug discovery, drug screening and drug repurposing have offered time- and cost-effective alternative strategies for the identification of potentially effective therapeutics. In contrast to large-scale drug screens, computational approaches for drug repurposing leverage the increasing amounts of biomedical data to predict candidate therapeutic agents prior to testing in biological models. Current studies in drug repurposing for cancer therapy prediction have increasingly focused on the prediction of combination therapies, as combination therapies have numerous advantages over monotherapies. These include increased effect from synergistic interactions, reduced toxicity from lowered drug doses, and a reduced risk of resistance due to multiple non-overlapping mechanisms of action. This review provides a summary of several classes of computational methods used for drug combination therapy prediction in cancer research, including networks, regression-based machine learning, classifier machine learning models, and deep learning approaches, with the goal of presenting current progress in the field, particularly to non-computational cancer biologists. We conclude by discussing the need for further advancements in technologies that incorporate disease mechanisms, drug characteristics, multi-omics data, and clinical considerations to generate effective patient-specific drug combinations, as holistic data integration will inevitably result in optimal targeted therapeutics for cancer.

Keywords: computational; drug repurposing; drug repositioning; cancer; combination therapy; network biology; machine learning; deep learning; precision oncology

Subject: Medicine and Pharmacology - Oncology and Oncogenics

Introduction

Although novel therapeutic approaches have revolutionized the treatment of cancer, cancer remains highly resistant to current therapies, with drug resistance and the resultant treatment inefficacy responsible for upward of 90% of cancer-related deaths [1,2,3,4,5]. Given the high attrition rate of de novo drug discovery, with approximately 90% of drug candidates that enter clinical trials failing to reach approval despite average investments of $3 billion across 13-15 years per compound, many researchers have turned to drug repositioning as a time- and cost-effective alternative [6,7,8]. Drug repositioning, also referred to as drug repurposing, is the practice of identifying novel indications for already FDA-approved drugs [9]. Many well-known instances of drug repositioning have occurred as incidental clinical findings. One example is thalidomide, a drug once marketed to treat morning sickness in pregnant women that was later found to have therapeutic effects in multiple myeloma and metastatic prostate cancer [10,11,12]. However, these serendipitous findings are rare, and as such, many have turned to phenotypic drug screening or computational drug repositioning to systematically predict effective drug candidates from large-scale drug and disease datasets [9]. While in vitro drug screening assays have the advantage of providing direct knowledge of drug effects across various dosages within a biological disease model, computational drug repurposing methods can leverage the ever-growing amount of available biomedical data to identify novel drug candidates [9,13].

Most drug repurposing studies have focused on the identification of single agents to treat a target disease. However, for complex and heterogeneous diseases like cancer, it is unlikely that a single agent can target every cancer cell or relevant disease pathway, thus resulting in residual treatment-resistant cells that can lead to tumor recurrence. Therefore, recent advances in drug repositioning efforts for cancer have prioritized the prediction of multi-targeted combination therapies to both enhance treatment efficacy and reduce the risk of monotherapy resistance [14,15]. Combination therapies also reduce the toxicity of treatment regimens by maintaining therapeutic efficacy at lower doses of individual agents within the drug regimen [15].

Computational drug combination prediction approaches have generally been designed to achieve one of two goals: to target parallel disease pathways or to maximize computational scores of therapeutic efficacy [16]. As opposed to the multi-pathway approach, which aims to identify non-overlapping targets to maximize the efficacy and minimize the toxicity of candidate drug treatments, computational score-based approaches aim to maximize either the synergy or sensitivity scores of a drug combination. Sensitivity refers to the degree of treatment response measured by the percent inhibition of cell viability or growth in in vitro experiments [17]. Synergy is a type of drug-drug interaction in which the effect of a drug combination is greater than the additive effect of individual drugs in the combination [17]. This is the most common goal for drug combination prediction approaches, as achieving synergistic interactions may maximize the efficacy of drug treatments. Synergy is measured by several metrics, including Loewe additivity, Bliss independence, highest single agent (HSA), and the Chou-Talalay method, all of which have been concisely reviewed by Pemovska et al. [18].
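
For readers who prefer formulas, the four reference models named above are commonly written as shown below. This is a summary under stated assumptions rather than a restatement of reference [18]. The symbols are introduced here for illustration only: $E_A$ and $E_B$ are the fractional inhibitions (between 0 and 1) produced by each drug alone, $d_A$ and $d_B$ are the doses used in the combination, and $D_A$ and $D_B$ are the doses of each single drug that would produce the same effect as the combination.

```latex
\begin{align*}
\text{Highest single agent (HSA):} \quad & E_{AB}^{\text{expected}} = \max(E_A, E_B) \\
\text{Bliss independence:} \quad & E_{AB}^{\text{expected}} = E_A + E_B - E_A E_B \\
\text{Loewe additivity:} \quad & \frac{d_A}{D_A} + \frac{d_B}{D_B} = 1 \\
\text{Chou-Talalay combination index:} \quad & \mathrm{CI} = \frac{d_A}{D_A} + \frac{d_B}{D_B}
\end{align*}
```

Under HSA and Bliss independence, a combination is typically called synergistic when its observed effect exceeds the expected effect. Under Loewe additivity and the Chou-Talalay method, a dose sum (or CI) below 1 is usually read as synergy, a value of 1 as additivity, and a value above 1 as antagonism.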

This review will discuss computational drug combination prediction methods that aim to identify effective multi-drug regimens for cancer across both of these goals (Table 1). First, we describe various networks-based methods that examine the relationships between biological entities, such as genes, proteins, pathways, and phenotypes, to design combinatorial therapies that target optimal disease- and drug-related elements. Then, we review regression-based machine learning models that predict missing dose response values (i.e. sensitivity) for synergy calculations in drug pairs and multi-drug cocktails. This is followed by a review of classifier-based machine learning models capable of predicting new drug combinations through both drug targets and synergy calculations. The last category of computational approaches described in this review is deep learning methods, which build upon the previously described machine learning models to tackle larger and more complex data. We conclude this review with suggestions for future directions by which computational drug combination prediction methods may be improved to enhance their utility and translation to the clinic.

Networks-Based Models

Given the degree of intercellular heterogeneity in cancer, it is helpful to understand how various molecular entities within cancer cells and their microenvironment interact with one another. Networks are graphical models that show associations between molecular entities within complex systems. These networks consist of nodes, which are units that symbolize molecules like proteins or larger concepts like phenotypes, and edges, which are lines that depict relationships between the nodes they connect. There are different categories of networks that represent various types of relationships between different entities. For example, protein-protein interaction networks depict unique proteins as nodes and the physical interactions between proteins as edges. Edges can also be weighted to quantify relationships between nodes in a network. Several network analysis algorithms, such as information propagation procedures, have been developed to take advantage of these weighted edges and provide further information on systems of interest. The types of networks that are most commonly used to identify novel drug repurposing candidates include drug-disease, drug-target, drug-drug, and protein-protein interaction networks [19]. These networks can be integrated into heterogeneous, multi-omic graphs to reveal how therapeutic agents interact with biological systems through their direct targets. Additionally, these systems-level interactions can predict therapeutic mechanisms of action, adverse events, and alternative applications of FDA-approved drugs and potentially synergistic drug combinations.
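
To make the idea of information propagation over weighted edges concrete, the following is a minimal sketch of one common propagation procedure, a random walk with restart. It is not taken from any of the cited studies. The node names, edge weights, and restart probability are hypothetical and chosen only for illustration.

```python
def propagate(graph, seeds, restart=0.3, iterations=100):
    """Random walk with restart on a weighted, undirected graph.

    graph maps each node to a dict of {neighbor: edge weight}; every edge
    must be listed in both directions with the same weight.
    """
    nodes = list(graph)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(start)
    # Total edge weight attached to each node, used to normalize outgoing flow.
    strength = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Score arriving at n from each neighbor, in proportion to edge weight.
            incoming = sum(score[m] * w / strength[m] for m, w in graph[n].items())
            updated[n] = (1 - restart) * incoming + restart * start[n]
        score = updated
    return score


# Hypothetical four-node chain: target - A - B - C (weights are made up).
toy_graph = {
    "target": {"A": 0.9},
    "A": {"target": 0.9, "B": 0.5},
    "B": {"A": 0.5, "C": 0.2},
    "C": {"B": 0.2},
}
scores = propagate(toy_graph, seeds={"target"})
```

In this toy chain, the score decays with distance from the seed node, which is the behavior that lets propagation methods rank nodes by how strongly they are connected to a drug target or disease gene of interest.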

Disease modules, which are subnetworks within a biological graph enriched in genes that are associated with disease etiology and progression, may be utilized in network-based drug combination repurposing [20]. Some network-based methods for drug combination prediction use the proximity (i.e., the shortest distance between nodes of interest) of drug targets to disease modules within a network to predict candidate therapies. This is based on the premise that drug combinations with targets contained within the same disease module will have increased efficacy. Cheng et al. created a separation metric to determine the distance between drug-target and disease modules [22]. After comparing six classes of drug-drug-disease relationships, they found that the only class that correlated with therapeutic effect was complementary exposure, the situation in which the targets of both drugs overlap with the disease module within the interactome but not with each other [21]. In a subsequent work, Federico et al. generated integrated disease networks by combining protein-protein interactions, gene co-expression, and gene regulation. They then prioritized drug combinations for several cancers (breast cancer, hepatocellular carcinoma, prostate adenocarcinoma, stomach adenocarcinoma, colon adenocarcinoma, and lung adenocarcinoma) by accounting for the drugs’ mechanisms of action, secondary structures, and the drug targets’ shortest paths within the network. Following the work by Cheng et al., drugs whose targets were directly connected were deprioritized because their network neighborhoods overlap. This study also introduced the druggability map, a graphical instrument for prioritizing drug repositioning candidates through the incorporation of both drug and disease characteristics.
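
A network separation score of the kind described above can be sketched as follows. The sketch assumes the commonly used form s_AB = <d_AB> - (<d_AA> + <d_BB>) / 2, where each term is a mean nearest-node distance, and it runs on a hypothetical six-gene chain rather than a real interactome. The function names and gene labels are illustrative and are not taken from the cited studies.

```python
from collections import deque


def distances_from(graph, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def nearest(graph, from_set, to_set, exclude_self=False):
    """For each node in from_set, the distance to the closest node in to_set."""
    result = []
    for node in from_set:
        dist = distances_from(graph, node)
        result.append(min(dist[t] for t in to_set
                          if t in dist and not (exclude_self and t == node)))
    return result


def separation(graph, set_a, set_b):
    """s_AB < 0 suggests overlapping neighborhoods; s_AB >= 0 suggests separation.

    Assumes each set has at least two nodes and all nodes are connected.
    """
    d_aa = nearest(graph, set_a, set_a, exclude_self=True)
    d_bb = nearest(graph, set_b, set_b, exclude_self=True)
    d_ab = nearest(graph, set_a, set_b) + nearest(graph, set_b, set_a)
    mean = lambda values: sum(values) / len(values)
    return mean(d_ab) - (mean(d_aa) + mean(d_bb)) / 2


# Hypothetical chain of six genes: g1 - g2 - g3 - g4 - g5 - g6.
chain = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"],
         "g4": ["g3", "g5"], "g5": ["g4", "g6"], "g6": ["g5"]}
far_apart = separation(chain, {"g1", "g2"}, {"g5", "g6"})    # 2.5
overlapping = separation(chain, {"g2", "g3"}, {"g3", "g4"})  # -0.5
```

Applied to the targets of two drugs and to a disease module, scores like these are one way to express whether two drugs act on the same or on distinct parts of the network.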

Although the groups discussed above used network-based approaches to model general drug and disease entities to predict candidate combination therapies, another group leveraged patient-derived drug response data to prioritize drug repurposing candidates. Jafari et al. used the Beat AML dataset, a cohort of 672 acute myeloid leukemia samples screened for sensitivity to 122 drugs, to generate two bipartite networks: a patient similarity network and a drug similarity network [24]. Analysis of clusters within the patient similarity network identified characteristics and relationships that were used to account for patient heterogeneity in downstream analyses. The drug similarity network contained two distinct clusters of small molecules. The authors reasoned that designing combination therapies by pairing the top candidates from each cluster of the drug similarity network may prevent drug resistance and cancer recurrence. Synergy analysis of these inter-cluster drug combinations in 135 drug-drug-cell line triplets validated their model’s predicted regimens as highly synergistic across multiple synergy metrics, including Loewe additivity, Bliss independence, HSA, and ZIP.

Given recent revelations on the nature of cancer cell plasticity from single-cell RNA sequencing studies, Sarmah et al. aimed to predict drug combination responses using a temporal cell state network model [25]. The authors explored the possibility that the types of cancer cells within a tumor (i.e., the different cell states across cancer cell populations), the speed at which cell state transitions occur, and how drugs affect those transitions may provide valuable information on the response to drug combination therapy. They explored this hypothesis in vitro by testing three kinase inhibitors that each target a different cell cycle transition. They used a Markov model to gauge cell growth and single-agent drug sensitivities and then used this model to predict combinatorial drug responses. They calculated synergy using an excess over Bliss analysis, in which a combination is considered synergistic when its observed response exceeds the response expected if the individual drugs acted independently. Their results suggest that cell state transition dynamics and prior drug response knowledge may inform the response to drug combination therapies.
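
The general idea of a cell state transition model can be sketched with a small discrete-time Markov chain. This is an illustrative toy, not the model of Sarmah et al.: the three states, the transition probabilities, the 60% inhibition values, and the assumption that each drug simply scales one transition are all hypothetical.

```python
def grow(p_g1_s, p_s_g2m, p_divide, steps=48):
    """Expected total cell count after `steps` time steps of a three-state chain."""
    g1, s, g2m = 100.0, 0.0, 0.0
    for _ in range(steps):
        g1, s, g2m = (
            g1 * (1 - p_g1_s) + 2 * g2m * p_divide,  # each division yields two G1 cells
            s * (1 - p_s_g2m) + g1 * p_g1_s,         # G1 cells entering S phase
            g2m * (1 - p_divide) + s * p_s_g2m,      # S cells entering G2/M
        )
    return g1 + s + g2m


# Hypothetical per-step transition probabilities for untreated cells.
BASE = {"p_g1_s": 0.20, "p_s_g2m": 0.25, "p_divide": 0.50}


def treated(**inhibition):
    """Scale each named transition probability by (1 - inhibition)."""
    rates = {name: p * (1 - inhibition.get(name, 0.0)) for name, p in BASE.items()}
    return grow(**rates)


control = treated()
effect_a = 1 - treated(p_g1_s=0.6) / control    # drug A slows the G1 -> S transition
effect_b = 1 - treated(p_s_g2m=0.6) / control   # drug B slows the S -> G2/M transition
effect_ab = 1 - treated(p_g1_s=0.6, p_s_g2m=0.6) / control

# Excess over Bliss: model-predicted combination effect minus the independence expectation.
bliss_expected = effect_a + effect_b - effect_a * effect_b
excess_over_bliss = effect_ab - bliss_expected
```

In an experimental setting, `effect_ab` would be the measured response rather than a model prediction, and a positive `excess_over_bliss` would be read as synergy.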

Regression-Based Machine Learning Models

Regression-based machine learning models are often used in combination with prior knowledge of known drug sensitivities to predict unknown drug responses or to predict responses to drug combinations. These models predict outcomes based on whether a mathematical relationship exists between an independent and a dependent variable, with the most basic of these models fitting to a linear relationship. Linear models have previously been used to reduce technical noise during the production of more robust models, to create full dose-response matrices (matrices that include all dose pairs for a drug combination pair over a desired concentration range) by predicting missing dosages, and to predict synergistic interactions [28]. Although these matrices are required for many synergy calculations, they are difficult to acquire, as manual drug testing becomes costly and impractical with numerous combinations and their replicates across various dosages [28].

One linear model by Amzallag et al. aimed to reduce the noise produced in drug synergy prediction algorithms when the single agent data used for these calculations was captured incorrectly or incompletely [28]. The authors generated a dataset of 439,000 drug response data points from testing all pairwise combinations of 108 drugs across 40 melanoma cell lines. They then applied a linear model based on the Bliss independence synergy metric (which assumes that the drugs act independently, so that the fraction of cells surviving a combination equals the product of the fractions surviving each individual drug) to all cell lines in their dataset. They found that both single agent sensitivity values and synergy values showed significantly high correlations from their linear model, and, using a specificity score, they were able to differentiate true synergistic interactions from instances of potentiation, where the addition of one drug enhances the effect of another while not directly contributing to the effect itself.

Alternatively, Zimmer et al. integrated Bliss independence with a regression model to create the pairs model, which requires relatively few experiments to estimate the effect of multi-drug cocktails [47]. They expanded upon the Bliss formula by employing drug response data from drug pairs to predict the effects of higher-order combinations that contain more than two drugs, as they had found that the interactions between pairs of drugs often predicted the overall effect of the regimens in which those pairs were included. Briefly, the pairs model interpolates smoothly between Bliss independence and a log-linear regression through a single parameter: when the parameter equals 0, the model reduces to Bliss independence; when it equals 1, the model reduces to the log-linear regression; and any value between 0 and 1 blends the two. This parameter is then adjusted based on the number of drugs in the desired drug combination (or cocktail), allowing for the prediction of high-order combinations of up to 6 drugs while requiring only drug pair data as input.

Whereas the previously mentioned regression-based models assume that drug interactions fit a linear relationship by relying on the Bliss independence metric, Bayesian regression can be applied to optimize drug response predictions by assuming drug interactions have nonlinear relationships [30]. Bayesian regression allows for the incorporation of uncertainty into models by estimating probability distributions over parameters, as opposed to using point estimates of parameters to make predictions like linear models. The R programming package Keyboard is a Bayesian regression-based approach developed to derive maximum tolerated doses, optimal dose increases and decreases, and optimal biological doses for single drug and drug combination experiments from clinical trial data [34]. Keyboard combines three previously developed Bayesian-based drug prediction methods into its algorithms [31,32,33]. To predict candidate drug combinations, it considers the drug response profile of a patient cohort to a drug combination at two different dose combinations. It then predicts the maximum toxicity interval based on the updated data from the distribution of the second dose combination. This information allows the model to either increase or decrease doses with each new cohort added to the calculations, which iteratively updates the maximum toxicity interval prediction based on the updated posterior distribution.
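
The iterative posterior updating described above can be illustrated with a simplified, single-agent, interval-based dose-finding rule. This sketch is written in Python for illustration and does not reproduce the API or the exact algorithm of the Keyboard R package. The uniform prior, the target toxicity interval of 0.25 to 0.35, the equal-width intervals, and the function names are all assumptions made here.

```python
import math

# Equal-width toxicity intervals ("keys") from 0.05 to 0.95; assumed for illustration.
KEYS = [(k / 100, (k + 10) / 100) for k in range(5, 95, 10)]
TARGET_INDEX = 2  # the interval (0.25, 0.35), a hypothetical target toxicity range


def beta_pdf(p, a, b):
    """Density of a Beta(a, b) distribution at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def interval_mass(low, high, a, b, steps=2000):
    """Posterior probability that the toxicity rate lies in (low, high), midpoint rule."""
    width = (high - low) / steps
    return width * sum(beta_pdf(low + (i + 0.5) * width, a, b) for i in range(steps))


def dose_decision(toxicities, patients):
    """Update a uniform Beta(1, 1) prior with the observed data and pick an action."""
    a, b = 1 + toxicities, 1 + patients - toxicities  # conjugate beta-binomial update
    masses = [interval_mass(low, high, a, b) for low, high in KEYS]
    strongest = masses.index(max(masses))  # interval holding the most posterior mass
    if strongest < TARGET_INDEX:
        return "escalate"
    if strongest > TARGET_INDEX:
        return "de-escalate"
    return "stay"


decisions = [dose_decision(0, 3), dose_decision(1, 3), dose_decision(3, 3)]
```

Each new cohort changes the counts passed to `dose_decision`, so the posterior distribution, and with it the recommended action, is revised as data accumulate.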

Classifier-Based Machine Learning Models

Whereas machine learning regression-based models aim to predict drug combinations and their interactions by assuming these interactions fit a mathematical relationship, classifier-based approaches specify mathematical boundaries that classify observations into specific categories based on whether they fit into the categories’ specified ranges (e.g., classifying a drug interaction type as additive, synergistic, or antagonistic) [17,48]. In the context of cancer drug combination prediction, these models have been applied to classify drug combination synergy via multiple modalities, including logistic regression, support vector machines, and decision trees.

Iwata et al. used a logistic regression model that incorporated target proteins and anatomical therapeutic chemical codes to predict potentially effective drug combinations for cancer [36]. Logistic regression models are probabilistic classifiers that determine the probability that a new observation will fall into one of a finite number of categories [17,48]. Iwata et al. used approved drug combinations from the FDA Orange Book and KEGG drug databases to train their model, which predicted 142,988 candidate drug combinations from known drug pairs, including some drug regimens for breast and colon cancer [36,49,50]. While the limited complexity of logistic regression classifiers reduces the accuracy of these models, it also enhances their interpretability [17,48].
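
As a minimal sketch of how a logistic regression classifier assigns a probability to a candidate drug pair, consider the toy example below. It does not reproduce the model or features of Iwata et al.: the two binary features (whether a pair shares a target protein, and whether it shares an anatomical therapeutic chemical class), the six training pairs, and their labels are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Probability that a drug pair belongs to the positive class."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(rows, labels, rate=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the logistic loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        errors = [predict(weights, bias, r) - y for r, y in zip(rows, labels)]
        for j in range(len(weights)):
            weights[j] -= rate * sum(e * r[j] for e, r in zip(errors, rows)) / len(rows)
        bias -= rate * sum(errors) / len(rows)
    return weights, bias


# Hypothetical training pairs: (shares_target, shares_atc_class) -> known combination?
pairs = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 1), (0, 0)]
known = [1, 1, 0, 0, 1, 0]
weights, bias = train(pairs, known)

p_shared_target = predict(weights, bias, (1, 0))
p_shared_class_only = predict(weights, bias, (0, 1))
```

Because the fitted weights can be read directly, it is easy to see which feature drives a prediction, which reflects the interpretability noted above.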

A more complex classifier model used for drug combination prediction is the support vector machine. Support vector machines (SVMs) are based on kernel functions, which include a variety of mathematical functions used to transform data from a lower to higher dimensionality [35]. Cüvitoğlu and Işik used this classifier method to identify potentially effective antineoplastic drug pairs using single agent gene expression and biological network data [37]. SVMs have also been used in other cancer applications, such as in the identification of cancer methylation signatures, in the prediction of response to chemotherapy, and for analyzing the risk of treatment resistance and tumor recurrence [51,52,53,54]. However, the accuracy of SVMs is often still less than that of complex decision tree-based models such as random forest or XGBoost [55].

Decision trees are a relatively popular classifier machine learning model in which data enter at a root node and are passed along branches, each representing a test rule, until the model reaches a decision at a leaf node [35]. Each leaf node corresponds to one of the categories of interest into which observations in the data are classified. Approaches based on decision trees include random forest models, gradient boosting, and XGBoost [17,48]. These models are all ensemble approaches, meaning that each model is a combination of several less complex models, where each sub-model is a decision tree. Random forests train each tree in the ensemble independently on a random subset of the data and then use the majority decision across the trees to place each observation into a classification category [56]. While random forest models combine their sub-models in parallel, gradient boosting and XGBoost combine their decision trees in series [56,57]. This allows each sequential sub-model to improve upon the prediction of the previous sub-model. XGBoost additionally applies regularization, which penalizes overly complex trees and thereby improves the generalizability of the model to datasets outside of those used for training compared to standard gradient boosting [57].
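
The parallel, majority-vote logic of a random forest can be sketched with the simplest possible tree, a one-split decision stump. This is a toy illustration rather than a production implementation: the single feature, the six labeled drug pairs, and the number of trees are hypothetical, and a real random forest also samples features and grows deeper trees.

```python
import random
from collections import Counter


def fit_stump(points):
    """Choose the threshold on one feature that misclassifies the fewest points."""
    best_errors, best_threshold = None, None
    for threshold, _ in points:
        errors = sum((x >= threshold) != label for x, label in points)
        if best_errors is None or errors < best_errors:
            best_errors, best_threshold = errors, threshold
    return best_threshold


def forest_predict(points, x, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample, then take the majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_trees):
        sample = [rng.choice(points) for _ in points]  # sampling with replacement
        votes[x >= fit_stump(sample)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical data: (feature value for a drug pair, was the pair synergistic?)
data = [(0.10, False), (0.20, False), (0.35, False),
        (0.60, True), (0.70, True), (0.90, True)]
high = forest_predict(data, 0.95)
low = forest_predict(data, 0.05)
```

A boosting method would differ in the loop: instead of training every tree independently and voting, it would fit each new tree to the errors left by the trees before it.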

Celebi et al. compared several machine learning methods to discern which model performed best in predicting synergistic anti-cancer drug combinations [38]. Although random forest and XGBoost both performed better than linear regression or support vector machines, XGBoost outperformed random forest after the models were tuned to maximize their performance, so the authors proceeded with XGBoost for all downstream analyses. While decision tree-based methods are interpretable and perform well, their accuracy is generally lower than that of deep learning approaches.

Deep Learning Models

Deep learning refers to a subclass of machine learning methods capable of handling large amounts of multi-dimensional data that often overwhelm other machine learning methods [40]. Deep learning models are based on units of artificial neural networks, which are multi-layered networks composed of several processing layers. These layers allow the model to learn and make predictions from complex mathematical functions [39]. Not only can deep learning incorporate larger quantities and more complex data types than other machine learning methods, but this ability to use multi-faceted data also allows deep learning methods to discern significant biological relationships that may not be detected by other machine learning approaches [40]. However, the disadvantage of using numerous features in creating a deep learning model is that it may result in overfitting, an issue in machine learning where the model is fitted too closely to the data set used to train it and is thus unable to generate accurate results for new data sets [58]. Generalizability is thus a concern when developing deep learning models. Another limitation of deep learning techniques is simply the lack of adequate data for these models, as most deep learning approaches for predicting drug response are trained on limited numbers of cell lines. This reduces their generalizability to highly heterogeneous patient tumors [43]. This is further exemplified by Prasse et al.’s study, which found that fine-tuning deep neural networks on patient-derived data improves the accuracy of antineoplastic drug response predictions [59].

Despite these limitations, deep learning has still been immensely useful for advancing precision oncology. Deep learning has not only been used to predict several pharmacodynamic properties for drug discovery purposes, such as drug activity and toxicity, but it has also been shown to out-perform other machine learning methods for these tasks as well [60,61,62,63,64]. In the context of cancer drug combination therapy, there have been several tools developed in recent years to predict potentially efficacious drug combination therapies for cancer, using already known antineoplastics or repurposing other approved medications for the disease.

DeepSynergy is regarded as the first deep learning approach developed for the prediction of drug combination synergies. DeepSynergy is a feed-forward neural network that takes the chemical descriptors of each drug and the genomic information of the cell line as inputs to calculate synergy scores of drug combinations for cancer cell lines [41]. Another example, CCSynergy, is a deep learning approach that uses drug bioactivity profiles from Chemical Checker for drug synergy prediction; its use to explore the untested combinatorial space revealed a compendium of potentially synergistic drug combinations across hundreds of cancer cell lines [44]. More recently, MARSY, a deep learning multi-task model that incorporates the gene expression profiles of cancer cell lines with drug perturbation profiles (i.e., the changes in gene expression of a cell line after drug treatment), was developed to predict synergy scores [46]. While relatively few deep learning approaches have been developed so far for cancer drug combination prediction, more methods are in active development that aim to incorporate multi-omic features to identify patient-specific anti-cancer drug regimens [45].
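
The input structure of a feed-forward synergy predictor of this kind can be sketched as follows. This is not the published DeepSynergy architecture: the feature vectors, layer sizes, and function names are hypothetical, and the weights are random and untrained, so the output only demonstrates the data flow and has no predictive meaning.

```python
import random


def relu(value):
    return max(0.0, value)


def identity(value):
    return value


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Untrained layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_synergy(drug_a, drug_b, cell_line, layers):
    """Concatenate both drugs' descriptors with cell line features, then feed forward."""
    features = drug_a + drug_b + cell_line
    hidden = dense(features, *layers[0], relu)
    return dense(hidden, *layers[1], identity)[0]


rng = random.Random(0)
# Hypothetical inputs: 3 chemical descriptors per drug, 4 genomic features per cell line.
drug_a = [0.2, 0.7, 0.1]
drug_b = [0.9, 0.3, 0.5]
cell_line = [1.2, 0.0, 0.4, 0.8]
layers = [random_layer(10, 6, rng), random_layer(6, 1, rng)]
score = predict_synergy(drug_a, drug_b, cell_line, layers)
```

In practice, the weights would be learned from measured synergy scores, and the input vectors would contain many more chemical and genomic features.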

Discussion

We have described several computational methods developed to predict synergistic drug combinations to further precision oncology, including networks and machine-learning methods, such as regression models, classifier models, and deep learning frameworks. Each of these methods uses mathematical principles to complete various tasks in drug combination therapy prediction. Networks-based models allow for the visualization of patterns between drug and disease entities to identify candidate targets and therapies. Regression-based approaches can predict missing values in dose-response matrices to improve drug synergy calculations. Classifier methods and neural networks can predict potential anti-cancer therapies by sorting drug and disease data into categories. As these models are intended to perform specific tasks, the purpose of the study must be carefully considered when determining which of these methods to implement in one’s own research. As noted by the DREAM Challenges, which compared several drug combination prediction tools for precision oncology against one another, the specific prediction algorithm matters far less than the principles it is based on and how it can be applied [65,66].

Although the vast majority of drug combination prediction methods are still in the preclinical testing phase, they may soon transition to testing in randomized clinical trials. Recent clinical trials have shown promising results for the future of precision oncology as a whole. For example, the I-PREDICT and ongoing NCI-comboMATCH trials utilize next-generation sequencing (NGS)-guided matching protocols to pair patients to drug combination therapies. The results from the I-PREDICT study showed that a higher degree of matching correlated to improved patient outcomes, thus supporting the efficacy of precision combination therapy in clinical settings [67,68]. The NCI-MATCH study utilized similar NGS methods on patient tumors to identify actionable genomic mutations across several cancer types. Although the patients treated by NGS-guided monotherapies showed improved progression-free survival compared to unmatched patients, only 3% of patients with refractory malignancies carried actionable mutations, demonstrating a need to broaden the scope of signature matching for candidate therapies via multi-omics integration [69]. The WINTHER trial was the first to match patients to drug combination therapies using a matching score based on both genomics and transcriptomics data. Not only was a higher matching score correlated with improved progression-free survival, but a significantly larger percentage of the patient cohort was able to be matched to targeted therapy regimens compared to the previously discussed NGS-guided trials, thus supporting the utility of multi-omics integration in guiding drug therapy prediction for cancer [70].

More recent studies have attempted to expand the scope of multi-omics in combination therapy prediction even further. REFLECT is a machine learning method that incorporates mutational, copy number, transcriptomic, and phosphoproteomics data to generate detailed co-alteration signatures for therapy prediction [71]. Another needed advancement for cancer combination therapy prediction is methods that can be used to monitor disease progression and response to treatment over time, such as Eduati et al.’s approach, which utilizes microfluidics and logic-based models to predict treatments for different stages of pancreatic cancer [72]. Other considerations when developing new drug combination prediction models include the different interactions that can occur between drug combinations across drug dosages [29,73,74], the specificity the predicted drug regimens have for the disease over normal tissue [75,76,77], as well as increased emphasis on prioritizing candidate regimens with maximal efficacy and minimal toxicity, as most current studies attempt to maximize the synergy of drug combination without regard to the fact that this may compound toxicity as well, reducing the tolerability and clinical utility of the proposed therapy [78]. As computational methods improve to better incorporate patient-derived multi-omics data, disease-specific context, and pharmacodynamic considerations, more comprehensive models can be generated to predict effective drug regimens for complex diseases like cancer, reducing drug development time and cost and improving patient outcomes.

Summary:

Computational drug repurposing is a time- and cost-effective alternative that complements de novo drug discovery.
Combination therapies have numerous advantages over monotherapies, including increased effect from synergistic interactions, reduced toxicity from lowered drug doses, and a reduced risk of resistance due to multiple non-overlapping mechanisms of action.
Computational methods used for drug combination therapy prediction in cancer research include networks, regression-based machine learning, classifier machine learning models, and deep learning approaches.
Advancements in technologies that incorporate disease mechanisms, drug characteristics, multi-omics data, and clinical considerations are needed to generate effective patient-specific drug combinations.

Author Contributions

VLF: Conceptualization, Writing - Original Draft, Writing - Review & Editing; JLF: Writing - Original Draft, Writing - Review & Editing; EJW: Writing - Original Draft, Writing - Review & Editing; TCH: Writing - Original Draft, Writing - Review & Editing; BNL: Conceptualization, Writing - Review & Editing, Supervision, Project Administration, Funding Acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by R03OD030604 (to BNL; supported JLF and BNL), UAB Lasseigne Lab Start-Up funds (to BNL; supported JLF, EJW, TCH, and BNL), and 5T32GM008361-31 (supported VLF). The funders had no role in the conceptualization or writing of the manuscript.

References

Housman G, Byler S, Heerboth S, Lapinska K, Longacre M, Snyder N, et al. Drug resistance in cancer: an overview. Cancers. 2014;6: 1769–1792.
Rueff J, Rodrigues AS. Cancer Drug Resistance: A Brief Overview from a Genetic Viewpoint. In: Rueff J, Rodrigues AS, editors. Cancer Drug Resistance: Overviews and Methods. New York, NY: Springer New York; 2016. pp. 1–18.
Holohan C, Van Schaeybroeck S, Longley DB, Johnston PG. Cancer drug resistance: an evolving paradigm. Nat Rev Cancer. 2013;13: 714–726.
Borst P. Cancer drug pan-resistance: pumps, cancer stem cells, quiescence, epithelial to mesenchymal transition, blocked cell death pathways, persisters or what? Open Biol. 2012;2: 120066.
Alfarouk KO, Stock C-M, Taylor S, Walsh M, Muddathir AK, Verduzco D, et al. Resistance to cancer chemotherapy: failure in drug response from ADME to P-gp. Cancer Cell Int. 2015;15: 71.
Scannell JW, Blanckley A, Boldon H, Warrington B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat Rev Drug Discov. 2012;11: 191–200.
Booth B, Zemmel R. Prospects for productivity. Nat Rev Drug Discov. 2004;3: 451–456.
Sun D, Gao W, Hu H, Zhou S. Why 90% of clinical drug development fails and how to improve it? Acta Pharm Sin B. 2022;12: 3049–3062.
Jarada TN, Rokne JG, Alhajj R. A review of computational drug repositioning: strategies, approaches, opportunities, challenges, and directions. J Cheminform. 2020;12: 46.
Shim JS, Liu JO. Recent advances in drug repositioning for the discovery of new anticancer drugs. Int J Biol Sci. 2014;10: 654–663.
Singhal S, Mehta J, Desikan R, Ayers D, Roberson P, Eddlemon P, et al. Antitumor activity of thalidomide in refractory multiple myeloma. N Engl J Med. 1999;341: 1565–1571.
Ning Y-M, Gulley JL, Arlen PM, Woo S, Steinberg SM, Wright JJ, et al. Phase II trial of bevacizumab, thalidomide, docetaxel, and prednisone in patients with metastatic castration-resistant prostate cancer. J Clin Oncol. 2010;28: 2070–2076.
Wilkinson GF, Pritchard K. In vitro screening for drug repositioning. J Biomol Screen. 2015;20: 167–179.
Al-Lazikani B, Banerji U, Workman P. Combinatorial drug therapy for cancer in the post-genomic era. Nat Biotechnol. 2012;30: 679–692.
Ayoub NM. Editorial: Novel Combination Therapies for the Treatment of Solid Cancers. Front Oncol. 2021;11: 708943.
He H, Duo H, Hao Y, Zhang X, Zhou X, Zeng Y, et al. Computational drug repurposing by exploiting large-scale gene expression data: Strategy, methods and applications. Comput Biol Med. 2023;155: 106671.
Güvenç Paltun B, Kaski S, Mamitsuka H. Machine learning approaches for drug combination therapies. Brief Bioinform. 2021;22.
Pemovska T, Bigenzahn JW, Superti-Furga G. Recent advances in combinatorial drug screening and synergy scoring. Curr Opin Pharmacol. 2018;42: 102–110.
Azuaje F. Drug interaction networks: an introduction to translational and clinical applications. Cardiovasc Res. 2013;97: 631–641.
Menche J, Sharma A, Kitsak M, Ghiassian SD, Vidal M, Loscalzo J, et al. Disease networks. Uncovering disease-disease relationships through the incomplete interactome. Science. 2015;347: 1257601.
Cheng F, Kovács IA, Barabási A-L. Network-based prediction of drug combinations. Nat Commun. 2019;10: 1197.
Federico A, Fratello M, Scala G, Möbus L, Pavel A, Del Giudice G, et al. Integrated Network Pharmacology Approach for Drug Combination Discovery: A Multi-Cancer Case Study. Cancers. 2022;14.
Hong Y, Chen D, Jin Y, Zu M, Zhang Y. PINet 1.0: A pathway network-based evaluation of drug combinations for the management of specific diseases. Front Mol Biosci. 2022;9: 971768.
Jafari M, Mirzaie M, Bao J, Barneh F, Zheng S, Eriksson J, et al. Bipartite network models to design combination therapies in acute myeloid leukaemia. Nat Commun. 2022;13: 1–12.
Sarmah D, Meredith WO, Weber IK, Price MR, Birtwistle MR. Predicting anti-cancer drug combination responses with a temporal cell state network model. PLoS Comput Biol. 2023;19: e1011082.
Recanatini M, Menestrina L. Network modeling helps to tackle the complexity of drug-disease systems. WIREs Mech Dis. 2023; e1607.
Wallisch C, Bach P, Hafermann L, Klein N, Sauerbrei W, Steyerberg EW, et al. Review of guidance papers on regression modeling in statistical series of medical journals. PLoS One. 2022;17: e0262918.
Amzallag A, Ramaswamy S, Benes CH. Statistical assessment and visualization of synergies for large-scale sparse drug combination datasets. BMC Bioinformatics. 2019;20: 83.
Zimmer A, Tendler A, Katzir I, Mayo A, Alon U. Prediction of drug cocktail effects when the number of measurements is limited. PLoS Biol. 2017;15: e2002518.
Park M, Nassar M, Vikalo H. Bayesian active learning for drug combinations. IEEE Trans Biomed Eng. 2013;60: 3248–3255.
Li DH, Whitmore JB, Guo W, Ji Y. Toxicity and Efficacy Probability Interval Design for Phase I Adoptive Cell Therapy Dose-Finding Clinical Trials. Clin Cancer Res. 2017;23: 13–20.
Yan F, Mandrekar SJ, Yuan Y. Keyboard: A Novel Bayesian Toxicity Probability Interval Design for Phase I Clinical Trials. Clin Cancer Res. 2017;23: 3994–4003.
Pan H, Lin R, Zhou Y, Yuan Y. Keyboard design for phase I drug-combination trials. Contemp Clin Trials. 2020;92: 105972.
Li C, Sun H, Cheng C, Tang L, Pan H. A software tool for both the maximum tolerated dose and the optimal biological dose finding trials in early phase designs. Contemp Clin Trials Commun. 2022;30: 100990.
Rafique R, Islam SMR, Kazi JU. Machine learning in the prediction of cancer therapy. Comput Struct Biotechnol J. 2021;19: 4003–4017.
Iwata H, Sawada R, Mizutani S, Kotera M, Yamanishi Y. Large-Scale Prediction of Beneficial Drug Combinations Using Drug Efficacy and Target Profiles. J Chem Inf Model. 2015;55: 2705–2716.
Cüvitoğlu A, Işik Z. Classification of effects of drug combinations with support vector machines. 2017 25th Signal Processing and Communications Applications Conference (SIU). 2017. pp. 1–4.
Celebi R, Bear Don’t Walk O, Movva R, Alpsoy S, Dumontier M. In-silico Prediction of Synergistic Anti-Cancer Drug Combinations Using Multi-omics Data. Sci Rep. 2019;9: 1–10.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521: 436–444.
Baptista D, Ferreira PG, Rocha M. Deep learning for drug response prediction in cancer. Brief Bioinform. 2021;22: 360–379.
Preuer K, Lewis RPI, Hochreiter S, Bender A, Bulusu KC, Klambauer G. DeepSynergy: predicting anti-cancer drug synergy with Deep Learning. Bioinformatics. 2018;34: 1538–1546.
Yang J, Xu Z, Wu WKK, Chu Q, Zhang Q. GraphSynergy: a network-inspired deep learning model for anticancer drug combination prediction. J Am Med Inform Assoc. 2021;28: 2336–2345.
Zhang T, Zhang L, Payne PRO, Li F. Synergistic Drug Combination Prediction by Integrating Multiomics Data in Deep Learning Models. Methods Mol Biol. 2021;2194: 223–238.
Hosseini S-R, Zhou X. CCSynergy: an integrative deep-learning framework enabling context-aware prediction of anti-cancer drug synergy. Brief Bioinform. 2023;24.
Sharma A, Lysenko A, Boroevich KA, Tsunoda T. DeepInsight-3D architecture for anti-cancer drug response prediction with deep-learning on multi-omics. Sci Rep. 2023;13: 2483.
El Khili MR, Memon SA, Emad A. MARSY: a multitask deep-learning framework for prediction of drug combination synergy scores. Bioinformatics. 2023;39.
Zimmer A, Tendler A, Katzir I, Mayo A, Alon U. Prediction of drug cocktail effects when the number of measurements is limited. PLoS Biol. 2017;15: e2002518.
James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning: with Applications in R. 2nd ed. Springer Nature; 2021.
Hare D, Foster T. The Orange Book: the Food and Drug Administration’s advice on therapeutic equivalence. Am Pharm. 1990;NS30: 35–37.
Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008;36: D480–D484.
Wang X, Shang W, Li X, Chang Y. Methylation signature genes identification of cancers occurrence and pattern recognition. Comput Biol Chem. 2020;85: 107198.
Jiang Y, Xie J, Huang W, Chen H, Xi S, Han Z, et al. Tumor Immune Microenvironment and Chemosensitivity Signature for Predicting Response to Chemotherapy in Gastric Cancer. Cancer Immunol Res. 2019;7: 2065–2073.
Dorman SN, Baranova K, Knoll JHM, Urquhart BL, Mariani G, Carcangiu ML, et al. Genomic signatures for paclitaxel and gemcitabine resistance in breast cancer derived by machine learning. Mol Oncol. 2016;10: 85–100.
Hu X, Wong KK, Young GS, Guo L, Wong ST. Support vector machine multiparametric MRI identification of pseudoprogression from tumor recurrence in patients with resected glioblastoma. J Magn Reson Imaging. 2011;33: 296–305.
Uddin S, Khan A, Hossain ME, Moni MA. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inform Decis Mak. 2019;19: 281.
Srivastava R, Kumar S, Kumar B. 7 - Classification model of machine learning for medical data analysis. In: Goswami T, Sinha GR, editors. Statistical Modeling in Machine Learning. Academic Press; 2023. pp. 111–132.
Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1603.02754.
Kernbach JM, Staartjes VE. Foundations of Machine Learning-Based Clinical Prediction Modeling: Part II-Generalization and Overfitting. Acta Neurochir Suppl. 2022;134: 15–21.
Prasse P, Iversen P, Lienhard M, Thedinga K, Herwig R, Scheffer T. Pre-Training on In Vitro and Fine-Tuning on Patient-Derived Data Improves Deep Neural Networks for Anti-Cancer Drug-Sensitivity Prediction. Cancers. 2022;14.
Ma J, Sheridan RP, Liaw A, Dahl GE, Svetnik V. Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model. 2015;55: 263–274.
Lenselink EB, Ten Dijke N, Bongers B, Papadatos G, van Vlijmen HWT, Kowalczyk W, et al. Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set. J Cheminform. 2017;9: 45.
Koutsoukas A, Monaghan KJ, Li X, Huan J. Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data. J Cheminform. 2017;9: 42.
Mayr A, Klambauer G, Unterthiner T, Hochreiter S. DeepTox: Toxicity Prediction using Deep Learning. Front Environ Sci. 2016;3.
Korotcov A, Tkachenko V, Russo DP, Ekins S. Comparison of Deep Learning With Multiple Machine Learning Methods and Metrics Using Diverse Drug Discovery Data Sets. Mol Pharm. 2017;14: 4462–4475.
Saez-Rodriguez J, Costello JC, Friend SH, Kellen MR, Mangravite L, Meyer P, et al. Crowdsourcing biomedical research: leveraging communities as innovation engines. Nat Rev Genet. 2016;17: 470–486.
Cichońska A, Ravikumar B, Allaway RJ, Wan F, Park S, Isayev O, et al. Crowdsourced mapping of unexplored target space of kinase inhibitors. Nat Commun. 2021;12: 3307.
Sicklick JK, Kato S, Okamura R, Schwaederle M, Hahn ME, Williams CB, et al. Molecular profiling of cancer patients enables personalized combination therapy: the I-PREDICT study. Nat Med. 2019;25: 744–750.
Meric-Bernstam F, Ford JM, O’Dwyer PJ, Shapiro GI, McShane LM, Freidlin B, et al. National Cancer Institute Combination Therapy Platform Trial with Molecular Analysis for Therapy Choice (ComboMATCH). Clin Cancer Res. 2023;29: 1412–1422.
Flaherty KT, Gray R, Chen A, Li S, Patton D, Hamilton SR, et al. The Molecular Analysis for Therapy Choice (NCI-MATCH) Trial: Lessons for Genomic Trial Design. J Natl Cancer Inst. 2020;112: 1021–1029.
Rodon J, Soria J-C, Berger R, Miller WH, Rubin E, Kugel A, et al. Genomic and transcriptomic profiling expands precision cancer medicine: the WINTHER trial. Nat Med. 2019;25: 751–758.
Li X, Dowling EK, Yan G, Dereli Z, Bozorgui B, Imanirad P, et al. Precision Combination Therapies Based on Recurrent Oncogenic Coalterations. Cancer Discov. 2022;12: 1542–1559.
Eduati F, Jaaks P, Wappler J, Cramer T, Merten CA, Garnett MJ, et al. Patient-specific logic models of signaling pathways from screenings on cancer biopsies to prioritize personalized combination therapies. Mol Syst Biol. 2020;16: e8664.
Zimmer A, Katzir I, Dekel E, Mayo AE, Alon U. Prediction of multidimensional drug dose responses based on measurements of drug pairs. Proc Natl Acad Sci U S A. 2016;113: 10442–10447.
Julkunen H, Cichonska A, Gautam P, Szedmak S, Douat J, Pahikkala T, et al. Leveraging multi-way interactions for systematic prediction of pre-clinical drug combination effects. Nat Commun. 2020;11: 6136.
He L, Tang J, Andersson EI, Timonen S, Koschmieder S, Wennerberg K, et al. Patient-Customized Drug Combination Prediction and Testing for T-cell Prolymphocytic Leukemia Patients. Cancer Res. 2018;78: 2407–2418.
He L, Bulanova D, Oikkonen J, Häkkinen A, Zhang K, Zheng S, et al. Network-guided identification of cancer-selective combinatorial therapies in ovarian cancer. Brief Bioinform. 2021;22.
Ianevski A, Lahtela J, Javarappa KK, Sergeev P, Ghimire BR, Gautam P, et al. Patient-tailored design for selective co-inhibition of leukemic cell subpopulations. Sci Adv. 2021;7.
Kong W, Midena G, Chen Y, Athanasiadis P, Wang T, Rousu J, et al. Systematic review of computational methods for drug combination prediction. Comput Struct Biotechnol J. 2022;20: 2807–2814.

Table 1. Summary of computational methods discussed in this review with references to articles offering further explanation and examples of these methodologies.

Method	Definition	Advantages	Disadvantages	Example Implementations
Network	Graphical representation of biological entities, such as genes, proteins, transcription factors, phenotypes, and drugs, and how they relate to one another	Provides a visual representation of the relationships between biological entities, which may reveal disease physiology or drug mechanism of action. Allows for integration of several data types.	May be difficult to interpret for those inexperienced in network biology. Drug-target genes may not be detected due to lack of gene expression changes. Many false positives due to low accuracy of drug-target interaction networks.	[19] [20] [21] [22] PINet [23] [24] [25] [26]
Regression	Subclass of machine learning that determines whether the relationship between two variables fits a known mathematical pattern (e.g., linear, logarithmic, polynomial)	Higher interpretability than other machine learning models due to comparatively lower model complexity. Capable of fitting data to multiple types of mathematical patterns.	Requires knowing which specific type of mathematical pattern exists between two variables to make accurate predictions	[27] [28] Pairs model [29] [30] [31] Keyboard [32,33,34]
Classification	Subclass of machine learning that places observations in the data into specific categories	Highly versatile, as several methods fall into this subclass (logistic regression, support vector machines, random forest, gradient boosting, XGBoost)	A trade-off often exists between model accuracy and interpretability	[35] [17] [36] [37] [38]
Deep Learning	Subclass of machine learning that uses networks composed of several processing layers to make predictions from large and complex data types	Can handle large, multi-faceted data types that often overwhelm other methods. Can discern significant biological relationships often overlooked by other methods. High accuracy.	High tendency to overfit, which reduces the generalizability of these models. Low interpretability, giving these types of models the reputation of being “black boxes”.	[39] [40] DeepSynergy [41] GraphSynergy [42] AuDNNSynergy [43] CCSynergy [44] DeepInsight-3D [45] MARSY [46]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.