SynPred: Prediction of Drug Combination Effects in Cancer using Full-Agreement Synergy Metrics and Deep Learning

Simple Summary: Drug resistance in cancer is a multifactorial problem that can significantly compromise patient treatment, leading to higher mortality and morbidity. The administration of synergistic drug combinations can minimize this adaptation phenomenon; however, choosing the best combination for each cancer type is still difficult and complex. Our study aimed to develop a Machine Learning model that predicts the best synergistic anticancer drug combinations against specific cancer cell lines by integrating the four most used drug synergy metrics (Bliss, Loewe, HSA, and ZIP) with multi-omics features of cancer cell lines and phenotypic and biophysical data. Our model, SynPred, is an ensemble classifier that is fully integrated for the first time into a user-friendly webserver, allowing users with different backgrounds, from scientists to clinicians, to use this model for drug synergy prediction, requiring only the upload of the SMILES of the two drugs to be tested.

Abstract: High-throughput screening technologies continue to produce large amounts of multi-omics data from different populations and cell types for various diseases, such as cancer. However, analysis of such data encounters difficulties due to cancer heterogeneity, further exacerbated by human biological complexity and genomic variability. There is a need to redefine the drug discovery and development pipeline, bringing an Artificial Intelligence (AI)-powered informational view that integrates relevant biological information and explores new ways to develop effective anticancer approaches. Here, we present SynPred, an interdisciplinary approach that leverages specifically designed ensembles of AI algorithms, linking omics and biophysical traits to predict synergistic anticancer drug combinations. SynPred exhibits state-of-the-art performance on an independent test set: accuracy 0.85, precision 0.77, recall 0.75, AUROC 0.82, and F1-score 0.76. Moreover, data interpretability was achieved by deploying the most current and robust feature importance approaches. A simple web-based application, freely available at http://www.moreiralab.com/resources/synpred/, was constructed to predict synergistic anticancer drug combinations, requiring only the upload of the SMILES of the two drugs to be tested and thus allowing easy access by non-expert researchers.


Introduction
Cancer, a heterogeneous group of diseases, is one of the leading causes of mortality and the most significant barrier to increasing life expectancy worldwide. The International Agency for Research on Cancer estimates that, by 2040, approximately 29.5 million new cancer cases and 16.4 million deaths will be reported, mainly due to population growth and ageing [1]. One of the significant contributors to this disease's global burden is the development of therapy resistance and, consequently, tumour relapse. Drug resistance in cancer is a multifactorial problem driven by the tumour microenvironment and genetic and nongenetic/epigenetic mechanisms that, along with cell plasticity, contribute to tumour heterogeneity [2]. In clinical settings, this problem is addressed with a combination of drugs administered together or in sequence, i.e., polytherapy. Targeting multiple components of different or interconnected cancer pathways is an efficient strategy to block vital biological processes [3].
Drug combinations with a synergistic effect, i.e., in which the total therapeutic effect of both drugs is greater than the expected additive monotherapy effect [4], have been successfully developed and applied in the treatment of different types of tumours, such as human epidermal growth factor receptor 2-positive breast cancer [5], chronic myeloid leukaemia [6], prostate cancer [7], and BRAF-mutant melanoma [8]. Nevertheless, simultaneous administration can also result in a reduced therapeutic effect and possible toxicity (designated antagonism) or in the same beneficial effect as the expected additive monotherapy effect (additivity) [4]. The experimental identification of successful synergistic combinations that amplify each other's activity is a notoriously time-consuming and expensive task. Therefore, there is still a significant need for efficient and user-friendly computational methods to complement and speed up the traditional approaches by predicting the best synergistic drug combinations [9,10].
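As an illustration of how such expectations are quantified, the Bliss independence model defines the expected additive effect of two non-interacting drugs from their individual inhibition fractions; an observed combination effect above that expectation suggests synergy, below it antagonism. A minimal sketch in plain Python (the drug effects shown are hypothetical values, not data from this study):

```python
def bliss_expected(e1, e2):
    """Expected fractional inhibition of a combination under Bliss
    independence, given each drug's single-agent inhibition (0-1)."""
    return e1 + e2 - e1 * e2

def bliss_excess(observed, e1, e2):
    """Positive excess over the Bliss expectation suggests synergy,
    negative suggests antagonism, ~0 suggests additivity."""
    return observed - bliss_expected(e1, e2)

# Hypothetical single-agent inhibitions of 30% and 40%:
print(round(bliss_expected(0.30, 0.40), 2))        # 0.58
print(round(bliss_excess(0.70, 0.30, 0.40), 2))    # 0.12 -> synergistic
```

The Loewe, HSA, and ZIP models used later in this work formalise the same idea with different null hypotheses (dose equivalence, highest single agent, and potency shifts, respectively).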
In recent years, the development and improvement of high-throughput technologies and computational tools have boosted the use of large volumes of multi-omics data (e.g., genomic, transcriptomic, proteomic), essential to dissect and uncover the complex molecular signatures of cancer. Machine Learning (ML) algorithms, in particular, have attracted attention for their ability to learn new associations and extract useful insights from this type of data. Several ML models based on eXtreme Gradient Boosting, Random Forest, Elastic Nets, Support Vector Machines, and Naïve Bayes have already been developed to predict the best combinations of anticancer drugs by integrating omics data with chemoinformatic properties of the drugs or network information of their targets [11][12][13][14]. Likewise, Deep Learning (DL), implemented via Deep Neural Networks (DNNs), has been particularly useful in dealing with the high dimensionality of omics data in supervised and unsupervised contexts. DNN classification and regression models such as DeepSynergy [15], AuDNNsynergy [16], MatchMaker [17], and DeepSignalingSynergy [18] were recently developed for drug combination prediction. However, a critical bottleneck is the use of a single drug synergy metric for class definition when assessing the degree of interaction, with very few methods using alternative approaches [14,19]. Besides, the vast majority require advanced bioinformatics knowledge to be used and are only available through GitHub, posing several constraints on accessibility by the medical and scientific community. Most of the available web interfaces are mainly dedicated to drug combination response prediction using ML regression models, such as DECREASE [20] or DrugComb [21]. Notwithstanding, these require the upload of a full or partial dose-response matrix from at least a set of laboratory experiments, which hinders their use by the scientific community.
To overcome the current problems found in the field, we developed SynPred (SYNergy PREDiction), an in-silico ensemble classification model that considers several drug synergy metrics (Bliss, Loewe, HSA, and ZIP) for the prediction of effective drug combinations, with accuracy, precision, recall, Area Under the Receiver Operator Curve (AUROC), and F1 scores of 0.85, 0.77, 0.75, 0.82, and 0.76, respectively. This model was developed by integrating not only multi-omics features of cell lines and phenotypic data but also biophysical data, in particular physicochemical and structural features of drugs. SynPred was independently tested and validated on a new comprehensive database of synergistic drug combinations from the Ianevski study [20], achieving an accuracy of 0.98. We made the stand-alone deployment available at https://github.com/MoreiraLAB/synpred, which allows users to perform bulk predictions with SynPred. Additionally, for the first time, a user-friendly web-based application was assembled and made freely available online at http://www.moreiralab.com/resources/synpred/ to predict drug combinations, requiring only the upload of the SMILES of the two drugs to be tested. This interactive platform will allow users with different backgrounds, from scientists to clinicians, to test, reproduce, and validate our models and data. The workflow used for the development of SynPred is depicted in Figure 1.

Figure 1. SynPred development workflow. Data acquisition: the A Large Matrix of Anti-Neoplastic Agent Combinations database (phenotypic data) and the Cancer Cell Line Encyclopedia (CCLE) (multi-omics data) were used for this purpose. Four reference models (Zero Interaction Potency (ZIP), Highest Single Agent (HSA), Bliss, and Loewe) were used to quantify the degree of combination and retrieve a full agreement between all metrics. (Orange) Feature extraction and data pre-processing, including normalization and dimensionality reduction using an autoencoder or Principal Component Analysis (PCA). (Blue) Prediction model development using a training set; this model was developed using different ML algorithms and a final ensemble model. (Red) Model evaluation using an independent test set; the accuracy, precision, recall, Area Under the Receiver Operator Curve (AUROC), and F1-score metrics were calculated.

Data acquisition and processing
2.1.1. Experimental drug combination phenotypic data

Drug combination phenotypic data were acquired via bulk download of the largest-to-date dataset from the National Cancer Institute, A Large Matrix of Anti-Neoplastic Agent Combinations (NCI-ALMANAC), through https://wiki.nci.nih.gov/display/NCIDTPdata/NCI-ALMANAC [22]. To this date, the dataset includes phenotypic data (growth percentage of tested cancer cell lines) for 104 unique FDA-approved drugs. These drugs were tested in combination against 59 cell lines from 9 cancer types currently included in the NCI [23,24], comprising a total of 311,466 drug pair/cell line combinations. Drug sensitivity assays included in NCI-ALMANAC were performed at the NCI's Frederick National Laboratory for Cancer Research, the Stanford Research Institute, and the University of Pittsburgh. Briefly, for each assay, cells were cultivated for 48 hours in a 3x3 or a 5x3 concentration matrix (different concentration values for each drug in combination), and the endpoint was determined by Sulforhodamine B or CellTiter-Glo [22]. From these records, the authors retrieved the cell growth percentage at each drug concentration point, which corresponds to the percentage of growth of the cell lines in the presence of each combination, yielding a final viability assessment.

Combination scores and class definition
The phenotypic data from high-throughput drug combination screens were analysed using the "SynergyFinder" R package (version 2.0.3) [25]. We used the percentage of cell growth included in the dataset to assess the degree of combination for each concentration matrix. For that, we applied four reference models: Bliss independence, Loewe additivity, Highest Single Agent (HSA), and Zero Interaction Potency (ZIP). With these data, a binary classifier was developed to identify the type of combinatory effect present in each drug pair-cell line sample, in which the lowest 20% of values were classified as synergistic and the remaining ones as non-synergistic (Figure A1). The dataset used for training considered a full-agreement combination assessment, i.e., we only kept the instances for which the combination classification was the same across all four reference models. For the dataset used, this process yielded 25,354 synergistic samples and 57,266 non-synergistic samples.
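The full-agreement filter described above can be sketched as follows: each metric independently labels a sample synergistic if its score falls in that metric's lowest 20%, and only samples on which all four labels coincide are kept. This is a minimal plain-Python illustration of the labelling logic, not the actual SynPred pipeline code:

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0-100) of a list of numbers."""
    s = sorted(values)
    idx = (len(s) - 1) * q / 100.0
    lo, hi = int(idx), min(int(idx) + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (idx - lo)

def full_agreement_labels(scores):
    """scores: list of dicts with keys 'bliss', 'loewe', 'hsa', 'zip'.
    Returns (index, label) pairs only for samples on which all four
    metrics agree; label is 1 (synergistic, lowest 20%) or 0."""
    metrics = ("bliss", "loewe", "hsa", "zip")
    cutoffs = {m: percentile([s[m] for s in scores], 20) for m in metrics}
    kept = []
    for i, s in enumerate(scores):
        votes = {int(s[m] <= cutoffs[m]) for m in metrics}
        if len(votes) == 1:            # full agreement across the metrics
            kept.append((i, votes.pop()))
    return kept
```

Samples on which the metrics disagree are discarded entirely, which is what shrinks the dataset to the 25,354 + 57,266 samples reported above.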

Drug molecular descriptors
Each drug included in NCI-ALMANAC was analysed to extract its physicochemical and structural features. A Simplified Molecular-Input Line-Entry System (SMILES) representation of each drug was acquired from PubChem [31]. The SMILES strings were then used to mine molecular descriptors using the Python package "Mordred" (version 1.1.2) [32]. In total, an array of 1,613 numeric features across 43 different categories was retrieved, yielding a two-dimensional molecular description of the drugs. Feature arrays comprising non-numerical attributes or displaying zero variance were deleted. This pre-processing left 586 features, distributed across 28 categories, describing each drug included in NCI-ALMANAC (Table 1). The resulting features were normalised by removing the mean and scaling to unit variance with scikit-learn's StandardScaler [33].

Omics data (expression, copy number variation, methylation, global chromatin profiling, metabolomics, microRNA, and proteomic profiling) describing the cancer cell lines were acquired via bulk download from the Cancer Cell Line Encyclopedia (CCLE) (https://portals.broadinstitute.org/ccle/data) [34]. The number of cell lines included in CCLE varies depending on the type of omics data available at the time (Table 2). Correspondence of cell line IDs between NCI-ALMANAC and CCLE was established according to data available at the Swiss Institute of Bioinformatics Cellosaurus website [35]. According to the affected tissue, annotations acquired through Cellosaurus split the CCLE cell lines into 21 different cancer types. In agreement with the original publications [34,36], expression data were obtained through RNA sequencing and processed to obtain expression levels in transcripts per million by the expectation-maximization algorithm (file: CCLE_RNAseq_rsem_genes_tpm_20180929.txt.gz). Copy Number Variation (CNV) data were acquired from Affymetrix SNP 6.0 arrays (file: CCLE_copynumber_byGene_2013-12-03.txt.gz).
Copy numbers were normalized against the most similar HapMap normal samples [37]. Segmentation of normalized log2(CN/2) ratios was achieved using the circular binary segmentation algorithm [34,38]. Data were normalized by removing the mean and scaling to unit variance with scikit-learn's StandardScaler [33]. Due to the omics data's high complexity, we performed dimensionality reduction to minimize the noise introduced in the dataset by highlighting the essential features. The datasets already described were used to build and train a Multi-Layer Perceptron (MLP) autoencoder, an unsupervised Artificial Neural Network (ANN) with a typical "hourglass" architecture, often used to perform dimensionality reduction in vast and high-dimensional datasets such as omics data [39][40][41]. This type of MLP usually consists of three parts: an encoder that abstracts the input into hidden variables, i.e., a latent-space representation; a bottleneck layer that holds the smallest Hidden Layer (HL) (for purposes of dimensionality reduction, this is the layer that defines the size of the reduced dataset); and a decoder that reconstructs the original input data from the hidden data [42,43]. Seven autoencoders, one for each of the CCLE feature blocks, were developed using Keras with a TensorFlow for Graphics Processing Units (GPU) (version 2.3.1) backend [44]. Each of the autoencoders comprised seven layers, of which five were HLs. The input and output layers follow the number of features available for all cell lines, as displayed in Table 2. The number of nodes within the bottleneck layer of each of the seven autoencoders (used for extraction of the encoded features) corresponds to the autoencoder's final number of features. The two HLs in each of the encoder and decoder sections vary in size according to the number of samples and features available (Table A1). At this stage, all models used Adam [45] as the optimizer function with a learning rate of 0.001. The Rectified Linear Unit (ReLU) activation function was used in all layers, and Mean Square Error (MSE) was used as the loss function. The models were trained for 1000, 250, or 100 epochs, depending on the dataset size (Table A2). After training, each autoencoder's bottleneck layer was used to perform dimensionality reduction of the omics data.
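The encoder-bottleneck-decoder idea can be illustrated with a deliberately minimal linear autoencoder trained by plain gradient descent on the MSE reconstruction loss. The actual autoencoders in this work are deeper (five hidden layers, ReLU activations, the Adam optimizer, implemented in Keras); this numpy sketch only shows the mechanics of compressing data through a bottleneck:

```python
import numpy as np

def train_linear_autoencoder(X, bottleneck, epochs=500, lr=0.01, seed=0):
    """Minimal linear 'hourglass' autoencoder: encode to `bottleneck`
    dimensions, decode back, and minimise the MSE reconstruction loss
    with gradient descent. Returns the weights and the final MSE."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, (d, bottleneck))   # encoder weights
    W_dec = rng.normal(0.0, 0.1, (bottleneck, d))   # decoder weights
    for _ in range(epochs):
        Z = X @ W_enc                    # latent (reduced) representation
        err = (Z @ W_dec - X) / n        # scaled reconstruction error
        grad_dec = Z.T @ err             # dL/dW_dec
        grad_enc = X.T @ (err @ W_dec.T) # dL/dW_enc (uses pre-update W_dec)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    mse = float(np.mean(((X @ W_enc) @ W_dec - X) ** 2))
    return W_enc, W_dec, mse
```

After training, `X @ W_enc` plays the role of the bottleneck-layer output used here as the reduced omics representation.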
PCA, a commonly used method for dimensionality reduction [46], was also applied to the same datasets as the autoencoder, with 25 Principal Components (PCs). This means that, through PCA, each of the datasets was transformed to yield only 25 features, totalling 175 features to describe each unique cell line. As shown in Table 2, the explained variance of each CCLE feature block ranged from 0.73 to 0.99. Since the seven blocks were used simultaneously for each sample, each cell line is thoroughly described by the components extracted with the PCA. Missing values (in both the autoencoder and PCA datasets) were processed by either dropping the sample entirely or replacing the missing values with zero. Both these approaches were applied after keeping only the samples that complied with our full-agreement standard. Finally, we randomly split the data into training and test sets at an 80-20 ratio for model fitting (train) and evaluation (test) (Table A3).
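For reference, the PCA step (projecting centred features onto the top principal components and reporting their explained-variance ratios) can be sketched with numpy; the 25-component setting comes from the text, while the toy data below are purely illustrative:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top n_components principal
    components. Returns (scores, explained_variance_ratio)."""
    Xc = X - X.mean(axis=0)                  # centre each feature
    cov = (Xc.T @ Xc) / (len(X) - 1)         # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]           # top components as columns
    scores = Xc @ components                 # reduced representation
    ratio = eigvals[order] / eigvals.sum()   # variance explained per PC
    return scores, ratio
```

In this work the analogous scikit-learn `PCA(n_components=25)` transform was applied per CCLE feature block, so each cell line ends up described by 7 x 25 = 175 features.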

Multi-Layer Perceptron with Keras
The binary classification was fully developed using Keras with a TensorFlow (version 2.3.1) backend [44]. Weights were updated using the Adam optimizer [45] with a learning rate of 0.0001 over 250 epochs, with binary cross-entropy as the loss function. All the HLs were connected through ReLU activation, while the output layer was subject to sigmoid activation. As an initial approach, we performed a grid search for parameter optimization using 10% of the training set, fully detailed in the "Parameter optimization" section. The best-performing parameters were then selected and used to train the models on the full dataset.
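The loss used above, binary cross-entropy over sigmoid outputs, can be written out explicitly in a few lines of plain Python (the logits shown are toy values, not model outputs from this study):

```python
import math

def sigmoid(z):
    """Squash a logit into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean negative log-likelihood of binary labels under the
    predicted probabilities; eps guards against log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Toy logits for three samples, passed through the sigmoid output layer:
probs = [sigmoid(z) for z in (2.0, -1.5, 0.3)]
loss = binary_cross_entropy([1, 0, 1], probs)
```

Minimising this loss pushes the sigmoid outputs towards 1 for synergistic samples and towards 0 for non-synergistic ones, which is exactly what the Adam updates do over the 250 training epochs.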

ML algorithms with scikit-learn
The datasets presented in this work were also trained with the algorithms most commonly used for synergy prediction tasks, namely MLPs [45,[47][48][49], RF [50], ETC [51,52], Support Vector Machines (SVM) [53], Stochastic Gradient Descent (SGD) [54], KNN [55], and XGBoost [56]. The MLP, RF, ETC, SVM, SGD, and KNN models were built using the Python package "scikit-learn" (version 0.22.1) [33]. The XGBoost model was built using its dedicated Python package (available on the Python Package Index as "xgboost") [56]. These algorithms, except for scikit-learn's MLP (six in total), were also subjected to a grid search for parameter optimization using 10% of the training set, as described in the following section, with the best parameters used to train the models on the full dataset.

Parameter optimization
In order to properly perform parameter optimization for all the algorithms described, a grid search was carried out using in-house scripts for the Keras DL models and scikit-learn's GridSearchCV with 3-fold cross-validation for the scikit-learn ML algorithms. We used 10% of the training set [57], in agreement with common subset usage for parameter optimization [58], since using the full training dataset would substantially lengthen an already long task. For the Keras DL models, the grid search comprised 288 runs covering the four available datasets, 30 different network architectures, and five different dropout rates (Table A4). For the six scikit-learn algorithms (excluding scikit-learn's MLP), a total of 820 runs covering parameter and dataset combinations were used (Table A5).
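Generically, a grid search enumerates every combination of the candidate parameter values, scores each one on a held-out (or cross-validated) subset, and keeps the best. A minimal stdlib sketch of that loop; the parameter grid and scoring function below are placeholders, not the actual SynPred grid:

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """param_grid: dict mapping parameter name -> list of candidates.
    score_fn: callable taking a dict of parameters and returning a
    score (higher is better). Returns (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy objective: pretend the best setting is lr=0.001 with no dropout.
grid = {"lr": [0.01, 0.001], "dropout": [0.0, 0.25, 0.5]}
score = lambda p: -abs(p["lr"] - 0.001) - p["dropout"]
best, _ = grid_search(grid, score)
print(best)   # {'dropout': 0.0, 'lr': 0.001}
```

scikit-learn's GridSearchCV wraps the same enumeration around cross-validated model fitting; the Keras runs here replace `score_fn` with a full train-and-evaluate cycle per configuration, which is why only 10% of the training set was used.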

Ensemble algorithms
After training both the DL and the remaining ML algorithms, the best-performing ones and their respective datasets were selected to generate an ensemble binary classifier trained with the full dataset. As a control, we measured the average prediction probability across all the selected algorithms. Furthermore, we deployed a new grid search for ensemble parameter optimization, taking the class probabilities of the selected algorithms as features, and developed a neural network that worked as an ensemble method (Table A6). This neural network used a learning rate of 0.0001, was trained for three epochs, and used the Adam optimizer [45] with binary cross-entropy as the loss function.
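The control ensemble mentioned above, which averages the class probabilities of the individual models and thresholds the mean, can be sketched in plain Python; the ensemble neural network replaces this fixed average with a learned combination of the same probabilities:

```python
def mean_probability_ensemble(prob_lists, threshold=0.5):
    """prob_lists: one list of per-sample synergy probabilities per
    model (all lists the same length). Returns the per-sample mean
    probabilities and the resulting binary labels."""
    n_models = len(prob_lists)
    means = [sum(ps) / n_models for ps in zip(*prob_lists)]
    labels = [int(m >= threshold) for m in means]
    return means, labels

# Three hypothetical models scoring the same two samples:
means, labels = mean_probability_ensemble(
    [[0.9, 0.2], [0.8, 0.4], [0.7, 0.1]])
print(labels)   # [1, 0]
```

Feeding `means`-style per-model probabilities as input features to a small neural network, as done here, lets the meta-model weight reliable base models more heavily than a plain average can.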

Model evaluation and performance metrics
The binary classification models were evaluated through accuracy (acc), precision (prec), recall (rec), AUROC, and F1-score, as previously described [52].
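For reference, four of these five metrics reduce to simple functions of the confusion-matrix counts; AUROC additionally requires ranking the predicted probabilities, so only the threshold-based metrics are sketched here (the counts in the usage note are illustrative, not results from this study):

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix
    counts (true/false positives and negatives)."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"acc": acc, "prec": prec, "rec": rec, "f1": f1}
```

For example, `binary_metrics(8, 2, 85, 5)` on a hypothetical batch of 100 samples yields an accuracy of 0.93 but a recall of only about 0.62, illustrating why several metrics must be considered together on an imbalanced dataset such as this one.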

Benchmark analysis using the DECREASE database
To ensure the final model's generalizability, we performed a literature review to search for databases that could be used to perform the benchmark and for which we could calculate/retrieve the necessary features. The O'Neil benchmark dataset could not be used as, after processing the dataset, there were not enough single dose responses associated with each concentration in the combination datasets [59]. Regarding the Forcina [60] and Mathews [61] datasets, the cell lines used were not available at CCLE, which renders cell line-associated feature extraction impossible. The chosen dataset that matched all the benchmark requirements was the DECREASE [20], comprising 210 unique drug-drug-cell line combinations corresponding to 34 drugs and 13 cell lines.

Assessing feature contribution among the prediction models
Due to the dimensionality reduction of the cell line features, it was necessary to break the process into two stages to assess feature contribution. Firstly, since the best-performing dimensionality reduction approach was PCA, the variance explained by each feature in relation to the respective Principal Component was considered; this information was extracted as an attribute of the PCA object using scikit-learn [33]. Secondly, the Python package eli5 [62] was used to assess the final feature weights by deploying Permutation Importance [50], a method that iteratively perturbs each feature to assess its contribution to the predictive model. Permutation Importance was deployed on the test set because, had the training set been used, it would not have been possible to assess feature contribution under unbiased conditions. It is worth noting, however, that this evaluation occurs after all model training and hence does not influence the test results in any way.
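At its core, Permutation Importance shuffles one feature column at a time on the test set and records how much the model's score drops; a feature the model relies on produces a large drop, an ignored feature produces none. A minimal stdlib sketch (the "model" here is a hypothetical stand-in, not one of the SynPred models):

```python
import random

def permutation_importance(predict, X, y, score, n_features, seed=0):
    """For each feature index, shuffle that column across samples and
    report the drop in score versus the unshuffled baseline."""
    rng = random.Random(seed)
    baseline = score([predict(row) for row in X], y)
    drops = []
    for j in range(n_features):
        col = [row[j] for row in X]
        rng.shuffle(col)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        drops.append(baseline - score([predict(r) for r in X_perm], y))
    return drops

accuracy = lambda preds, y: sum(p == t for p, t in zip(preds, y)) / len(y)
# Hypothetical model that only looks at feature 0:
model = lambda row: int(row[0] > 0.5)
X = [[0.9, 0.1], [0.1, 0.8], [0.7, 0.3], [0.2, 0.9]]
y = [1, 0, 1, 0]
drops = permutation_importance(model, X, y, accuracy, 2)
# Shuffling feature 1 cannot change this model's output, so its drop is 0.
```

The eli5 implementation used in this work applies the same shuffle-and-rescore idea to the trained SynPred models.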

Web-based application interface implementation
SynPred prediction models were implemented in a web-based application at http://www.moreiralab.com/resources/synpred/. The website's plots and front-end were constructed with plotly [63] and Flask [64], both freely available Python packages, on a framework that uses an in-house adaptation of JavaScript, CSS, and HTML scripts. All the back-end hosting was mediated by Flask [64].

Tuning and choosing the best ML parameters
ML performance and training time are deeply affected by specific model parameters, so the best ones should always be chosen carefully. With that in mind, we used a grid search approach to test a comprehensive array of parameter and dataset combinations, including several ML methods, a wide set of DL configurations, and different pre-processing setups. This search totalled 1,112 runs, from which the four best Keras-based parameter sets and the best configuration for each of the seven ML models trained with scikit-learn were selected (Tables A7 and A8). Regarding the pre-processing datasets, both PCA and PCA_drop performed very similarly across all metrics. As such, we chose to proceed with the PCA dataset, as its larger number of samples can lead to better results when using the full training set with the already tuned models. The autoencoder datasets, however, performed worse on the training sets and slightly worse on the test set. Due to their significantly larger size and slight underperformance, they were excluded from further training. Among the tested DL parameters, particularly the Keras architectures, the best-performing dropout rate was 0, and the best architectures, selected according to the best performance for each metric, were: accuracy (2114, 1057, 264, 16), precision (2500, 2500), recall (100, 100, 100, 100), AUROC (2114, 1057), and F1-score (2500, 2500) (Table A7).

Measuring feature contribution for model development
To understand the importance of each group of included features for the final model performance, and to attain a more interpretable model, we analysed each of the individual models with Permutation Importance. We perceived that more complex models, particularly DL-based models, tend to make more extensive use of the omics-based features (over 70% of the total feature contribution) (Figure 2A), while simpler models, such as K-Nearest Neighbors (KNN), made exclusive use of the drug features (Figure 2B). Other non-DL-based models, such as Extreme Gradient Boosting (XGBoost), made residual (around 7%) usage of the omics features (Figure 2C). This observation proves critical to finding drug pair-cell line combinations specific to each cancer tissue.

Figure 2. The left and right axes represent the association between the multi-omics data or drug features and the ML model, while the connection's width is proportional to each feature's contribution to the final predictive model. The colour scheme represents CCLE multi-omics data (light blue, chromatin profiling; salmon, copy number variation; light green, miRNA; red, methylation; dark blue, metabolomics; green, expression; pink, antibodies) and drug descriptors (orange, drug 1; purple, drug 2).

We then looked for possible biological relevance of the top 5 genes in each group of the most critical multi-omics features, to understand whether the genes contributing the most were also implicated in tumorigenesis. Of the 15 ranked genes from expression, methylation, and CNV, all of them, except for C11orf52 and DAZ2 (the latter mainly associated with male infertility), are used as prognostic cancer markers or have a role in tumour progression and treatment (Table 3). These data suggest that our models, especially DNNs, are likely to capture the most relevant information in each group of multi-omics features for synergistic drug combinations. The remaining ranked genes, organised by each ML model's best-contributing features, are presented in interactive Sankey diagrams on the website landing page (Database dropdown menu). 1 The protein description and biological importance were retrieved from The Human Protein Atlas (https://www.proteinatlas.org/) and The Human Gene Database (https://www.genecards.org/). When this information was not listed in these databases, we cite the study that supports the biological relevance. "Favourable" and "unfavourable" relate to the gene's contribution to cancer progression.

SynPred model for drug combination prediction
After selecting the best parameters for both DL with Keras and ML with scikit-learn, we trained eleven models (four from Keras and seven from scikit-learn) on the full training set. With a class imbalance of over two negative samples for each positive one, it was crucial to consider several metrics in the independent test set evaluation. Prior to ensemble deployment, the best scikit-learn ML model on the independent test set was XGBoost (acc=0.84, prec=0.78, rec=0.68, AUROC=0.80, F1=0.73), followed by the decision-tree-based models (acc=0.83-0.84, prec=0.77-0.81, rec=0.62-0.64, AUROC=0.78, F1=0.70) and DL (acc=0.80, prec=0.69, rec=0.67, AUROC=0.76, F1=0.68) (Table A9 and Table A10). ML models for synergy prediction with good performance values had already been developed using Extremely Randomized Trees (ETC) (AUROC=0.89-0.95, Area Under the Precision-Recall Curve=0.51-0.71), although without the incorporation of multi-omics data [19]. Conversely, Random Forest (RF) and XGBoost-based models trained with multi-omics data such as cell line expression, mutation, CNV, and/or methylation presented slightly lower performance values (AUROC=0.68-0.75, Weighted Average Pearson Correlations=0.32-0.39) [12,13]. Besides methodological variations, we should highlight that differences in the database used (NCI-ALMANAC versus O'Neil) and the inclusion of multi-omics data could justify the differences in model performance. In fact, we demonstrated that chromatin profiling, metabolomics, miRNA, CNV, expression, methylation, and antibody data tend to contribute less to the development of non-DL-based models (Figure 2). Given the importance of multi-omics features for cell line characterization and treatment response [71], we believe that their inclusion is of utmost importance for developing accurate synergy prediction models.
The eleven models previously selected and trained with the full training dataset were used to build ensemble predictors by combining each model's class probability. The trained ensemble models were then analysed by their best-performing metrics (Table 4). All the best-performing ensembles were DL models trained with Keras with a dropout rate different from 0. Our final SynPred model achieved an accuracy, precision, recall, AUROC, and F1-score of 0.85, 0.77, 0.75, 0.82, and 0.76, respectively, on an independent test set. SynPred is an ensemble of four DL-based and seven ML-based models, combined through a DL model with 4 hidden layers of size 50, a dropout rate of 0.60, and PCA pre-processing of the omics features with missing values replaced by 0. We then compared our model's performance metrics with the current state-of-the-art classification models that used neural networks for drug synergy prediction and the O'Neil et al. dataset. Our model, SynPred, achieved much better precision (0.77) than other relevant methods in the area, such as DeepSynergy (0.56) [15] and AuDNNsynergy (0.72) [16], although it performed slightly worse in terms of accuracy (0.85 versus 0.92-0.93) and AUROC (0.82 versus 0.90-0.91). We hypothesize that the slightly lower predictive performance in accuracy and AUROC arises from our stricter full-agreement synergy class definition, while DeepSynergy and AuDNNsynergy only used the Loewe additivity model. Besides, the initial balance of the dataset used in our study, albeit skewed slightly towards the negatives, is much better than that of the O'Neil dataset. This can easily lead to lower values in some metrics while producing an overall superior predictor. Notwithstanding, considering all the individual metrics, our model shows a very high performance, with a good balance between true positive and true negative predictions.

Table 4. Best performing ensemble methods, with the result for the mean probability (the averaged class probability from each of the eleven models trained with the full training set) shown as a control.

Benchmarking with DECREASE
The DECREASE database [20], containing 34 drugs, 13 cell lines, and 210 combinations, was processed to attain the features associated with its data using the same pipeline applied to SynPred. The samples corresponding to the MCF-10A and HEK293 cell lines were disregarded, as these are not present in CCLE, making the acquisition of multi-omics features impossible. From the two originally published datasets, one with 192 and the other with 18 samples, a joint dataset was generated with 188 unique drug-drug-cell line combinations. When calculating the class with our strict full-agreement requirements, all the eligible samples turned out to be negative, which indicates that the dataset, although appropriate, seems to be limited in the range of synergy values for both classification and regression tasks. When predicting the class, our SynPred ensemble achieved an accuracy of 0.98.

Web-based application description
The method for predicting the type of combinatory effect in each drug pair-cell line sample is available as a web-based application at http://www.moreiralab.com/resources/synpred/. All eleven described single models, as well as the ensemble approach, are deployed on user submission. The user needs to submit two drugs as input in the *.smile format and select, from a dropdown menu, the primary body site corresponding to the cancer cell lines to be tested. The drugs are then subjected to feature extraction by Mordred and standard pre-processing (feature elimination and normalization), as thoroughly described in the Materials and Methods section. The output, displayed in a downloadable heatmap, is a binary drug combination effect prediction (synergistic or non-synergistic) for each of the individual cell lines, together with an average based on the models' prediction values. The results are returned to the provided email address and displayed on the submission web page. Additionally, users can assess, explore, and visualize through different plots, as well as export, a summary of the synergy scores (calculated using the ZIP, Bliss, HSA, and Loewe metrics) by cell line used to develop the original SynPred dataset. To our knowledge, this is the first webserver that can predict new drug synergy combinations without the need to upload a partial or full dose-response matrix. This feature is an advantage compared with the regression models implemented in webservers that require these types of data for drug combination response prediction [20,21].

Discussion
Synergistic anticancer drug combinations are a powerful tool against cancer drug resistance, since they can simultaneously target multiple key molecules or pathways. The rational design of combination therapies is warranted to improve efficacy, although it is a notoriously time-consuming and expensive task. In recent years, the application of ML algorithms to drug repurposing and novel drug design has demonstrated the importance of in silico methodologies in overcoming this problem. A few classification ML models for predicting drug synergy combinations have already been developed [12,13,15,16,19], although the suitability of most of them is hindered by the use of only one reference model for the calculation of drug synergy (e.g., Bliss, Loewe, HSA, or ZIP). Given the different sensitivities of these reference models in evaluating the degree of combination effect, a more comprehensive and rigorous approach that leverages all metrics to predict drug synergy is still needed.
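The disagreement between reference models can be made concrete with two of the simplest ones. Bliss independence and HSA define different expected combination effects, so the same observed effect can lie above one expectation and below the other. The sketch below assumes fractional single-drug effects in [0, 1] and illustrative numbers.

```python
def bliss_expected(ea, eb):
    # Bliss independence: expected effect of two independently acting drugs
    return ea + eb - ea * eb

def hsa_expected(ea, eb):
    # Highest Single Agent: expected effect equals the better single drug
    return max(ea, eb)

ea, eb, observed = 0.40, 0.30, 0.55
print(observed - bliss_expected(ea, eb))  # negative: below Bliss expectation
print(observed - hsa_expected(ea, eb))    # positive: above HSA expectation
```

A combination scored this way would be called synergistic by HSA but not by Bliss, which is precisely the ambiguity the full-agreement strategy is designed to remove.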
This study introduced a new synergy prediction model, SynPred, which combines comprehensive multiomics data of cancer cell lines with physicochemical and structural features of drugs. This work is among the first to use the full-agreement class among the four most used synergy metrics and relies on one of the most comprehensive and class-balanced databases, the NCI-ALMANAC. Our top-ranked model, an ensemble built from the best machine learning models, achieved state-of-the-art performance in predicting synergistic drug combinations in an independent dataset. Moreover, we provide the complete workflow in our GitHub repository, coupled with a freely available, easy-to-use webserver that requires only two drug SMILES as input, thus alleviating the need to upload a conventional, laborious dose-response matrix. SynPred can be a valuable tool for the scientific and medical community in drug repurposing and the in silico discovery of new anticancer drug combinations.
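The idea of combining the single models into an ensemble can be illustrated as follows. SynPred's actual ensemble is a trained neural network over its eleven single models; the simple probability averaging below is only a stand-in for that idea, and the model names and scores are made up.

```python
def ensemble_predict(probabilities, threshold=0.5):
    """Average the per-model synergy probabilities and apply a
    decision threshold to obtain the binary class."""
    mean = sum(probabilities.values()) / len(probabilities)
    label = "synergistic" if mean >= threshold else "non-synergistic"
    return label, mean

# Hypothetical outputs from three of the base models
base_outputs = {"dnn": 0.81, "random_forest": 0.64, "xgboost": 0.72}
label, score = ensemble_predict(base_outputs)
print(label, round(score, 3))  # synergistic 0.723
```

Replacing the fixed average with a learned meta-model, as SynPred does, lets the ensemble weight each base model by how reliable it is on different regions of the data.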
Additionally, given the importance of multiomics data in cell line classification and therapy response, we combined all the multiomics features available in the CCLE database to explore their individual contributions to model development. The knowledge mined from this analysis demonstrates the capacity of different ML models to handle multiomics data, with DL algorithms being considerably better able to learn from and leverage this complex type of feature. We found that the top-ranked proteins within the most contributing multiomics features are important cancer biomarkers or play a role in tumorigenesis, demonstrating the DNN models' capacity to capture their significance and use this information in the final model. In the future, we expect to include protein-protein interaction data and network analysis to improve model performance, aiming to identify drug combinations with potential new targets across different cell lines.

Conclusions
In summary, our Machine Learning model can accurately predict synergistic anticancer drug combinations using only the SMILES of the two drugs being tested as input, without relying on a time-consuming and laborious dose-response matrix. This fast approach can accelerate the discovery of new anticancer drug combinations with synergistic activity.

Table S3: Final datasets subjected to training; Table S4: Gridsearch combination parameters using 10% of the training set with DL algorithms; Table S5: Gridsearch combination parameters using 10% of the training set with non-DL algorithms; Table S6: Gridsearch combination parameters of the ensemble neural network; Table S7: Gridsearch results for the runs with 10% randomly selected entries of the training dataset for the DL models trained with Keras; Table S8: Gridsearch results for the runs with 10% randomly selected entries of the training dataset for the ML models trained with scikit-learn; Table S9: Results for the best scikit-learn ML models (MLP, RF, ETC, SVM, SGD, KNN, and XGBoost) when evaluated on the training set and an independent test set; Table S10: Results for the DL models trained with Keras, when evaluated on the training set and an independent test set, after the gridsearch on the random training subset.