Computational Genomics towards Systems Metabolomics for Precision Oncology

Coordinated sets of very large numbers of digital data on a given social or economic event are processed by Artificial Intelligence tools to obtain reasonably accurate, valuable predictions. The same approach, applied to biomedical problems such as choosing the right drug to completely cure a given cancer patient, does not reach satisfactory results. The obstacle is "organized biological complexity", which requires a different, systems approach: an Augmented Intelligence strategy that integrates statistical computation on digital data, network construction from "omics" findings, well-designed mathematical models and new experiments in an iterative pathway to reconstruct the "logic" beneath the "organized complexity", as shown here for Systems Metabolomics of cancer. On this basis, new diagnostic approaches able to identify precision drug treatments, as well as a new discovery strategy for more effective anti-cancer drugs, are described.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 6 August 2018 doi:10.20944/preprints201808.0127.v1 © 2018 by the author(s). Distributed under a Creative Commons CC BY license.


Introduction
Almost fifty years ago President Nixon declared the "war on cancer", and an enormous number of studies on cancer have been carried out since then, not only in the USA but worldwide. Nevertheless, the need for a "Moonshot on cancer" is urgently felt today in the USA, and a similar project has been proposed for Framework Program 9 of the EU, with the aim to "ensure the survival of three out of four patients by 2030" (1), which indicates how helpless we still feel against this disease. In fact, notwithstanding the encouraging words from clinicians and scientists, cancer remains a disease that endangers too many lives (2) and brings deep sorrow to most families in the world.
So far, the war on cancer has been technology-driven. At first, recombinant DNA technologies led to the identification of oncogenes and tumor suppressor genes; then genomic sequencing and transcriptome analysis made it possible to characterize both the set of mutated oncogenes and the very complex and variable transcriptome expression profiles that appear to characterize each type of cancer cell, both in human cell lines and in clinical samples (3)(4)(5)(6).
Enormous amounts of omics data of different kinds have been collected by refined analytical technologies, analyzed by a variety of clever bioinformatic tools, and stored and made available for retrieval in qualified data banks. When connected with information on which drugs were used for each patient, characterized by a specific signature of driver mutations and omics profile, and on the final clinical outcome, these data generate the Cancer Big Data totem (7)(8)(9). The powerful scientific/industrial complex generated by the previous omics phase strongly promotes the idea that Cancer Big Data have to be further increased and will be the main field in which the Moonshot projects should move. Many actors predict that Big Data and Artificial Intelligence will be the field in which the war on cancer is finally won (10).

Big Data and Artificial Intelligence for precision oncology
Genome sequencing of patients' tumor samples to identify somatic genetic alterations is recommended in order to find which available drug specifically matches the observed mutations, and thus the most appropriate treatment for each patient.
The first successful example of this approach is the treatment of chronic myeloid leukemia patients with imatinib (a chemical tyrosine kinase inhibitor), which was found to greatly and durably improve the clinical outcome for this disease (11). In many other cases, such as lung cancers carrying mutant epidermal growth factor receptor (EGFR) or melanomas bearing mutated BRAF, a component of the K-Ras signaling pathway, treatment with the specific targeted inhibitor gave in most cases only a transitory benefit, followed by the emergence of innate or acquired resistance (12)(13)(14)(15). Drugs that target signaling pathways are especially liable to become ineffective because of signaling cross-talk, either spontaneous or arising from mutations caused by the intrinsic genomic instability of tumors. Computational genomics has also explored the possibility of detecting specific sets of mutations occurring in a given patient and of correlating these signatures with more effective therapy.
While the few successes of computational genomic approaches in clinical oncology have been heralded by the popular press, giving new hope to cancer patients, the results of rigorously performed clinical trials, which report much less satisfactory outcomes for the average patient population, have almost escaped the attention of public policy decision-makers (16). In fact, notwithstanding these results, which should suggest caution, the fact that sequencing an entire human genome costs around 1,000 US$, and promises to cost less in the near future, has encouraged the extension of computational genome analysis to large numbers: the UK project "Genomics England" has assembled a collection of 100,000 patients with cancer or rare diseases. In the USA, the initiative "All of Us" aims to collect data on one million people and to follow them over time in order to detect early markers of serious diseases. China is carrying out a 10 billion US$ project to sequence and analyze one million people. Moreover, it is not yet settled whether personal genomic health data could be used in ways that violate patients' right to privacy, or be exploited commercially, outside the declared health-related purpose of data collection (17).
Given the limitations of computational genomics in oncology discussed above, other strategies have been explored. The one that presently receives great attention from the press is a technologically innovative approach that relies on Cancer Big Data (CBD). Cloud-based supercomputers have been trained on enormous amounts of data, from the literature to data from thousands of patients: extensive molecular characterization of tumors, medical reports, the drug treatments performed for each specific condition, and the resulting clinical outcome. These data are organized and compared with the clinical guidelines followed for any given set of diagnostic markers, based on the experience accumulated in selected top-quality hospitals (Fig. 1).

Figure 1 Concept map of an Artificial Intelligence approach
Use of Big Data for the identification of the best drug treatment for patients in a precision medicine perspective.
When asked about a new patient, the Artificial Intelligence-analysed CBD (18) may elaborate an answer by comparing the data specific to that patient with the bulk of information previously stored, and give a set of recommendations for treatment. The reliability of these indications has not yet been tested in rigorous clinical trials, and while successful indications get big headlines, it is not clear whether this approach would really improve patient outcomes on a large scale.
Many movies and science fiction books have spread, and turned into common sense, the idea that Artificial Intelligence will very shortly be able to elaborate and generate knowledge faster and more reliably than the human mind. Without discussing, for the moment, how likely this prediction is to come true, let us focus on the basic question at hand: why is it so difficult to cure cancer patients effectively? Cancer is a complex, very heterogeneous disease, characterized by several common physiological hallmarks (19)(20), differently assorted in the various cancer types. Many driver mutations, thousands of differentially expressed genes, sizable numbers of biomarkers, several typical histological traits and differential cell physiology features are among the known differences between normal cells and their cancer counterparts. Although well established by an enormous number of papers, the roles of mutations in oncogenes and tumor suppressor genes (4)(21) and of changes in the surrounding stroma (22)(23) in sustaining cancer phenotypes have not yet been organized into a satisfactory theory of cancer, able to link all these aspects in a rational, causal, dynamically predictive way.
The big problem with computational genomics and with the more sophisticated Big Data approaches is that the results they yield are valid only statistically (Fig. 1). More or less, they rely on historical records of how a large number of patients, characterized by given genetic signatures, responded to a defined therapy. A typical result of such statistical analyses may be, for instance, that patients with an X signature responded well to treatment Y, with more than 5 years of survival in 70% of cases. This information does not ensure that a new patient with the X signature will respond positively to treatment Y, so we are still very far from a personalized, precision medicine! While statistical results are useful when investigating socio-economic events (e.g. political polls, consumer market trends), they are not good enough, as indicated above, to set the foundations for personalized medicine, since statistical correlations cannot uncover the molecular regulatory mechanisms of complex biological processes. On the other hand, this is the only reasonable approach available at the moment for selecting drugs for more effective treatments, so it may be ethically correct to use it for the time being.
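The gap between a cohort statistic and an individual prediction can be made concrete with a short calculation. The sketch below takes the illustrative 70% response rate from the example above and assumes, purely for illustration, that patients respond independently of each other; the function names are hypothetical.

```python
def expected_nonresponders(n_patients: int, response_rate: float) -> float:
    """Expected number of patients who do not respond to the treatment."""
    return n_patients * (1.0 - response_rate)


def prob_all_respond(n_patients: int, response_rate: float) -> float:
    """Probability that every one of n independent patients responds."""
    return response_rate ** n_patients


rate = 0.70  # cohort-level response rate for signature X under treatment Y
print(f"Expected non-responders among 100 new patients: "
      f"{expected_nonresponders(100, rate):.0f}")
print(f"Probability that 5 new patients all respond: "
      f"{prob_all_respond(5, rate):.3f}")
```

Even with a favorable cohort statistic, roughly three in ten new patients with the X signature would be expected not to respond, and the statistic alone cannot say which ones.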
The recognized poor individual predictive power of computational genomics in finding the best available therapy for cancer patients has prompted the development of functional assays on viable primary tumor cells, obtained from patients and challenged in vitro with a large number of drugs (16).
In summary, a new strategy needs to be devised, one able to extract the logic of cancer from the enormous complexity of experimental data, so as to offer a rational basis for the tailor-made selection of presently available drugs and, more importantly, for the development of new drugs able to eradicate this disease in its various forms.

Biological complexity cannot be solved only by Artificial Intelligence
Complex biological processes, such as cancer, result from dynamic, often non-linear interactions, defined in time and space, among a large number of macromolecules and small regulatory molecules. These interactions generate an organic whole able to carry out specific functions, coherent with operational objectives: the so-called emergent properties. Warren Weaver, co-author with Claude Shannon of the theory of communication that generated the still unfolding digital revolution, gave this type of complexity the name of organized complexity. In a seminal paper of 1948 (24), Weaver proposed that understanding biological complexity would require both sophisticated analytical technologies, to quantify in great detail the components involved in a biological function, and the tools of analytical mathematics, to construct models whose simulation would shed light on the functioning of the system at steady state and under perturbation. He stressed that a new organization of science, with large multidisciplinary teams, would be required to meet this challenge. It is fascinating that, before the discovery of the DNA double helix, Weaver predicted the change of scientific paradigm from the reductionism that flowered in DNA-centric molecular biology to systems biology, which started to tackle biological complexity at the beginning of the 21st century. Why do we need systems biology to understand complex biological functions such as cancer, rather than relying on artificial intelligence manipulation of omics-scale molecular measurements on various tumors? Machine learning methods applied to qualified public databases have been shown to generate plausible hypotheses on the molecular regulatory mechanisms in action (25). Unfortunately, the number of possible hypotheses obtained in this way is extremely large, and some sort of selection is needed before a restricted number of these hypotheses can be subjected to experimental test (25).
The development of a neural network able to master the game of Go without being trained by supervised learning from human experts has recently received attention (26). It is a very interesting achievement, but we have to recall that the rules of the game were built into the program, which was then allowed to train itself. Although Go is a complex game, it unfolds in a limited system in which the basic rules determine a large, but finite, number of possible moves, which may be combined in different ways to implement winning strategies.
Thus the idea, sustained by a number of actors in the field, that Big Data and Artificial Intelligence alone will surpass human ability in scientific reasoning seems ill-advised. It would require extracting explanatory hypotheses on the laws that determine the features of the different types of this disease from the deluge of experimental data available on cancer, data that may still lack relevant information and may contain findings irrelevant to regulatory mechanisms.
The reason is that, contrary to the game of Go, we do not yet know the basic rules of the game for cancer: the theory of somatic genetic mutation (27) and the recognition of a discrete number of hallmarks (19)(20) are not adequate to constrain the system quantitatively, nor do we have selection criteria to discard the very many inappropriate solutions generated by Artificial Intelligence approaches and choose the correct one.
For this reason, as predicted by Weaver, we need the models of systems biology in order to exploit the great potential of Augmented Intelligence, that is, all the tools of Artificial Intelligence guided and integrated by human scientific ideas and experiments in a systems approach, and so to reach, more efficiently and in a shorter time, the solutions to the many riddles that complex biological functions, from cancer to neurodegeneration, present to us.

Tackling biological complexity of cancer
The first issue to face is how to handle the biological complexity of cancer. The proposal to break down the cancer phenotype into several hallmarks has been widely accepted (19)(20). Some of them concern the relations of cancer cells with their environment: avoiding the immune response, tumor-induced inflammation, inducing angiogenesis, invasion and metastasis, evading growth suppression. Others are relevant to the time-dependent development of tumors: genomic instability and mutations, resisting cell death. The remaining two, sustained proliferative signaling and deregulation of cellular energetics, are clearly connected with the most characteristic, and the only really dangerous, property of cancer cells: their unrestricted proliferation (28)(29). It is therefore very appropriate to start by applying a systems approach to the molecular events that support unrestricted cancer cell proliferation. A further attraction of this property is that, contrary to many other hallmarks, it can be measured quantitatively both in vitro (Fig. 2A) and in vivo, a feature that will be useful, as we shall see below, for constructing a predictive, quantitative model of this crucial property.
The unrestricted proliferation of cancer cells requires a change in metabolism called Cancer Metabolic Rewiring (CMR), which varies among cancer cells but is characterized by enhanced glycolysis (Warburg effect) and by a pathway of glutamine utilization, distinct from that of normal cells, involving a reductive carboxylation step from α-ketoglutarate to isocitrate (30)(31). As a consequence, cancer cells present a characteristic profile of nutrient utilization: they use larger amounts of glucose and glutamine than normal cells and secrete large quantities of lactate but small amounts of glutamate (normal cells utilize less glutamine and secrete an equivalent amount of glutamate). This means that in cancer cells biomass is mostly made from glutamine, while glucose fermentation to lactate generates ATP (46).
CMR is induced by oncogene activation and has been indicated as an interesting new drug target (32)(33)(34)(35). Hence, it becomes very relevant to better understand the rules, or design principles, that govern CMR. A mathematical model of metabolism able to predict the behavior of the system following genetic and/or nutritional perturbations will be essential to discover these rules. Systems biology mostly relies on three types of models. First, network approaches, which represent genetic or functional connections among different gene products. Networks are quite popular (36), but their predictive ability is very limited. Second, quantitative dynamic models, typically represented by differential equations. They are very interesting, since they predict the behavior of a system after perturbation, but they are difficult to construct for large networks because they require the kinetic parameters of the system, which are difficult to determine. A recent success in this field is the multi-omics construction of personalized kinetic models of erythrocytes, which have a much reduced metabolic network compared with nucleated mammalian cells (37). Third, constraint-based models, which are models of flux and therefore very appropriate for metabolism. Interestingly, they do not require knowledge of kinetic parameters, but only the stoichiometry of the reactions involved. Thanks to 150 years of biochemical studies, we know the stoichiometry of all the reactions (about 7400) found in human metabolism, involving about 5000 metabolites in total (38). Genome-wide metabolic models (GWMM) have been reported (39), and metabolic flux analysis, which interprets 13C metabolic labelling patterns, has been described (40)(41).
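To illustrate why constraint-based models need only stoichiometry and flux bounds, the following minimal sketch sets up a toy network and searches for the steady-state flux distribution that maximizes a biomass flux. The network, its bounds and all names are hypothetical, and the brute-force search over integer fluxes stands in for the linear-programming solvers normally used for flux balance analysis on genome-wide models.

```python
from itertools import product

# Toy network (hypothetical): two internal metabolites A and B, four reactions.
#   uptake:   -> A
#   convert:  A -> B
#   secrete:  A ->      (by-product leaves the cell)
#   biomass:  B ->      (objective)
REACTIONS = ["uptake", "convert", "secrete", "biomass"]
S = [  # stoichiometric matrix: rows = metabolites (A, B), columns = reactions
    [1, -1, -1, 0],
    [0, 1, 0, -1],
]
BOUNDS = [(0, 10), (0, 6), (0, 10), (0, 10)]  # (lower, upper) flux bounds


def is_steady_state(fluxes):
    """True if S . v = 0, i.e. no internal metabolite accumulates."""
    return all(sum(c * v for c, v in zip(row, fluxes)) == 0 for row in S)


def maximize(objective_index):
    """Return the best integer flux vector found by exhaustive search."""
    best = None
    ranges = [range(lo, hi + 1) for lo, hi in BOUNDS]
    for fluxes in product(*ranges):
        if is_steady_state(fluxes) and (
            best is None or fluxes[objective_index] > best[objective_index]
        ):
            best = fluxes
    return best


best = maximize(REACTIONS.index("biomass"))
print(dict(zip(REACTIONS, best)))
```

No kinetic parameter appears anywhere in this sketch: the capacity of the conversion step (upper bound 6) alone caps the biomass flux, however much nutrient is taken up.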
The availability of constraint-based models and of analytical techniques to monitor metabolic fluxes in cancer cells has allowed my lab to investigate, and start to discover, the rules governing cancer cell metabolism. Time-course transcriptome analyses of cancer cells compared with their normal counterparts have shown that several thousand genes are differentially expressed in cancer cells (42)(43). Comparison of genomic expression patterns in human and mouse, using multivariate statistical methods, clustering and pathway analysis tools, identified a shared cancer-related transcriptional profile (44). This profile consists of 314 genes, largely overexpressed in cancer, on which a protein-protein interaction analysis was performed considering only established physical interactions. A network of 156 nodes (proteins) was thus identified. Functional classification by KEGG and Gene Ontology analysis identifies a few cellular processes in the shared cancer network, the most relevant being mitochondrial structure and function. In fact, many components of the mitochondrial respiratory chain (with a noticeable abundance of protein components of Complex I), of mitochondrial oxidative phosphorylation and of mitochondrial membrane transport are detected (44).
In conclusion, network analysis is able to sort out putative regulatory components acting in a complex biological process. As will be shown later, much more experimental investigation, together with the construction, simulation and experimental validation of a predictive mathematical model, is needed to refine the preliminary indication given by the network and reach a first understanding of the molecular mechanism underlying the process.
Experimental findings show a strong reduction of Complex I activity in the mitochondria of cancer cells compared with normal ones (45). The metabolism of cancer cells with dysfunctional mitochondria exhibits enhanced glycolysis with lactate production, reduced oxidative flux through the TCA cycle and increased utilization of glutamine by reductive carboxylation for anabolic syntheses (46). This interplay of experimental and computational investigations sets the ground for investigating the rules, the design principles, of CMR.

Understanding the design principles of cancer metabolic rewiring
Since we are interested in the pathways by which glucose and glutamine are used to yield biomass, we extracted from the genome-wide metabolic models Recon 2 and HMR only the reactions involved in this central area, compacting them when appropriate. In this way we made the computation less demanding by leaving out the large number of admissible wirings present in GWMM that are not engaged in biomass production. Systematic constraint-based simulations of this core model (called ENGRO1) elucidated the interplay of glucose, glutamine and oxygen availability in setting the active wirings (47). At high rates of glutamine utilization, oxidative utilization of glucose is reduced, while the production of lactate from glutamine is increased. This emergent phenotype, corresponding to CMR, is detected only when the available glucose exceeds the amount that could be fully oxidized by the available oxygen, either because of a bottleneck in the electron transfer chain or because of reduced oxygen tension at the level of mitochondria (Fig. 2B). CMR is sustained by a fairly large number of redox-controlled metabolic pathways (48) and is optimal for maximizing biomass and ATP production; it requires the activity of a branched TCA cycle. A large number of different pathways may generate a CMR phenotype, offering a basis for at least part of the metabolic heterogeneity detected in cancer cells (Fig. 2C). Reducing glutamine availability reduces CMR and allows the utilization of glucose through the standard TCA cycle, with lower production of biomass (47).
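The condition under which the CMR phenotype appears can be illustrated with a deliberately simple mass balance. The sketch below is not the ENGRO1 model: it ignores glutamine and biomass, assumes that glucose is oxidized first up to the oxygen limit, and uses only the textbook stoichiometries of complete glucose oxidation (6 O2 per glucose) and of fermentation (2 lactate per glucose). The function name and the flux units are hypothetical.

```python
O2_PER_GLUCOSE = 6       # glucose + 6 O2 -> 6 CO2 + 6 H2O
LACTATE_PER_GLUCOSE = 2  # glucose -> 2 lactate


def split_glucose(glucose_uptake: float, oxygen_available: float):
    """Split glucose uptake into an oxidized part and a fermented part.

    Returns (glucose_oxidized, lactate_secreted), in the same arbitrary
    flux units as the inputs.
    """
    if glucose_uptake < 0 or oxygen_available < 0:
        raise ValueError("fluxes must be non-negative")
    oxidizable = oxygen_available / O2_PER_GLUCOSE
    oxidized = min(glucose_uptake, oxidizable)
    overflow = glucose_uptake - oxidized  # glucose that oxygen cannot handle
    return oxidized, overflow * LACTATE_PER_GLUCOSE


for glucose in (0.5, 1.0, 3.0):
    oxidized, lactate = split_glucose(glucose, oxygen_available=6.0)
    print(f"glucose={glucose}: oxidized={oxidized}, lactate={lactate}")
```

In this toy balance lactate secretion starts only once glucose uptake exceeds what the available oxygen can oxidize, which mirrors the threshold behavior described for the core model.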
Thus the ENGRO1 model (validated by experimental findings) has finally made it possible to discover the rules that govern cancer cell growth and to shed light on its design principles. They are defined by two boundary conditions, enhanced uptake of glucose and of glutamine in the presence of a bottleneck in the electron transfer chain to oxygen, and are characterized by the following features: glutamine is utilized by a branched TCA cycle and may be converted to lactate; glucose goes to lactate even in the presence of oxygen; a large number of different, redox-controlled pathways may generate CMR. Both the activation of oncogenes (49) and alterations of the microenvironment (50) may generate conditions that promote CMR and hence unrestricted cancer cell growth (Fig. 3). Cancer cells receive nutrients and oxygen from the environment and use metabolism to generate the building blocks for synthesizing cellular macromolecules (DNA, RNA and proteins). Macromolecules also have a strong informational role in determining cell functions. The kind of information and material fluxes in the cell is determined by the decoding of the status of the external environment and of the internal status. This process involves a complex signaling network, which senses external signals (e.g. nutrient supply, tumor micro-environment) and internal signals (e.g. AMP/ATP, NADH/NAD+, acetyl-CoA levels, etc.). CMR is also affected by the mitochondrial genome and translational machinery, and by the cross-talk between the mitochondrial and nuclear genomes. It is not surprising, therefore, that an enormous number of perturbations (genetic and/or biomolecular) may affect cancer cell growth.
The paper by Damiani et al. (47) is of great interest since it offers both a logical explanation of the onset and maintenance of cancer cell growth and a mathematical model able to predict the effects that changes in nutrient supply, as well as alterations in the activity of specific enzymes (due to genetic, epigenetic or post-translational modification events), may have on biomass formation.
At this point we may start talking about "Systems Metabolomics", defined as the investigative approach that combines various metabolomics techniques with constraint-based modeling. Metabolic profiles, as well as transcriptome or proteome analyses of the genes expressed in each kind of cell, may initially define the system. Computational modeling, possibly expanded beyond the ENGRO1 model to consider other players in the system, will then predict the metabolic pathways active in the system. Experimental identification of metabolic pathways and metabolic flux analysis will provide the findings needed to evaluate the reliability of model predictions.

Systems Metabolomics for precision medicine
The essential aim of precision medicine is to be able to treat each patient with the presently available drug (or combination of drugs) that is most effective in providing cancer-free, longer-term survival. As recently discussed by Letai (16), the idea that detecting somatic genetic alterations may directly identify which drug matches the observed phenotype has not brought significant clinical benefits so far, although a number of scattered successes have been reported.
Assays of gene expression have also been proposed. For instance, gene expression analysis on a small set of genes has been reported for primary breast cancer tumours that were hormone-receptor positive, human epidermal growth factor receptor 2 (HER2) negative and axillary-node negative (a condition fairly common in early-detected breast cancer patients). This analysis is performed on 21 genes, 16 of which code for HER2 pathway components, proliferation markers or metastatic markers, while 5 are used for individual normalization (51). The Recurrence Score (RS) is derived from the reference-normalized expression of the individual genes, on a 0-100 scale. RS values smaller than 11 are taken to indicate low risk of recurrence when treated only with endocrine therapy, while RS values greater than 25 predict a benefit of chemotherapy in preventing recurrence (52). The large majority of patients present intermediate RS values, and for this group too a prospective trial supports the notion that endocrine therapy alone is not inferior to chemoendocrine therapy in terms of disease-free survival (52).
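The thresholds quoted above can be written as a small lookup. This is a minimal sketch that encodes only the cut-offs stated in the text (RS below 11, RS above 25); the function name and the group labels are illustrative assumptions, and the code does not reproduce how the assay computes the score itself.

```python
def recurrence_risk_group(rs: float) -> str:
    """Map a Recurrence Score (0-100 scale) to the groups described in the text."""
    if not 0 <= rs <= 100:
        raise ValueError("Recurrence Score is defined on a 0-100 scale")
    if rs < 11:
        return "low"           # low recurrence risk with endocrine therapy alone
    if rs > 25:
        return "high"          # chemotherapy benefit predicted
    return "intermediate"      # 11 to 25: the large majority of patients


for score in (5, 18, 40):
    print(score, recurrence_risk_group(score))
```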
On the other hand, with standard endocrine therapy (tamoxifen treatment) about 40% of initially responsive patients acquire resistance and face a poor survival outcome (53). Acquired resistance to anti-estrogen therapy is a long-term process, not detectable in primary tumor cells, in which both genetic alterations, due to the genomic instability of cancer cells (20), and epigenetic changes take place. Activation of alternative growth factor pathways, followed by alterations of metabolism, cell cycle and survival pathways, reactivates the unrestricted proliferation of tamoxifen-resistant cancer cells (54).
Therefore genomic precision medicine approaches, whether they test genomic mutation signatures or the expression profiles of a set of marker genes, show significant limitations. Letai (16) proposed to move to functional precision medicine, testing the sensitivity to various drugs ex vivo on primary tumor cells obtained from each individual patient. These cells should be grown in 2D or 3D cultures to measure their sensitivity to drugs (or combinatorial treatments), as ability to grow or as induction of pro-apoptotic signals. Since primary tumor cells grow quite slowly, significant and faster responses may be obtained by measuring how the metabolic profile responds to drug treatments (55).
The potential of systems metabolomics for cancer precision medicine is taken to be much more articulated, and of greater impact, than that of the previously described monitoring of drug effects on the growth of primary tumor cells. This depends on the fact that metabolism plays a central role in living cells by supporting several relevant activities: utilization of nutrients to provide energy and building blocks for growth and proliferation, and protection against external stress by generating an appropriate homeostatic response. In microbial cells about 50% of the proteome is allocated to metabolism, with 25% dedicated to glycolysis alone. In human cells, whose cellular protein content is on average about 100-fold higher than that of microbial cells, the fraction of the proteome dedicated to central carbon metabolism, energy metabolism, biosynthetic metabolism and other enzymes still accounts for about 20%. A very conspicuous fraction is dedicated to transcription, translation, folding, sorting and degradation, and to the cytoskeleton, while the quantitative weight of signaling is much smaller, with differences, of course, among the various types of human cells (56).
Hence metabolism has to be considered a powerful actor of cell biology: almost any perturbation in cellular physiology is taken to be characterized by a specific metabolic fingerprint, that is, by reproducible changes in specific areas of the metabolome, both at the cellular level and in the plasma (57). Of course, these changes may be quite small and detectable only with appropriate technical approaches (57). The interpretation of these metabolome fingerprints is going to be mightily improved by the previously described constraint-based models (47), and even more by the derived understanding of the design principles that govern CMR. It is of interest that, as predicted by CMR design principles, OXPHOS defects due to mtDNA mutations are able to induce glutamine anaplerosis, also in human muscle cells (58).
In order to use metabolic fingerprinting for diagnostic or prognostic investigations, it would be useful to develop a data bank of metabolic profiles of different cancer cell lines and patient-derived xenografts (PDXs) with different proliferative activity in vitro and aggressiveness in xenografts (Fig. 4A). Moreover, one may want to find out how systemic environmental processes (59) may affect CMR, or how biophysical forces generated by mechanical or bioenergetic events (60-61) are able to activate or modify CMR. A further relevant issue is given by the fact that tumors which share the same organ localization and have similar histological features, but different sets of biomarkers, may have very different clinical aggressiveness and outcome. Cancer cell lines derived from these tumors often maintain the differential aggressiveness, as growth rate in vitro (Fig. 4A) and in xenografts. It would be of great interest to identify the metabolic pathways which sustain, for instance, the differential aggressiveness of various breast cancer cell lines (62) or of lung cancer cell lines, so as to identify which molecular pathways are at the basis of this very relevant clinical feature. A number of cancer metabolic inhibitors have been constructed and are being assessed for their potential therapeutic role (63) (Fig. 4B). Finally, metabolomics profiles (Fig. 4C) could be grouped, by statistical analyses, into various classes, which could be tested for their sensitivity to various available drugs (Fig. 4D), alone or in combination, in vitro or in xenografts. Data integration using a variety of statistical and modeling tools may allow each class of metabolic profile to be matched with the most effective drug, or drug combination (Fig. 4E). The metabolomics analysis of primary tumor cells from each patient (Fig. 4F) will be tested against the correlation matrix identified in Figure 4E, so as to offer useful support to clinicians in the choice of the first treatment (Fig. 4G).
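The workflow of Fig. 4 can be illustrated by a minimal sketch, under strong simplifying assumptions: each fingerprint is reduced to three hypothetical features, the classes are taken as already defined, and the "correlation matrix" is approximated by the mean sensitivity of each class to each drug. All names and numbers below (`REFERENCE`, `drug_X`, `drug_Y`, the class labels) are invented for illustration and do not come from real data or from the cited studies.

```python
from math import dist
from statistics import mean

# Hypothetical reference collection (Fig. 4A-D): metabolic fingerprint,
# class label from a prior statistical analysis, drug sensitivities (0..1).
REFERENCE = [
    {"profile": (0.9, 0.2, 0.1), "cls": "glycolytic",
     "sens": {"drug_X": 0.8, "drug_Y": 0.3}},
    {"profile": (0.8, 0.3, 0.2), "cls": "glycolytic",
     "sens": {"drug_X": 0.7, "drug_Y": 0.2}},
    {"profile": (0.2, 0.9, 0.3), "cls": "glutamine_dependent",
     "sens": {"drug_X": 0.2, "drug_Y": 0.9}},
    {"profile": (0.3, 0.8, 0.2), "cls": "glutamine_dependent",
     "sens": {"drug_X": 0.3, "drug_Y": 0.7}},
]


def class_centroids(samples):
    """Mean fingerprint of each metabolic class (Fig. 4C)."""
    by_cls = {}
    for s in samples:
        by_cls.setdefault(s["cls"], []).append(s["profile"])
    return {c: tuple(mean(col) for col in zip(*profiles))
            for c, profiles in by_cls.items()}


def class_drug_table(samples):
    """Mean sensitivity of each class to each drug (stand-in for Fig. 4E)."""
    by_cls = {}
    for s in samples:
        for drug, value in s["sens"].items():
            by_cls.setdefault(s["cls"], {}).setdefault(drug, []).append(value)
    return {c: {d: mean(v) for d, v in drugs.items()}
            for c, drugs in by_cls.items()}


def suggest_drug(profile, centroids, table):
    """Assign a patient fingerprint (Fig. 4F) to the nearest class and
    return that class with its most effective drug (Fig. 4G)."""
    cls = min(centroids, key=lambda c: dist(profile, centroids[c]))
    drug = max(table[cls], key=table[cls].get)
    return cls, drug


centroids = class_centroids(REFERENCE)
table = class_drug_table(REFERENCE)
print(suggest_drug((0.85, 0.25, 0.15), centroids, table))
```

A nearest-centroid rule is used here only because it is the simplest classifier to read; the statistical and modeling tools evoked in the text would replace both the class definition and the matching step.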

Systems Metabolomics and anti-cancer drug discovery
It is commonly understood that the pharma industry needs to deeply re-engineer the process of drug discovery and development: pharma companies are presently unable to produce really effective drugs for major multifactorial diseases, like cancer or neurodegeneration, at costs acceptable for families and National Health Systems (64).
A recent survey of the ten highest-selling drugs in the USA shows that, for each patient who responds with complete satisfaction to the treatment, between 4 and 25 patients who receive the same drug do not benefit from it (65). Some cultural reasons may explain this unsatisfactory situation. The first is that pharma strategic plans do not fully recognize that multifactorial diseases are sustained by complex biological processes, in which a large number of molecules interact, generating the function as an emergent property. Second, the complex biological process which underlies the disease is affected by patient variability, due to genetic and epigenetic causes and to lifestyle experiences, which therefore affects the response to drugs.
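To make the survey figures concrete, the ratio of non-responders to responders can be converted into the fraction of treated patients who benefit. The short sketch below only restates the numbers quoted above (4 and 25 non-responders per responder); the function name is introduced here for illustration.

```python
def responder_fraction(non_responders_per_responder):
    """Fraction of treated patients who benefit, when each responder is
    accompanied by the given number of non-responders."""
    return 1 / (1 + non_responders_per_responder)


best = responder_fraction(4)    # 1 patient in 5
worst = responder_fraction(25)  # 1 patient in 26
print(f"{best:.1%} to {worst:.1%}")
```

In other words, for these drugs the fraction of treated patients who benefit ranges from about one in five down to about one in twenty-six.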
The drugs presently available for cancer treatment are generally cytostatic, blocking signaling pathways that promote growth. A positive development may come from considering CMR as a system which fuels growth and devising a (combinatorial) strategy to stop it, relying on both experimental and computational approaches of systems metabolomics for its identification (Fig. 5A).
A second line for drug discovery may consider another strategy: the response to glucose deprivation, which brings cancer cells to death by mitochondrial collapse, affects a large number of cancer cell lines, but not normal cells. Being able to activate this process specifically in cancer cells by a drug treatment should offer the possibility not only to restrain the growth of cancer, but to eradicate the disease. Of course, it is necessary to fully understand the metabolic and bioenergetic responses that are generated, only in cancer cells, by glucose deprivation. Figure 5B presents a concept map of the signaling pathways, centered on mitochondrial PP2A, which have recently been proposed to control glucose deprivation-induced cancer cell death (66,67). Glucose withdrawal causes rapid membrane depolarization and an influx of Ca2+, which activates a pathway involving the kinase CAMK1 and the demethylase PPME1. The ensuing activation of PP2A promotes cell death (66). In addition, glucose starvation is followed by an increase of the α-ketoglutarate to fumarate ratio, which favours PHD2 activation, promoting cell death by degrading B55, the inhibitory subunit of PP2A (67). For other cancer cells, a MYC-dependent apoptosis induced by glutamine starvation has been described (68). This pathway, too, may be of interest for drug discovery.
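The two routes from glucose deprivation to cell death described above can be written down as a small signed graph, which makes it easy to check that both have a net death-promoting effect. This is only a transcription of the text into a data structure: the node names are labels chosen here, CAMK1 and PPME1 are merged into one node because the text does not detail their order, and the signs (+1 for activation, -1 for degradation or inhibition) follow the wording of the paragraph rather than the original reports (66,67).

```python
# Signed edges transcribed from the paragraph above (illustrative labels):
# +1 = activates or increases, -1 = degrades or inhibits.
EDGES = {
    "glucose_deprivation": [("depolarization_Ca2+_influx", +1),
                            ("aKG_fumarate_ratio", +1)],
    "depolarization_Ca2+_influx": [("CAMK1_PPME1", +1)],
    "CAMK1_PPME1": [("PP2A", +1)],
    "aKG_fumarate_ratio": [("PHD2", +1)],
    "PHD2": [("B55", -1)],   # PHD2 promotes degradation of B55
    "B55": [("PP2A", -1)],   # B55 described as inhibitory subunit of PP2A
    "PP2A": [("cell_death", +1)],
}


def signed_routes(source, target, path=None, sign=+1):
    """Yield (path, net_sign) for every acyclic route from source to target;
    net_sign is the product of the edge signs along the route."""
    path = (path or []) + [source]
    if source == target:
        yield path, sign
        return
    for nxt, edge_sign in EDGES.get(source, []):
        if nxt not in path:
            yield from signed_routes(nxt, target, path, sign * edge_sign)


routes = list(signed_routes("glucose_deprivation", "cell_death"))
for path, sign in routes:
    effect = "promotes" if sign > 0 else "opposes"
    print(" -> ".join(path), "|", effect, "cell death")
```

In the second route the two negative edges cancel out: degrading an inhibitor of PP2A amounts to activating PP2A, so both routes converge on the same death-promoting node.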
Another promising system is given by the mutator phenotype of cancer cells (Fig. 5C), in which DNA damage response (DDR), mismatch repair, aneuploidy and similar processes play a role, increasing the mutation load of developing tumors (69). Given that trajectories of mutations appear to follow detectable patterns (70,71,72) and that nutrient-sensing and metabolic pathways have been reported to interfere with DDR (73), this biological process may also open new ways to drug discovery.
The realization is now ripe that research has to move away from oncogenes as the only valid drug targets and identify new properties, unique to cancer cells, in order to generate new anticancer therapies (74). In this note I am focusing on new properties linked to metabolism, which is at the crossroads of growth and death of cancer cells. In this context one also has to pay attention to the possibility that diet may interfere with drug treatment (75).

Perspectives
The case story presented in this paper offers useful indications for developing a strategy to investigate biological complexity. First, it shows that collecting large sets of heterogeneous, but logically connected, cancer-related Big Data and treating them with the most sophisticated algorithms of Artificial Intelligence only allows the discovery of statistical correlations between, for instance, signatures of oncogenic mutations and drug response. In the absence of more reliable indications, these correlations are of some use for precision medicine, but they cannot guarantee that each patient receives the best available drug. Moreover, this kind of analysis does not yield a molecular, mechanistic and predictive understanding of the biological complexity of cancer, which is the real aim of scientific knowledge. The quest for a rational predictive theory of cancer has been pursued for many years (76), and the findings discussed in this note, showing that unrestricted growth, a cancer property, depends upon a process (metabolism) that can be rationalized by a predictive model, offer a stepping stone for this long-standing hunt.
Moving towards this goal has been shown to require a systems approach, in which biomolecular scientists work in tandem with computer scientists, exploiting the power of computing machines. Various steps characterize this strategy. It starts from a reduction of complexity, in which the large cancer phenotype is disassembled into various interacting modules (20), and moves to the recognition of the core module, responsible for the most relevant property: in the case examined in this note, unrestricted cancer cell proliferation (28). Omics analyses of this property, subjected to Artificial Intelligence approaches, from data mining to pattern recognition, from neural networks to machine learning, together with several ad hoc experimental studies, are able to identify the large functional molecular network which supports the core function. In the case of unrestricted cancer cell proliferation, the core function has been shown to be generated by a cancer metabolic rewiring (CMR). This fact allows the construction of a constraint-based model, whose innovative investigation (47) has led to the identification of design principles, yielding the rules which govern the basic property of cancer cells.
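For readers less familiar with constraint-based modeling, the generic textbook formulation of flux balance analysis is recalled below. It is given only as an orientation and is not necessarily the specific formulation used in (47). The symbols are introduced here: S is the stoichiometric matrix (metabolites by reactions), v the vector of reaction fluxes, c the weights of the objective (for instance, biomass production), and the bounds on each flux encode boundary conditions such as nutrient availability.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

The steady-state constraint S v = 0 states that no internal metabolite accumulates; the space of flux distributions compatible with the constraints is what such models explore when different boundary conditions are imposed.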
The knowledge of the design principles and the availability of a mathematical model, able to predict the effects of various perturbations on the behavior of CMR, will be able to guide further investigations to elaborate, for instance, on the definition of the complex molecular regulatory network impinging on CMR (Fig. 3); on the modulation of transcription by metabolites generated by CMR; on the role of mitochondrial DNA and protein synthesis in the development of CMR (Fig. 3); and on the connections of various facets of CMR with functional properties, like aggressiveness. The design principles described in Fig. 2 and Fig. 3 are, for instance, able to give a rational interpretation of the recently reported anti-cancer effect of antibiotics able to block bacterial (and mitochondrial) protein synthesis dependent on 70S ribosomes (77). It is interesting to note that Figure 3 indicates the presence of a circular causality: not only the linear causality from DNA to proteins and to cancer phenotype, but a circular network in which signals coming from metabolism modify, by epigenetic modification, the flow and quality of expressed genetic information, which, in turn, may alter metabolic fluxes.
Metabolic rewirings are hypothesized to underlie aging (78), metabolic syndrome (79), and nutrition and wellness (80). The construction of the corresponding models and the discovery of their design principles will substantially advance our understanding in these challenging areas. In the future, the integration of metabolic rewirings with Cell Atlas findings may offer a more profound understanding of human cells, just as we gain a more profound appreciation of an iconic building from its pictures and blueprints.
In conclusion, the alliance of human intelligence with artificial intelligence appears to be necessary for the development of the science of biological complexity, which is going to mark our future; as a consequence, the organization of scientific research will profoundly change. No more data generation groups on one side and computer scientists on the other, trying to extract meaning from data on functions they do not deeply understand end-to-end, but mixed teams of different domain-specific professionals, as predicted by Weaver (24) and recently discussed by Aoun (81), inspired by the knowledge of design principles and relying on predictive models. Such teams would be coordinated by a "quarterback", as suggested by Aoun (81), able to draw all threads together and oversee the team of specialists to reach the targeted mission: something like the organization of large architectural design and engineering firms, which are changing the skyline of our cities. Science, like art, requires a new environment to nurture human creativity, with the ability to generate new concepts and ideas and to see new patterns, utilizing the enormous potential of augmented intelligence, so as to give sustainable answers to societal challenges.
The editorial assistance of Massimiliano Borsani is gratefully acknowledged.
Comments and suggestions of Marta Bertolaso have mightily improved this note.

Figure 2 A systems metabolomics approach to identify design principles of Cancer Metabolic Rewiring (CMR), which sustains enhanced cancer cell proliferation. A) In vitro growth curves of normal and tumor cells, showing the enhanced cell proliferation of cancer cells. B) CMR sustaining enhanced cancer cell proliferation (from Damiani et al., PLoS Comput Biol 13 (9), e1005758. 2017 Sep 28). Characteristic features: stimulated glucose and glutamine utilization; branched TCA cycle; glutamine utilization by reductive carboxylation producing citrate, which generates enhanced lipid production; production of lactate, both from glucose and glutamine; enhanced production of biomass. Many different redox-controlled pathways may generate CMR. C) Identification of the rules governing CMR: requirement of boundary conditions (see text). Very large variability of redox-controlled pathways which generate CMR.

Figure 3 Concept map of the complex regulatory network affecting cancer metabolic rewiring. Material fluxes are in blue, information fluxes in orange. Cancer cells receive nutrients and oxygen from the environment, and use metabolism to generate building blocks to synthesize cellular macromolecules (DNA, RNA and proteins). Macromolecules also have a strong informational role in determining cell functions. The kind of information and material fluxes in the cell is determined by the decoding of the external environment status and of the internal status. This process involves a complex signaling network, which senses external signals (e.g. nutrient supply, tumor micro-environment) and internal signals (e.g. AMP/ATP, NADH/NAD+, Acetyl-CoA levels, etc.). CMR is also affected by the mitochondrial genome and translational machinery, and by the cross-talk between the mitochondrial and nuclear genomes. It is therefore not surprising that an enormous number of perturbations (genetic and/or biomolecular) may affect cancer cell growth.

Figure 4 Concept map of the use of metabolomics fingerprinting for drug selection in precision oncology. Set up a large collection of human cancer cell lines and patient-derived xenografts (PDXs). Each sample will be tested for: growth rate in vitro (A); sensitivity to a panel of drugs, single or in combination, both in vitro (B) and in xenografts (D); systems metabolomics analysis (C). Data integration by a wealth of bioinformatics tools determines the correlations between each class of metabolic fingerprinting and sensitivity to a set of drugs, trying to infer the mechanism of drug action from the simulation analysis of the personalized models (E). Primary tumor cells from an upcoming patient will be assayed for metabolomics fingerprinting (F), then tested against the correlation software (G) to find the best available first-line drug treatment. Figure C1 is derived from Valtorta et al., Oncotarget. 2017; 8:113090-113104. Figure C2 is derived from Gaglio et al., Mol Syst Biol. 2011 Aug 16;7:523. Figure C3 is derived from Damiani et al., PLoS Comput Biol 13 (9), e1005758. 2017 Sep 28.

Figure 5 Approaches for a new drug discovery strategy targeting cancer-relevant modular processes. The most interesting cancer modules for this purpose appear to be: A) various redox-controlled specific CMR; B) pathways of cell death induced by glucose deprivation; C) pathways of the cancer mutator phenotype. The level of understanding of these different targets and the availability of mathematical models are very different. For the first, an opening has been achieved, but a much larger analysis is required; for the second, a preliminary network identification is described, but many aspects are not yet clear (for instance, the role of plasma membrane depolarization); the third one is the least known, but worthy of deeper understanding, given that it could generate drugs able to stop the basic cause of relapse. See text for more details.