ColoSTEM Dx kit: innovative alliance of cancer stem cells and glycosylation for earlier prediction of colon cancer aggressiveness and prognosis

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 4 November 2021 © 2021 by the author(s). Distributed under a Creative Commons CC BY license.

Colon cancer prognosis remains difficult to predict, especially in the early stages. Recurrence rates remain high, even for early-stage disease after curative surgery. Carcidiag Biotechnologies has developed an immunohistochemistry (IHC) kit called ColoSTEM Dx, based on a MIX of biotinylated plant lectins that specifically detects colon cancer stem cells (CSCs) through glycan patterns that they specifically (over)express. A retrospective clinical study was carried out on tumor tissues from 208 non-treated and 21 treated patients with colon cancer, which were stained by IHC with the MIX. The clinical performances of the kit were determined, and its prognostic and predictive values were evaluated. With a diagnostic sensitivity of 78.3% and a diagnostic specificity of 70.6%, the kit shows good clinical performances. Moreover, patient prognosis is significantly poorer when the MIX staining is “High” compared to “Low”, especially for 5-year overall survival and for early stages. The ColoSTEM Dx kit allows an earlier and more precise determination of patients’ outcome. It thus provides an innovative clinical tool for predicting tumor aggressiveness earlier and for determining prognostic value with regard to therapeutic response in colon cancer patients.


Introduction
Colon cancer represents the second leading cause of death from cancer [1][2][3]. Diagnosis is usually based on the pathological staging classification (pTNM; stages I to IV) [4]. Surgical resection is the only curative method at present. Although the prognosis has improved in recent years, survival rates vary widely by stage, with 85% 5-year net survival for stage I and 50% for stage III [5]. Indeed, nearly 10% of stage I, 30% of stage II and 55% of stage III patients will present a metachronous cancer or a recurrence (locoregional or metastatic) within 5 years postoperatively [6]. This high risk of relapse calls for further improvement in the early detection of colon cancer and for a personalized evaluation of patients' outcome and prognosis. This approach implies a systematic and precise determination of disease aggressiveness in order to strengthen patient follow-up and management [7].
Glycosylation is one of the most important posttranslational modifications of lipids (glycolipids) and proteins (glycoproteins), and results from the highly coordinated action of glycosyltransferases and glycosidases. Glycoproteins and glycolipids regulate a diverse range of key biological and cellular functions, including differentiation, proliferation, growth and pluripotency. Alterations in glycosylation processes (i.e., aberrant glycosylation) are linked to colon cancer development, progression, metastases and therapeutic failures [8][9][10][11]. Aberrant glycosylation constitutes a hallmark of cancers and might even lead to the acquisition of a stemness phenotype [12].
Few studies have reported a correlation between altered glycosylation processes and the induction and/or regulation of the CSC phenotype and properties. Overexpression of truncated O-glycan forms, such as the Tn antigen (Ag), is involved in the development and induction of colon oncogenic features (tumorigenesis, cell growth, invasion, metastases and resistance to UV-induced apoptosis) [27]. The expression of β-1,4-N-acetylgalactosaminyltransferase 3 is upregulated in colonospheres, and its knockdown decreases sphere formation and CSC marker expression (OCT-4 and NANOG) [28]. Overexpression of α-2,6-sialyltransferase and of α-N-acetylgalactosaminide α-2,6-sialyltransferase 1 are both correlated with (i) colon CSC enrichment (increased CD133 and ALDH1 expression, as well as sphere-forming ability), and (ii) acquired resistance to chemotherapy (irinotecan and 5-fluorouracil) and EGFR-targeted therapy (gefitinib) [29][30][31][32]. The FUT9 gene, which encodes the α-1,3-fucosyltransferase, plays a complex dual role in colon cancer development and malignancy. Knockdown of α-1,3-fucosyltransferase strongly decreases sphere formation, growth of xenograft tumors and expression of OCT-4 and CD44, whereas it increases cell proliferation and migration. FUT9 expression supports colon cancer aggressiveness. Indeed, its expression at early stages is required for CSC expansion and colon cancer initiation. On the contrary, FUT9 downregulation at later stages promotes colon cancer progression [33]. Most colon CSC surface markers are glycoproteins. They differ from their normal counterparts by the expression of tumor-specific glycans [8,26]. CD44 splice variants carry the oncofetal carbohydrate T and sialyl-Tn (sTn) Ag, which correlates with increased metastatic potential of colon cancer cells [34]. Moreover, rather than the expression of total CD133 protein, it is the expression of a specific glycan epitope (AC133) that could constitute a "bona fide" CSC marker [25].
Altogether, these data suggest that a better characterization of colon CSC glycosylation profiles could pave the way to identifying new, more efficient CSC biomarkers, in order to improve the specific detection of CSCs within tumors and thus their targeting.
Based on this knowledge and on current clinical needs, Carcidiag Biotechnologies has developed the ColoSTEM Dx kit, which specifically detects colon CSCs within heterogeneous tumor cell populations. There is currently no clinically standardized way (i.e., no efficient prognostic biomarker) to provide a reliable and earlier prognosis for colon cancer patients. In this context, our kit provides innovative and reliable biomarkers, specific to colon CSCs, for a better and earlier stratification of patients at low or high risk of developing an aggressive disease and of relapsing. The ColoSTEM Dx kit is a tool well adapted to the personalized management of patients. More precisely, it uses a MIX of biotinylated plant lectins (UEA-1, Jacalin and ACA, mixed in a particular ratio) recognizing glycan patterns specifically (over)expressed by CSCs only. These glycan patterns are not expressed by "normal" stem cells or by differentiated tumor cells. This colon CSC-specific MIX was identified by lectin arrays and validated in vitro in research carried out in collaboration with the University of Limoges, which led to the filing of two patents (national registration numbers WO2016FR53196 and WO2016FR53197).
Based on these results, and in order to validate the ColoSTEM Dx kit for routine clinical use by IHC, a retrospective clinical study was conducted in collaboration with Limoges University Hospital on tumor tissues from non-treated patients with colon cancer (N=208). The prognostic value of MIX staining for 5- and 7-year overall survival (OS) was assessed according to clinicopathological data (i.e., stage, age and gender). The CSC-specific glycosylated patterns revealed by the MIX can be considered efficient biomarkers for a more accurate prediction of disease aggressiveness, outcome and prognosis in colon cancer patients, compared to the non-specific stem cell biomarker OCT-4. With a diagnostic sensitivity of 78.3% and a diagnostic specificity of 70.6%, the ColoSTEM Dx kit constitutes an innovative tool that can be used in clinical practice in a standardized way, to provide a more accurate prognosis and to better stratify patients at low and high risk of relapse. Finally, preliminary results obtained from treated patients' tissues (N=21) showed that the ColoSTEM Dx kit could also constitute a promising predictive tool for evaluating therapeutic response.

Evaluation of EpCAM High immunostaining and ALDH1 bright activity by Flow Cytometry
The percentages of EpCAM High cells within the AC133- and AC133+ sorted cells were analyzed by flow cytometry (FCM) from 5 × 10^4 cells. After saturation in 1% BSA in calcium- and magnesium-free DPBS 1X (10 min, 4°C), cells were incubated for 45 min at room temperature (RT) with an EpCAM mouse monoclonal Ab (clone VU1D9; Ozyme - Cell Signaling Technology, France) diluted at 1:150 in 1% BSA/DPBS. After a washing step in DPBS 1X (300 × g, 10 min, 4°C), cells were incubated for 30 min at RT in the dark with an Alexa-Fluor 633-conjugated goat anti-mouse secondary Ab (ThermoFisher Scientific, France) diluted at 1:1000 in DPBS 1X.
The enzymatic activity of ALDH1 in MIX+ and MIX- sorted cells was analyzed from 10^5 cells/mL, using the ALDEFLUOR kit (Stem Cell Technologies, France) according to the manufacturer's recommendations.

Patients and samples
Forty-six colon tumor tissues were collected from non-treated patients with colon cancer who underwent tumor resection at the Department of Digestive Surgery, General and Endocrine Surgery at Limoges University Hospital (France). Necrotic tissues (N=4) were excluded, leaving N=42.
Twenty-four tumor tissues were also collected from treated patients with colon cancer at Limoges University Hospital (France). Necrotic tissues (N=3) were excluded, leaving N=21.
Additional colon tumor tissues were obtained from two tissue microarrays (TMAs) built from a cohort of non-treated colon cancer patients with early-stage disease (I and II) from the "Centre de Ressources Biologiques - Institut Régional du Cancer Montpellier" (CRB-ICM, Montpellier, France; ICM-CORT-2016-26). Necrotic, absent or non-interpretable tumor tissues (N=15) were excluded, leaving N=80. Both TMAs also include N=50 non-tumor samples.
Clinicopathological data, including pTNM stage, gender, age and survival status at 5 and 7 years, were provided after baseline examination and diagnosis of the patients, which was based on histological analyses of biopsies according to the American Joint Committee on Cancer staging manual [37]. Survival analyses of non-treated and treated patients from all stages (refer to "statistical analysis" described below) were performed on N=128 and N=21 samples, respectively (Tables S1 and S2). Survival analyses at 5 years of non-treated patients with early-stage disease (refer to "statistical analysis" described below) were performed on N=70 samples (N=29 from CRB-ICM, N=27 from Limoges Hospital and N=14 from the AMSBIO TMA; Table S3).

MIX and OCT-4 IHC immunostaining
MIX staining was performed on N=208 and N=21 tumor tissues from non-treated and treated patients with colon cancer, respectively (refer to "patients and samples" section; Tables S1, S2 and S3). MIX/OCT-4 co-staining was performed on a subset of tumor tissues from the non-treated patients' cohort, i.e., N=42 tumor tissues from Limoges University Hospital (refer to "patients and samples" section; Table S1). Each staining was performed by IHC on paraffin-embedded histological sections (4 μm thick), in three main steps using the Leica Bond Max automatic staining platform (Leica Biosystems, France), according to the manufacturer's instructions.
1. Preparation and pretreatment of the tissues. Paraffin was removed using the Bond Dewax Solution (Leica Biosystems, France) and tissues were rehydrated under heat using the acidic buffer Bond Epitope Retrieval Solution 1 (pH 6; Leica Biosystems, France) for 5 min.
2. Immunostaining. The activity of endogenous peroxidases and biotins was blocked using the Bond Intense R Detection kit (Leica Biosystems, France). Tissues were incubated for 20 min with either the MIX alone, pre-diluted at a 1:2 ratio in the diluent supplied in the ColoSTEM Dx kit, or with both the MIX (1:2) and OCT-4 (OCT-4 polyclonal rabbit IgG Ab; ThermoFisher Scientific, France). MIX and OCT-4 staining were revealed using the Bond Intense R Detection kit and the Bond Polymer Refine Red Detection kit, respectively (Leica Biosystems, France). Nuclei were counter-stained by incubation with hematoxylin (Leica Biosystems, France) for 8 min.
3. Slide mounting. After dehydration in two successive baths of absolute ethanol (VWR, France) and toluene (ThermoFisher Scientific, France), for 5 min each, tissue slides were mounted using the Leica CV Ultra (Leica Biosystems, France) and examined under the Leica DM4 B photomicroscope (Leica Biosystems, France; 200x magnification).

Scoring method
MIX staining appears in brown at the apical membrane and/or in the cytoplasm. OCT-4 staining appears in red/pink within the nucleus (in blue) and/or the cytoplasm. The scoring method for both stainings was adapted from Ohara Y. et al. [38]. All tissues were stained either with the MIX alone or with both the MIX and OCT-4. The total absence of staining (score 0) or the presence of stained cells constitutes the first element of analysis. The second element of analysis is the proportion of stained cells, scored according to the following gradation: 1 = 1-25%, 2 = 26-50%, 3 = 51-75% and 4 = 76-100% of stained cells. The third element of analysis is the staining intensity, graded as follows: 1 = Low, 2 = Medium and 3 = High staining intensity. The scores obtained from both gradations are then added together, and the resulting sum yields 6 intermediate scores, ranging from 2 to 7, which are finally grouped into 3 final scores (Figure 1). Final scores of 1 and 2 are considered "Low staining" (MIX-Low and/or OCT-4-Low) and a final score of 3 is considered "High staining" (MIX-High and/or OCT-4-High).
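The scoring steps described above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the function names are hypothetical, and the grouping of intermediate scores (2 to 7) into final scores (1 to 3) is defined in Figure 1 and not spelled out in the text, so the even split used here is an assumption that should be replaced by the actual grouping.

```python
from typing import Dict

# Intermediate score (2-7) -> final score (1-3).
# ASSUMPTION: the real grouping is given in Figure 1 of the paper and is not
# reproduced in the text; this even split is a hypothetical placeholder.
ASSUMED_GROUPING: Dict[int, int] = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}


def proportion_score(percent_stained: float) -> int:
    """Score the proportion of stained cells: 0 = none, 1 = 1-25%, 2 = 26-50%,
    3 = 51-75%, 4 = 76-100%."""
    if not 0 <= percent_stained <= 100:
        raise ValueError("percent_stained must be between 0 and 100")
    if percent_stained == 0:
        return 0
    if percent_stained <= 25:
        return 1
    if percent_stained <= 50:
        return 2
    if percent_stained <= 75:
        return 3
    return 4


def staining_class(percent_stained: float, intensity: int,
                   grouping: Dict[int, int] = ASSUMED_GROUPING) -> str:
    """Return 'Unstained', 'Low' or 'High' for one tissue.

    intensity: 1 = Low, 2 = Medium, 3 = High staining intensity.
    """
    proportion = proportion_score(percent_stained)
    if proportion == 0:
        return "Unstained"  # score 0: total absence of staining
    if intensity not in (1, 2, 3):
        raise ValueError("intensity must be 1, 2 or 3")
    intermediate = proportion + intensity  # ranges from 2 to 7
    final = grouping[intermediate]         # ranges from 1 to 3
    # Final scores 1 and 2 are "Low staining"; final score 3 is "High staining".
    return "High" if final == 3 else "Low"


if __name__ == "__main__":
    print(staining_class(80, 3))  # many cells, strong intensity
    print(staining_class(10, 1))  # few cells, weak intensity
```

Passing the grouping as a parameter keeps the assumed part isolated, so the mapping from Figure 1 can be substituted without touching the rest of the logic.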

Evaluation of clinical performances of the ColoSTEM Dx kit
The clinical performances of the ColoSTEM Dx kit were determined from N=166 tumor tissues (N=86 from the commercial TMA (AMSBIO) and N=80 from the CRB-ICM Montpellier cohort) and N=136 non-tumor tissues (N=86 tumor borders from the commercial TMA (AMSBIO) and N=50 non-tumor samples from the CRB-ICM Montpellier cohort). Diagnostic sensitivity is the percentage of tumor tissues stained with the MIX (true positives) among all tumor tissues, i.e., stained ones plus unlabeled ones (false negatives), as follows: Sensitivity (%) = 100 x (True positives / (True positives + False negatives)). Diagnostic specificity is the percentage of tumor borders or non-tumor tissues unstained with the MIX (true negatives) among all non-tumor tissues, i.e., unstained ones plus labeled ones (false positives), as follows: Specificity (%) = 100 x (True negatives / (True negatives + False positives)).

Statistical analysis
Statistical analyses and graphics were performed with StatView 5.0 (USA), Prism 7 (GraphPad, USA) and the R environment (version 4.0.3). Statistical analysis of the in vitro clonogenicity assay was performed with an ANOVA/ANCOVA test. Survival rates according to MIX/OCT-4 co-staining were analyzed in the non-treated patients from the Limoges Hospital cohort, at 5 years, i.e., only patients whose survival is  60 months at the last visit time were retained (N=42; refer to "patients and samples" and Table S1). Survival rates of all non-treated patients were analyzed according to their clinicopathological data and MIX staining at (i) 5 years and (ii) 7 years, i.e., patients whose survival is  84 months at the last visit time (respectively N=79 and N=128; refer to "patients and samples" and Table S1). Survival analysis at 5 years of non-treated patients with early-stage disease was achieved by combining three cohorts composed of patients from Limoges Hospital, from CRB-ICM Montpellier and from a cohort provided by AMSBIO (N=70; refer to "patients and samples" and Table S3). Survival rates of treated patients were analyzed according to MIX staining, at 5 years (N=21; refer to "patients and samples" and Table S2). The prognostic value of each parameter for outcome was assessed using the Kaplan-Meier method and the log-rank test (Mantel-Cox). For each variable, the hazard ratio (HR) was estimated using a univariate Cox model and expressed with its 95% confidence interval (95% CI). Multivariate analysis was carried out using a Cox regression model on the single features identified by univariate Cox modeling. Survival analyses were performed in R using the survival and survminer packages. The proportional hazards assumption for the Cox regression model fit was verified using the cox.zph function of the survival package. A p-value below 0.05 was considered significant.
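For readers unfamiliar with the Kaplan-Meier method mentioned above, the product-limit estimator can be sketched as follows. The study itself used the R survival and survminer packages; this standalone Python version is not the authors' code, the function name is hypothetical, and the follow-up times in the usage example are made-up toy values, not study data.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier (product-limit) survival estimate.

    times:  follow-up time of each patient (e.g., in months)
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns (time, survival probability) pairs at each distinct death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # deaths + censored patients leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Hypothetical toy cohort: 6 patients, 3 observed deaths, 3 censored.
    toy_times = [6, 12, 12, 30, 60, 60]
    toy_events = [1, 1, 0, 1, 0, 0]
    for t, s in kaplan_meier(toy_times, toy_events):
        print(f"t = {t} months, S(t) = {s:.3f}")
```

Comparing two such curves (for instance MIX-Low versus MIX-High) is what the log-rank test formalizes; that step is left to a dedicated statistics package.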

The ColoSTEM Dx kit allows efficient isolation and enrichment of a cell subpopulation with stemness properties.
The ColoSTEM Dx kit originates from research carried out by the EA3842 laboratory (University of Limoges; patent national registration number 1561763; publication number 3044680). It aims at the specific detection of colon CSCs in both heterogeneous cell populations and tumors. It is based on the use of a MIX of biotinylated plant lectins that recognize glycan patterns specifically (over)expressed by these cells, i.e., not by differentiated cancer cells, within heterogeneous colon tumor tissues. The proofs of concept of the ColoSTEM Dx kit, i.e., the identification of the MIX by lectin arrays and its validation for the specific detection and enrichment of colon CSCs from several colon adenocarcinoma cell lines (including HT-29), are reported in detail in the patent mentioned above.
The main and most relevant results are summarized in Figure S1. Briefly, HT-29 cells were sorted by MACS, with either the MIX (MIX+ and MIX- sorted cells) or an AC133 Ab (AC133+ and AC133- sorted cells). Some CSC characteristics and properties were then evaluated: protein expression and enzymatic activity of stem cell markers (EpCAM and ALDH1), and sphere-forming ability. While there are as many EpCAM High cells in AC133+ sorted cells as in AC133- cells, there are 7.5-times more EpCAM High cells in MIX+ sorted cells than in both MIX- (normalized to 1) and AC133+ cells (Figure S1A). Consistently, there are 4.7-times more ALDH1 bright cells in MIX+ sorted cells (74.73%) than in MIX- cells (15.6%) (Figure S1B). Finally, MIX+ sorted cells have a significantly higher capacity to form spheres than MIX- cells, even when seeded at low densities (Figure S1C).
Sorting with the ColoSTEM Dx kit is thus more efficient than AC133-based cell sorting for specifically detecting and enriching colon CSCs. These results suggest that the ColoSTEM Dx kit improves CSC detection and cell sorting.

The ColoSTEM Dx kit improves colon CSCs detection and allows a more accurate prognosis than the standard stem cell marker OCT-4
The specificity of the MIX for detecting colon CSCs by IHC on tumor tissue, and its usefulness for evaluating patient prognosis, were assessed and compared with those of a standard stem cell marker, OCT-4. MIX and OCT-4 staining were performed on N=42 tumor tissues from non-treated patients (Table S1 and Figure S2). Among stained samples, half were MIX-Low and half were MIX-High (Figure S2A), whereas OCT-4-High staining was present in a broad panel of samples (almost 80%; Figure S2B), suggesting that OCT-4 is not able to discriminate CSCs from the other subpopulations of a heterogeneous tumor. In addition, among samples double-stained with MIX and OCT-4, tissues were mainly divided into MIX-Low/OCT-4-High and MIX-High/OCT-4-High (Figure S2C). The intensity of MIX staining was not linked to OCT-4 staining and was independent of the clinicopathological characteristics of patients, except for gender (Table 1). However, this association was not found later in larger patient cohorts (Table 2). Altogether, these results suggest that the ColoSTEM Dx kit is relevant for discriminating cancer stem cells from healthy stem cells. They also indicate that the MIX detects colon CSCs more specifically than OCT-4, whose staining within the tumor colon epithelium does not seem to be restricted to CSCs but extends to all stem cells (healthy and cancerous) and progenitors.
Survival rates at 5 years were evaluated by Kaplan-Meier curves according to either OCT-4 staining (OCT-4-Low versus High; Figure 2A) or MIX staining (MIX-Low versus High; Figure 2B). Univariate and multivariate Cox regressions were performed to estimate the prognostic value and risk scores associated with OCT-4 and MIX staining (Figure 2C). Representative pictures of MIX/OCT-4 co-staining are shown in Figure 2D.
Finally, survival rates at 5 years were also evaluated according to MIX/OCT-4 co-staining on the same tissues (N=41), i.e., MIX-Low/OCT-4-Low, MIX-Low/OCT-4-High, MIX-High/OCT-4-Low and MIX-High/OCT-4-High.
Because too few MIX-Low/OCT-4-Low and MIX-High/OCT-4-Low tumor tissues were included (n=6 and 3, respectively; Figure S2C), Kaplan-Meier curves were only drawn and analyzed for MIX-Low/OCT-4-High and MIX-High/OCT-4-High co-staining (Figure 2E). Interestingly, and consistently with previous observations, MIX-High/OCT-4-High co-staining predicts a significantly poorer prognosis than MIX-Low/OCT-4-High co-staining, with a strong decrease in median survival (Figure 2E, p=0.015). MIX-High/OCT-4-High co-staining carries a hazard ratio of 5.3 (95% CI 1.2 to 23.8, p=0.0298) in a univariate Cox regression model.

Contrary to OCT-4, MIX staining levels are closely associated with patient survival. Indeed, compared to OCT-4, MIX-High staining significantly improves the detection and discrimination of colon CSCs. MIX-High staining might be a relevant cancer stem cell biomarker for monitoring disease aggressiveness and could be useful for establishing the prognosis upon treatment.

The ColoSTEM Dx kit allows earlier evaluation of patients' disease aggressiveness and prognosis, regardless of clinicopathological data
In order to evaluate and confirm the prognostic value of the ColoSTEM Dx kit, all tumor tissues from non-treated patients (N=128, Table S1) were stained with the MIX. Survival rates were evaluated at 5 and 7 years of OS, according to clinicopathological data, i.e., stage (early (I/II) and late (III/IV)), sex (men and women) and age (< and ≥ 60 years old) (Table 2).
Among the 79 tumor tissues included in the survival analysis at 5 years, six were excluded due to a total absence of MIX staining. Of the remaining tissues, 56% and 44% corresponded to early and late stages, respectively. Of the 41 early-stage tissues, 41% were MIX-Low and 59% were MIX-High. Among the 32 late-stage tissues, 47% were MIX-Low and 53% were MIX-High. Regarding OS at 5 years, MIX staining was independent of tumor stage (Table 2A, p=0.809).
Overall survival rates at 7 years were then analyzed from 128 tumor samples. As previously described, tumor samples without MIX staining were excluded (n=13). Among the 115 retained tumors, 63% and 37% corresponded to early and late stages, respectively. Of the 73 early-stage tumors, 33% were MIX-Low whereas 67% were MIX-High. Of the 42 late-stage tumors, half were MIX-Low and half were MIX-High. This result suggests that, whatever the time point of overall survival evaluated, MIX staining is independent of tumor stage (Table 2). Kaplan-Meier curves and Cox regression models were produced, and patients' survival rates were analyzed at 5 and 7 years according to MIX staining levels and stages (Figure 3 and Table 3). At 5 years, the prognostic significance of MIX scoring for OS is clearly supported by the survival curve (Figure 3A, p=0.011) and the univariate Cox model (HR: 2.1 with 95% CI 1.17 to 3.75, p=0.013; Table 3A). On the contrary, no statistically relevant difference is shown between MIX-Low and MIX-High survival rates at 7 years (Figure 3B and Table 3B). Note that sex and age do not have a significant impact on survival rates at 5 years (Table 3A and Figure S3A) or at 7 years (Table 3B and Figure S3B). Kaplan-Meier curves drawn for separate groups of patients, stratified according to sex (male and female) or age (below or above 60 years old), failed to show any difference in survival rates according to low or high MIX staining (data not shown). In brief, no statistically relevant difference in survival rates between MIX-Low and MIX-High staining was noted, regardless of age, even though a significant link had previously been identified by Chi-square test at 7 years of patient follow-up (p=0.014, Table 2). The same conclusion applies with regard to sex.
Since multivariate analysis revealed that late stage and a High MIX score were independent prognostic factors at 5 and 7 years of patient follow-up (Table 3), we chose to combine these two parameters in order to assess their impact on survival rates. We confirmed that a High MIX score is a risk factor for OS, regardless of pTNM staging (Figure 4). Among late-stage patients, survival was significantly poorer with MIX-High than with MIX-Low staining, with a doubling of the hazard ratio at both 5 and 7 years of OS (Figure 4C, left panel). The same tendency was observed for early-stage patients, although only the results at 5 years of OS were significant (Figure 4C, right panel). Notably, the results are statistically more pronounced and relevant for the 5-year follow-up: patients with MIX-Low staining have a high survival rate in comparison with MIX-High patients (Figure 4A and C). To accurately estimate the prognostic value of the MIX at a given stage, we performed survival analyses (Kaplan-Meier and univariate Cox regression) comparing the low and high MIX subpopulations at early (I/II) or late (III/IV) stage (Figure S4). At 5 years of patient follow-up, high MIX staining could be considered a marker of poor prognosis in early stage (HR: 3.3 with 95% CI 1.2 to 9.1, p=0.021; Figure S4A) and late stage (HR: 2.2 with 95% CI 1 to 4.6, p=0.039; Figure S4B). At 7 years of patient follow-up, the prognostic benefit of the MIX is lost for early stage (Figure S4C) but is slightly maintained, without reaching significance, for late stage (HR: 1.95 with 95% CI 0.93 to 4.1, p=0.076; Figure S4D). Altogether, these results suggest that the MIX could be considered an efficient prognostic marker to predict disease aggressiveness from the early post-surgery phase, within the 5 years following resection. The prognostic value of the MIX would be most useful at early stages, where it could help to adapt the therapeutic strategy and improve patient management.
For this reason, we chose to enlarge the early-stage subpopulation, which was composed of 16 stage I and 25 stage II patients (Table S1).
We therefore assembled a new cohort of early-stage CRC tissues (Table S3), with a total of 41 stage I and 29 stage II patients at 5 years of follow-up. The survival analysis performed on this cohort supports High MIX staining as a poor prognostic factor. Only a slight, non-significant difference was observed between High and Low MIX in the early-stage subpopulation (p=0.18), with MIX-High showing a moderate trend toward poor prognosis in the univariate Cox model (HR=1.764, 95% CI 0.6514-4.779, p=0.18; data not shown). Thus, within early stages, we tried to distinguish stage I from stage II by combining pTNM stage and MIX scoring. In this case, the survival rate drops sharply in stage II/High MIX patients compared to stage I/High MIX patients, suggesting that the relative risk is markedly increased when high MIX staining is detected in stage II patients (Figure S5). At 7 years of overall survival, the ColoSTEM Dx kit does not allow prediction of disease evolution (i.e., patients' prognosis), regardless of age or sex. However, with respect to stage, the prognostic value of MIX staining appeared more reliable in late-stage colorectal cancer patients (Figure S4C and D). On the contrary, at 5 years of overall survival, the ColoSTEM Dx kit markedly predicts disease aggressiveness and allows patients to be stratified into good or poor prognosis, with a low or high risk of relapse after curative surgery, especially at early stages.
Altogether, these results show that the specific glycan motifs of colon CSCs detected by the ColoSTEM Dx kit constitute a prognostic factor independent of pTNM staging and other clinicopathological data, and allow better and worse prognoses to be discriminated. Importantly, our kit allows a good prediction of colon cancer aggressiveness and prognosis at all stages, with more pronounced values for early stages, within the first 5 years after curative surgery.

The ColoSTEM Dx kit displays great clinical performances
The clinical performances of the ColoSTEM Dx kit, i.e., diagnostic specificity and sensitivity, were determined from N=166 tumor tissues (N=86 from the AMSBIO TMA and N=80 from the CRB-ICM Montpellier cohort) and N=136 non-tumor tissues, i.e., tumor edges and non-tumor samples (N=86 from the AMSBIO TMA and N=50 from the CRB-ICM Montpellier cohort). Among the N=166 tumor tissues, N=36 showed an absence of MIX staining (false negatives) and N=130 were stained (true positives). Among the N=136 non-tumor tissues, N=96 showed an absence of MIX staining (true negatives) and N=40 were stained (false positives). According to the formulas described in the "materials and methods" section, diagnostic sensitivity and specificity reach 78.3% and 70.6%, respectively.
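As a quick arithmetic check, the reported values can be recomputed from the counts above using the sensitivity and specificity formulas given in the methods. The short Python sketch below is only illustrative (the function names are hypothetical); it treats the 40 stained non-tumor tissues as false positives, as the specificity formula requires.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity (%) = 100 x TP / (TP + FN)."""
    return 100 * true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Specificity (%) = 100 x TN / (TN + FP)."""
    return 100 * true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Counts reported in the text: 166 tumor tissues, 136 non-tumor tissues.
    sens = sensitivity(true_pos=130, false_neg=36)
    spec = specificity(true_neg=96, false_pos=40)
    print(f"Sensitivity: {sens:.1f}%")  # 78.3%
    print(f"Specificity: {spec:.1f}%")  # 70.6%
```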

Specific glycan motifs of colon CSCs evidenced by the ColoSTEM Dx kit could also constitute promising predictive biomarkers
In order to evaluate the predictive value of the ColoSTEM Dx kit, 21 tumor tissues from treated patients were stained with the MIX: 42.8% and 57.1% of tissues were MIX-Low and MIX-High, respectively (Table S2 and Figure 5A). Kaplan-Meier curves were drawn for the 5-year overall survival of treated patients according to MIX staining levels (Low versus High; Figure 5B). The 5-year OS of the MIX-High subgroup is significantly poorer than that of the MIX-Low subgroup, with a strong decrease in median survival (p=0.016). Univariate Cox analysis identified the MIX score (p=0.03) as a predictive factor for overall survival in treated patients, whereas age (p=0.066) and pTNM staging (p=0.12) did not reach significance (Table 4, left panel). Multivariate analysis confirms that the MIX score is an independent prognostic factor (HR: 6.98 with 95% CI 1.1 to 44.03, p=0.0387; Table 4, right panel). These preliminary results indicate that the ColoSTEM Dx kit might also constitute a promising predictive tool, i.e., a companion test, in order to (i) allow a better prediction of therapeutic response and risk of relapse and (ii) improve therapeutic management for each patient.

Discussion
Cancer stem cells play a key role in colon cancer evolution and have major implications for cancer therapy. Currently, CSCs are not used as biomarkers in clinical routine, although these cells could reflect tumor aggressiveness and might be of prime importance for diagnosis or prognosis. Indeed, CSCs enable resistance to conventional treatments and thereby contribute to therapeutic failure. Thus, the reliable detection of CSCs in patient samples might improve future patient management and survival. Nevertheless, no kit or device developed for clinical or translational research is currently able to detect cancer stem cells within tissues. Furthermore, the clinical use and significance of CSC biomarkers are still restricted by the risk of confusion with biomarkers expressed by adult non-cancerous stem cells [8,11]. In this context, Carcidiag Biotechnologies decided to develop a new device for detecting CSCs in solid biopsies from patients. The ColoSTEM Dx kit, developed by Carcidiag Biotechnologies, is based on the detection of glycoproteins known to be specifically expressed at the surface of colon cancer stem cells (colon CSCs). This diagnostic tool uses a MIX of biotinylated plant lectins that recognize glycan patterns specifically expressed or overexpressed by CSCs only. Normal stem cells and differentiated tumor cells are not detected by the lectin MIX. Sorting HT-29 cells for MIX positivity shows an enrichment of the EpCAM-high and ALDH1-high cell subpopulation, characteristics consistent with a stem cell phenotype. EpCAM expression and ALDH1 activity are currently used to define CSC populations in digestive cancers [37]. Their expression has been associated with poor prognosis, for both disease-free and overall survival, in colorectal cancer. In addition, MIX-positive cells are highly able to form colonospheres, up to 8 times more than their negative counterparts, again reflecting the stem cell status of these cells.
Based on this in vitro evaluation, the potential stem cell detection capacity of the MIX was tested by immunohistochemistry on 42 CRC samples, in comparison with OCT-4 staining, a classical stem cell marker. While MIX-High staining seems restricted to a subset of tissues, OCT-4-High staining is present in a broad panel of tissues. No significant association was found between the staining intensities of the two markers. These results are consistent with previous work demonstrating that, although OCT-4 is considered a pluripotent stem cell marker required to enhance self-renewal ability, its expression was reported to be restricted in normal colon, polyp and CRC. Thus, OCT-4 analysis by immunohistochemistry is not useful for identifying cancer and is of limited interest for characterizing cancer stem cells for diagnosis [38]. Accordingly, survival curves according to low or high OCT-4 staining were similar, while MIX-High staining reveals a significant decrease in overall survival, suggesting that this biomarker is clearly more relevant for monitoring patients and might be of prime interest for patient management. High MIX staining is considered an independent poor prognostic factor, as validated by univariate and multivariate Cox regression (HR: 4.2). The poor prognostic value was confirmed by the decreased median survival of CRC patients characterized by an OCT-4-High/MIX-High staining combination. As previously mentioned, we confirmed that OCT-4 staining alone (High vs. Low) is unable to discriminate patients with good or poor prognosis, suggesting that OCT-4 is unlikely to detect CSCs in tissue samples. These results suggest that MIX staining is specific to CSCs, could be useful for their detection in tumor samples, and could thus predict the presence of CSCs as well as the associated risk of recurrence.
In this context, the significance of the ColoSTEM Dx kit in the detection of CSCs was assessed on several cohorts of patients in order to demonstrate that it might provide a novel prognostic tool. MIX staining showed prognostic value at early stages in the 5-year patient cohort. The prognostic value of MIX staining for overall survival was confirmed at 5 years but not demonstrated at 7 years, suggesting that the MIX could be of prime importance at early stages of CRC and might be used to predict treatment outcome at this stage. Since MIX staining constitutes a factor of poor prognosis that is independent of pTNM staging and of other clinical parameters (age and gender), it could be crucial in the future management of patients. Similarly, stratifying patients by the combination of MIX staining and stage improves patient classification. Indeed, this combined analysis highlights that high MIX staining is a marker of poor prognosis: whatever the stage, MIX-high is associated with a poor prognosis, with an HR of 3 to 8 in early or late stages, respectively, at 5 years of follow-up. The same trend was observed at 7 years, but it was not significant for early stages. Thus, patients in the 7-year cohort probably have CRC with a very good prognosis. The poor prognostic value of MIX staining was confirmed in treated patients, supporting the idea that the MIX could also predict patients' treatment outcome at early stages and might, in the future, help limit tumor burden through early detection of recurrence. Even if new tools such as Immunoscore and circulating tumor DNA help to accurately characterize patients with minimal residual disease, they do not identify the specific presence of CSCs within the tumor. Thus, although these approaches make it possible to adapt patient management with personalized adjuvant treatment, they fail to eradicate CSCs, which increases the risk of recurrence.
In contrast, the ColoSTEM Dx kit can detect CSCs even in early-stage tumors. It can therefore complement current approaches. Nevertheless, further developments are required, including validation in prospective multicenter interventional outcome studies, in order to confirm on a large cohort of treated patients that MIX staining has poor prognostic value in early stages.

Conclusions
In conclusion, the ColoSTEM Dx kit appears to detect CSCs more efficiently than OCT-4 staining and could be a useful new tool in the clinical management of colon cancer, owing to its potential to predict tumor aggressiveness, even in early-stage tumors.

Patents
The patent under the national registration number 1561763 (N° WO2016FR53196 and WO2016FR53197; publication numbers 3044680 and 3044681) results from part of the work reported in this manuscript, i.e., the experimental data depicted in Figure S1.

Supplementary Materials:
The following are available online at www.mdpi.com/xxx/s1, Table S1: Characteristics of non-treated patients with colon cancer (N=128) and corresponding MIX and OCT-4 scoring, Table S2: Characteristics of treated patients with colon cancer (N=21) and corresponding MIX scoring, Table S3: Characteristics of non-treated early-stage (I and II) patients with colon cancer (N=70) and corresponding MIX scoring, Figure S1: In vitro characterization of MIX (ColoSTEM Dx kit) efficiency in colon CSC detection and enrichment, Figure S2: Distribution of tumor tissues from non-treated patients according to MIX staining, OCT-4 staining, or both, Figure S3: Association of gender and age with survival rates at 5 and 7 years, Figure S4: Association of MIX staining with survival rates at 5 and 7 years according to early stage (I/II) or late stage (III/IV), Figure S5: Combination of pTNM staging (early stages, i.e., I and II) and MIX scoring for survival analysis at 5 years.