3. Results
As shown in Table 1, a total of 201 patients (median age: 73.0 years, IQR: 64.8–81.6) who underwent laparoscopic surgery for colon cancer were recruited. Of these, 71 (35.3%) were included in the Enhanced Recovery After Surgery (ERAS) program, while 130 (64.7%) were not. A statistically significant difference (p < 0.05) was observed between the two groups; therefore, all statistical analyses were conducted separately for each group. In addition, clinical differences and outcomes may be attenuated in the ERAS group as a result of the program’s own intervention. The sample comprised 120 men (60.0%) and 81 women (40.0%), with no statistically significant differences in the sex distribution between the ERAS and non-ERAS groups. All recruited patients had tumours located in the colon and underwent laparoscopic surgery. The surgery duration was significantly longer in the ERAS group than in the non-ERAS group (median: 262 minutes, IQR: 228–307 vs. median: 240 minutes, IQR: 217.5–286.5; p < 0.05). Similarly, patients in the ERAS group had a longer hospital stay than those in the non-ERAS group (median: 6 days, IQR: 4–11 vs. median: 4 days, IQR: 3–5.25; p < 0.05). Postoperative complications were more frequent in the ERAS group (49% vs. 21%; p < 0.001), and mortality was also higher (14% vs. 5%; p = 0.036).
First, the sensitivity and specificity of the MUST score in identifying patients at higher risk of complications or prolonged hospital stay were evaluated. The MUST score was dichotomized by comparing MUST = 0 versus MUST > 0 for both the ERAS and non-ERAS groups. The results are presented in Table 2. As shown, sensitivity was consistently low (<27%) across all scenarios, limiting the usefulness of the MUST score as a screening tool. It is important to note that, in clinical practice, the commonly used threshold for the MUST score is >2; however, raising the threshold further reduces its sensitivity. Following the finding that the MUST score has limited utility as a screening tool to predict complications (low sensitivity), we assessed whether body composition data derived from CT images could provide additional, independent information beyond the primary variables used in the MUST criteria (i.e., weight, height, and age). To investigate this, and to evaluate potential multicollinearity among the independent variables included in the regression model, we conducted a Variance Inflation Factor (VIF) analysis that included these three primary variables along with two CT-derived parameters: skeletal muscle area (in cm²) and average skeletal muscle radiodensity (in Hounsfield Units, HU). All VIF values were below the commonly accepted threshold of 3, indicating no relevant collinearity. Specifically, the VIF values were as follows: age (1.25), weight (2.41), height (1.62), skeletal muscle area (2.60), and average muscle radiodensity in HU (1.67). Notably, derived variables such as Body Mass Index (BMI), Skeletal Muscle Index (SMI), and relative muscle area (in %) were excluded from the analysis, as they are computed directly from other included variables (BMI = weight/height²; SMI = muscle area/height²; muscle % = muscle area/region of interest area). Including such variables would have artificially increased the VIF values and introduced redundancy into the model.
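As an illustration of the collinearity check described above, the following minimal sketch computes VIF values as the diagonal of the inverse of the predictors' correlation matrix, which is equivalent to 1/(1 − R²) for each predictor regressed on the others. The function names (`pearson`, `invert`, `vif`) and the toy values are hypothetical and are not the study's software or data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity on the right-hand side.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return {name: VIF} for a dict mapping predictor name -> list of values."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    # The j-th diagonal entry of the inverse correlation matrix is VIF_j.
    return {name: inv[i][i] for i, name in enumerate(names)}


# Hypothetical toy values for illustration only (not study data).
toy = {
    "age": [70, 65, 80, 75, 60, 68],
    "weight": [72, 80, 60, 66, 90, 75],
    "height": [165, 178, 158, 160, 182, 170],
}
for name, value in vif(toy).items():
    print(f"{name}: VIF = {value:.2f}")
```

Adding a derived variable such as BMI to `toy` would be expected to inflate the VIF values of weight and height, which is the redundancy the exclusion of derived variables is meant to avoid.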
Since CT-derived variables (skeletal muscle area and average skeletal muscle radiodensity) were shown to provide independent information relative to the primary MUST variables (weight, height, and age), we evaluated each variable individually to determine its ability to identify patients at risk of poor outcomes, as defined in our study. Receiver Operating Characteristic (ROC) curve analyses were conducted for each variable, and the Area Under the Curve (AUC) was calculated. The optimal cut-off point for each variable was defined as the point with the smallest Euclidean distance to (0,1) on the ROC curve. The sensitivity and specificity at that point were then determined. The results are presented in Table 3 and Table 4. According to the obtained results, muscle radiodensity (in HU) demonstrated the highest overall discriminative capacity, with AUC values ranging from 0.620 to 0.692. The lowest sensitivity (57.89%) was observed in the analysis conducted for the non-ERAS group based on hospital stay, with a specificity of 78.85% and an optimal cut-off point of 34.46 HU. In contrast, the highest sensitivity (70.83%) was obtained in the analysis performed for the ERAS group according to hospital stay, with an optimal cut-off point of 36.99 HU.
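The cut-off rule described above (smallest Euclidean distance to the point (0,1) on the ROC curve) can be sketched as follows. This is a minimal illustration, not the study's code: it assumes that values at or below the threshold flag a patient as at risk (the direction can be flipped with the hypothetical `low_is_positive` parameter), and the toy values are invented.

```python
from math import hypot


def sens_spec(values, outcomes, threshold, low_is_positive=True):
    """Sensitivity and specificity of a threshold rule against binary outcomes."""
    tp = fn = tn = fp = 0
    for v, o in zip(values, outcomes):
        flagged = v <= threshold if low_is_positive else v > threshold
        if o and flagged:
            tp += 1
        elif o:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def closest_to_corner(values, outcomes, low_is_positive=True):
    """Return (threshold, sensitivity, specificity) closest to the ROC corner (0, 1)."""
    best = None
    for t in sorted(set(values)):
        se, sp = sens_spec(values, outcomes, t, low_is_positive)
        # The ROC point is (1 - specificity, sensitivity); the ideal corner is (0, 1).
        d = hypot(1 - sp, 1 - se)
        if best is None or d < best[0]:
            best = (d, t, se, sp)
    return best[1:]


# Hypothetical toy values for illustration only (not study data).
hu = [28, 30, 33, 35, 36, 38, 40, 42, 45, 48]
poor_outcome = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
print(closest_to_corner(hu, poor_outcome))
```

Only observed values are tried as candidate thresholds here; a statistics package may interpolate between adjacent values, which could explain non-integer cut-offs such as those reported.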
To proceed with the analysis, it was necessary to identify a single cut-off point for the muscle HU variable and use the same value in all scenarios to allow an objective comparison. To this end, every integer value from 34 to 41 HU was evaluated as a potential threshold. For each cut-off point, we assessed its ability to identify patients with poorer outcomes (in terms of both hospital stay and postoperative complications) in the ERAS and non-ERAS groups. Sensitivity and specificity values were calculated for each threshold, considering both prognostic outcomes in both groups. Mean values were also included to support global threshold selection.
The results presented in Table 5 indicate that lowering the HU threshold for muscle increases sensitivity for detecting patients at higher risk (in terms of both hospital stay and postoperative complications) while decreasing specificity. Based on the average values shown, 37 HU was selected as the optimal cut-off point, offering a trade-off between sensitivity (61.57%) and specificity (66.33%). These values reflect only moderate screening performance, although it is noticeably better than that of the MUST score.
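The threshold evaluation described above can be sketched as a simple sweep that, for each integer cut-off from 34 to 41 HU, computes sensitivity and specificity in every scenario and averages them. This is a hedged illustration only: the scenario names, the toy values, and the `low_is_positive` convention (values at or below the threshold flag a patient) are assumptions, and the study's criterion for choosing among the averaged values is not reproduced.

```python
def sens_spec(values, outcomes, threshold, low_is_positive=True):
    """Sensitivity and specificity of a threshold rule against binary outcomes."""
    tp = fn = tn = fp = 0
    for v, o in zip(values, outcomes):
        flagged = v <= threshold if low_is_positive else v > threshold
        if o and flagged:
            tp += 1
        elif o:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def sweep(scenarios, thresholds, low_is_positive=True):
    """Map each threshold to (mean sensitivity, mean specificity) over all scenarios."""
    table = {}
    for t in thresholds:
        pairs = [sens_spec(v, o, t, low_is_positive) for v, o in scenarios.values()]
        mean_se = sum(p[0] for p in pairs) / len(pairs)
        mean_sp = sum(p[1] for p in pairs) / len(pairs)
        table[t] = (mean_se, mean_sp)
    return table


# Hypothetical toy scenarios (HU values, binary poor-outcome flags); not study data.
scenarios = {
    "group_A_stay": ([30, 36, 38, 44], [1, 1, 0, 0]),
    "group_B_complications": ([33, 35, 39, 41], [1, 0, 1, 0]),
}
for t, (se, sp) in sweep(scenarios, range(34, 42)).items():
    print(f"{t} HU: mean sensitivity = {se:.2f}, mean specificity = {sp:.2f}")
```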
To explore the potential impact of this threshold-based screening method on clinical outcomes, patients were classified into two subgroups (HU ≤ 37 vs. HU > 37) in both the non-ERAS and ERAS cohorts. The previously described clinical, nutritional, and surgical variables were then compared between these subgroups. The corresponding findings are presented in Table 6. In the non-ERAS group, a screening method based on muscle radiodensity (HU) would have enabled clearer distinctions in clinical outcomes. Patients with HU ≤ 37 had a significantly shorter surgery time (median: 236 minutes, IQR: 216.0–277.0) compared to those with HU > 37 (median: 262.5 minutes, IQR: 219.5–316.25; p = 0.037), as well as fewer postoperative complications overall (15% vs. 33%; p = 0.036). They also experienced significantly fewer severe postoperative complications (Clavien-Dindo ≥ 3) (2% vs. 14%; p = 0.016) and showed a trend toward reduced nasogastric tube aspiration (10% vs. 23%; p = 0.09). In contrast, no significant differences emerged when using the MUST score, the SARC-F, or the GLIM criteria, suggesting limited screening capacity for these methods in this context. A significant difference in ECOG performance status was observed (p = 0.017), indicating a higher ECOG value for patients with HU ≤ 37. Patients with HU > 37 tended to be younger (median: 68.2 years, IQR: 61.0–76.68) compared to those with HU ≤ 37 (median: 76.3 years, IQR: 66.6–82.6; p = 0.001). Notably, there were no significant differences in cancer stage, metastasis status, or sex distribution, indicating that the two subpopulations were comparable in these respects.
It is important to note that no statistically significant differences in hospital length of stay were observed between the two groups defined by muscle radiodensity (median: 4 days, IQR: 3–12 for patients with HU ≤ 37 vs. median: 4 days, IQR: 3–5 for those with HU > 37; p = 0.73). This finding can be attributed to the shape of the distributions. While the minimum hospital stay in both groups is 3 days, the maximum varies considerably, leading to a non-normal and skewed distribution. Although both groups share the same median (4 days), indicating that 50% of the patients were discharged within this time frame, a substantial difference is observed in the third quartile (Q3): 12 days in the HU ≤ 37 group versus 5 days in the HU > 37 group. This implies that approximately 25% of the patients with lower muscle radiodensity (HU ≤ 37) remained hospitalized for at least 12 days. Although this difference is not statistically significant when comparing the overall distributions, it may still reflect a clinically relevant impact.
In the ERAS group, the comparison was less straightforward. Although there were no statistically significant differences in cancer stage or metastatic status, a significant difference in sex distribution suggests that these subgroups may not be fully comparable. Furthermore, between the time when the CT image was obtained (to calculate muscle HU values) and the date of surgery, the patients underwent the ERAS intervention, potentially influencing outcomes that would otherwise have remained unchanged. Despite these considerations, patients with HU ≤ 37 exhibited a significantly longer hospital stay (median: 4.0 days, IQR: 3.0–7.5 vs. median: 8 days, IQR: 4.5–13.5; p = 0.008) and a higher overall rate of postoperative complications (35% vs. 62%; p = 0.043). Although these differences did not reach statistical significance, there was a notable trend toward longer surgery duration (median: 253 minutes, IQR: 217.5–295.5 vs. median: 282.0 minutes, IQR: 248.75–330.0; p = 0.089) and increased nasogastric tube aspiration (18% vs. 41%; p = 0.064) in the HU ≤ 37 subgroup. No significant differences were found between the subgroups in MUST, SARC-F, or ECOG. However, the GLIM criteria identified a significantly higher prevalence of malnutrition in patients with HU > 37 compared to those with HU ≤ 37 (74% vs. 46%, p = 0.034).
Given the observed inconsistency between higher CT-based muscle quality (HU > 37) and increased rates of malnutrition as defined by the GLIM criteria, we conducted a post hoc analysis to explore the possible sources of this discrepancy. In our cohort, the GLIM phenotypic criterion was defined by the presence of at least one of the following: unintentional weight loss, low BMI, or low fat-free mass index (FFMI) as measured by bioimpedance.
We first evaluated the association between each individual GLIM component and CT-derived muscle radiodensity (dichotomized as >37 HU vs. ≤37 HU), using chi-squared tests. As shown in Table 7, no statistically significant association was found between muscle HU classification and weight loss, low BMI, or low FFMI considered separately.
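For reference, a chi-squared test of association on a 2×2 table (HU class by presence of a GLIM component) can be sketched as below. This is a minimal illustration under stated assumptions: it uses the Pearson statistic without continuity correction (whether a correction was applied in the study is not stated), the function name `chi2_2x2` is hypothetical, and the counts are invented.

```python
from math import erfc, sqrt


def chi2_2x2(table):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table.

    table is [[a, b], [c, d]]; returns (statistic, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))


# Hypothetical counts for illustration only (not study data):
# rows = HU <= 37, HU > 37; columns = component present, component absent.
stat, p = chi2_2x2([[12, 20], [15, 24]])
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

When expected cell counts are small, an exact test would usually be preferred over this approximation.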
Additionally, we compared continuous CT-derived variables (muscle HU, muscle area, and SMI) between patients classified as malnourished or not by each GLIM component. These results are presented in Table 8.
This analysis revealed that the GLIM classification, particularly when driven by BMI and FFMI, tends to align more closely with CT-derived measures of muscle quantity (namely muscle cross-sectional area and SMI) than with radiodensity.
Specifically, patients classified as malnourished according to the global GLIM criteria showed significantly lower muscle area (106.0 vs. 115.6 cm², p = 0.023) and a trend toward lower SMI (p = 0.058), while no significant difference was observed in muscle HU (p = 0.072). This pattern was consistently observed across the weight loss and FFMI criteria, where differences in radiodensity were not significant, but muscle area and SMI were lower among malnourished patients. In contrast, the BMI-based criterion revealed a paradoxical finding: patients categorized as malnourished by low BMI exhibited higher muscle HU values (46.4 vs. 39.0, p = 0.012) despite having lower muscle area and SMI. This likely reflects the limitations of BMI in oncology populations, where weight loss can be driven by fat mass reduction without concurrent deterioration of muscle quality.
Collectively, these findings support the notion that GLIM criteria and CT-derived muscle radiodensity capture distinct and complementary dimensions of nutritional and functional status. GLIM emphasizes quantity-based parameters, while muscle HU reflects tissue quality, which is associated with myosteatosis and has independent prognostic implications. This underscores the value of incorporating opportunistic CT imaging into nutritional assessment frameworks, particularly in oncologic settings where traditional markers may not adequately reflect muscle integrity.