3. Results
We conducted a detailed analysis of the data collected from the 237 cancer patients who were diagnosed with SARS-CoV-2. This process involved examining a range of clinical, radiological, and demographic variables to gain insights into the interplay between their oncological conditions and the progression of COVID-19. By focusing on this specific cohort, we aimed to identify patterns and factors that contribute to the severity of the disease in these high-risk patients, providing a foundation for predictive modeling and improved clinical decision-making.
The study evaluated the performance of various ML models in predicting disease severity in this category of patients. KNN and ensemble bagging were the top-performing models, achieving accuracy rates of 100% and 98.3%, respectively, together with high AUC scores (Figure 1, Figure 2).
Error-correcting output codes (ECOC) is a technique that handles multi-class classification by decomposing it into multiple binary classification tasks and combining their results. K-nearest neighbors (KNN) is a classification algorithm that predicts the label of a sample from the labels of its nearest neighbors in the feature space.
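The combination of ECOC and KNN described above can be illustrated with a minimal sketch. It assumes a one-vs-all code matrix and Hamming-distance decoding, neither of which is stated in the text; the function names and the toy training points are hypothetical and are not study data.

```python
import math
from collections import Counter

# Assumed one-vs-all coding design: rows = severity classes 0..3,
# columns = binary learners. The coding design actually used is not given.
CODE_MATRIX = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points nearest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def ecoc_predict(train_X, train_y, x, k=3):
    """Run one binary KNN per code-matrix column, then decode by Hamming distance."""
    bits = []
    for col in range(len(CODE_MATRIX[0])):
        # Relabel every training sample with the bit its class has in this column.
        binary_y = [CODE_MATRIX[c][col] for c in train_y]
        bits.append(knn_predict(train_X, binary_y, x, k))
    # The predicted class is the one whose codeword is closest to the bit vector.
    distances = [sum(b != cw for b, cw in zip(bits, codeword)) for codeword in CODE_MATRIX]
    return distances.index(min(distances))

# Hypothetical, well-separated 2-D points (three per class).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5),
           (10, 0), (10, 1), (11, 0), (0, 10), (1, 10), (0, 11)]
train_y = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]

print(ecoc_predict(train_X, train_y, (5.2, 5.1)))   # query near the Class 1 cluster
print(ecoc_predict(train_X, train_y, (0.2, 10.3)))  # query near the Class 3 cluster
```

With well-separated clusters such as these, every binary learner votes unanimously and the decoding step recovers the cluster's class exactly.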
In the confusion matrix, rows represent the ‘True Classes’ of the samples, columns represent the ‘Predicted Classes’ by the model, diagonal values represent correctly classified instances for each class and off-diagonal values represent misclassifications.
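As a minimal sketch of this layout, the following builds a confusion matrix from paired label lists. The six labels are hypothetical and serve only to show where correct and incorrect predictions land.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes=4):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

# Hypothetical labels for six patients (not study data).
y_true = [0, 1, 1, 2, 3, 3]
y_pred = [0, 1, 2, 2, 3, 2]

cm = confusion_matrix(y_true, y_pred)
correct = sum(cm[i][i] for i in range(4))   # diagonal entries
errors = sum(map(sum, cm)) - correct        # off-diagonal entries

for row in cm:
    print(row)
print("correct:", correct, "errors:", errors)
```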
A class-by-class breakdown of the matrix follows:
1. Class 0: 14 instances of Class 0 were correctly predicted as Class 0. No instances of Class 0 were misclassified as other classes.
2. Class 1: 51 instances of Class 1 were correctly classified as Class 1. No instances of Class 1 were misclassified; the model successfully distinguished these instances from other classes.
3. Class 2: 94 instances of Class 2 were correctly classified as Class 2. No instances of Class 2 were misclassified as any other class, which suggests that the model performed well in identifying Class 2.
4. Class 3: 76 instances of Class 3 were correctly classified as Class 3. As with the other classes, no instances were misclassified.
This further supports the model's accuracy and reliability in distinguishing between the defined severity levels.
The results from the ECOC with KNN model demonstrate its exceptional performance in classifying all instances correctly across all defined categories (Class 0, Class 1, Class 2, and Class 3).
These results indicate that the ECOC framework combined with the KNN algorithm is highly effective for this dataset. KNN, being a non-parametric model, classifies data based on proximity to training examples in feature space. The lack of misclassification across all classes suggests the dataset’s features are well-separated, and the model successfully captures the relationships between them.
Such flawless performance, while impressive, also calls for scrutiny. Perfect accuracy could point to potential overfitting, especially if the dataset is relatively small or lacks external validation. However, within the scope of the study, these results highlight the model's suitability for clinical applications where precision in predicting disease severity is crucial.
Both top-performing models, KNN and ensemble bagging, demonstrated superior ability to correctly identify severe cases, which were characterized by factors such as ICU admission, the need for mechanical ventilation, and the presence of metastatic cancer. These findings underscore the capability of these advanced algorithms to capture complex relationships between multiple clinical and demographic variables, making them valuable tools for critical decision-making in healthcare.
The Decision Tree model also showed promising results, with an accuracy of 82.55%. While not as precise as KNN or ensemble bagging, it still effectively classified patient outcomes. Decision Trees offer the advantage of interpretability, providing clinicians with a transparent understanding of how predictions are made. This feature makes them an attractive choice for integrating ML tools into clinical workflows, where explainability is often as important as predictive accuracy (Figure 3).
In this case, the model correctly predicted ‘Class 0’ (mild severity) for 7 patients and ‘Class 1’ for 40 patients. It did well for ‘Class 2’ (85 correct predictions) and ‘Class 3’ (62 correct predictions). Among the errors, it predicted ‘Class 0’ when the true class was ‘Class 1’ three times, and ‘Class 2’ instead of ‘Class 3’ 11 times. The model performs well overall, and misclassifications are comparatively infrequent.
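The reported Decision Tree accuracy can be cross-checked from these diagonal counts. The sketch below assumes that the class sizes are the per-class totals given for the other confusion matrices (14, 51, 94, and 76 instances); the variable names are illustrative.

```python
# Correct predictions per class for the Decision Tree, as reported in the text.
correct_per_class = {0: 7, 1: 40, 2: 85, 3: 62}

# Assumption: class sizes equal the per-class totals reported for the other matrices.
class_sizes = {0: 14, 1: 51, 2: 94, 3: 76}

total = sum(class_sizes.values())
accuracy = sum(correct_per_class.values()) / total

# Per-class recall: share of each true class that was predicted correctly.
recall = {c: correct_per_class[c] / class_sizes[c] for c in class_sizes}

print(f"accuracy = {accuracy:.2%}")  # 194 / 235 -> 82.55%
for c, r in recall.items():
    print(f"Class {c}: recall = {r:.2f}")
```

Under this assumption the overall accuracy matches the reported 82.55%, while per-class recall is lowest for Class 0 (7 of 14).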
The receiver operating characteristic (ROC) curve shows how well the model separates each severity level (Classes 0, 1, 2, 3) from the others, and the area under the curve (AUC) summarizes this ability in a single value. Here, AUC is high for all classes: Class 0: 0.95, Class 1: 0.96, Class 2: 0.93, Class 3: 0.95. This indicates the model performs well across all severity levels, with high sensitivity (True Positive Rate) and few errors (Figure 4).
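One way to read a per-class AUC is as the probability that a randomly chosen patient of that class receives a higher score than a randomly chosen patient of any other class. The sketch below computes this one-vs-rest quantity directly from pairwise comparisons; the scores and labels are hypothetical, and the source does not state how its AUC values were computed.

```python
def auc_one_vs_rest(scores, labels, positive_class):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == positive_class]
    neg = [s for s, y in zip(scores, labels) if y != positive_class]
    if not pos or not neg:
        raise ValueError("AUC is undefined without both positive and negative samples")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical Class 3 scores for eight patients (not study data).
labels = [3, 3, 3, 2, 2, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]

auc_class3 = auc_one_vs_rest(scores, labels, positive_class=3)
print(f"Class 3 AUC = {auc_class3:.3f}")  # 14 of 15 pairs ranked correctly
```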
The Decision Tree model performed well, especially for ‘Class 2’ and ‘Class 3’, but had some misclassifications between neighboring classes. The model has strong predictive ability for all severity levels, with high AUC scores close to 1.
In contrast, the performance of SVM and Discriminant Analysis was moderate, with accuracy rates of 64.26% and 65.11%, respectively. These models struggled to match the predictive precision of the top performers, likely due to limitations in handling the non-linear relationships inherent in the dataset. While they provided some utility, their lower accuracy and AUC scores suggest they may be less suitable for tasks requiring high reliability in identifying severe cases.
The ECOC model with an SVM classifier, applied across the four severity classes, was proficient at identifying intermediate and severe cases but left room for improvement in minimizing misclassification between adjacent severity levels (Figure 5).
The matrix presents the true versus predicted classifications, providing an insight into the model's strengths and limitations:
1. Class 0 (Mild cases): None of the 14 actual mild cases was correctly classified: six were classified as Class 1, six as Class 2, and two as Class 3. These results indicate poor recall for Class 0, with confusion spread across both adjacent and more distant classes.
2. Class 1 (Moderate cases): For Class 1, 22 instances were correctly predicted as belonging to this category. However, a larger portion (28) was misclassified as Class 2, while one was classified as Class 3. This confusion indicates that features separating moderate and severe cases might overlap, leading to classification challenges.
3. Class 2 (Severe cases): The model excelled in identifying severe cases, with 83 instances correctly classified as Class 2. Nevertheless, some overlap occurred, with seven cases underestimated as Class 1 and four overestimated as Class 3. This suggests that while the model distinguishes severe cases well, there is a tendency to confuse them with adjacent categories.
4. Class 3 (Critical cases): Class 3 had 46 correct predictions, but a notable 29 instances were misclassified as Class 2, and one as Class 1. This indicates that while the model identifies most critical cases, it often struggles with the distinction between critical and severe ones.
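The counts in this breakdown are enough to reconstruct the full ECOC-SVM confusion matrix and derive per-class recall and precision. The sketch below is a reader-side reconstruction, not the authors' code: the Class 0 column is set to zero because the reported counts in each row already sum to the class sizes (14, 51, 94, 76).

```python
# Rows = true class, columns = predicted class (0 mild .. 3 critical).
# Reconstructed from the counts in the text; the Class 0 column is inferred to be zero.
svm_cm = [
    [0, 6, 6, 2],
    [0, 22, 28, 1],
    [0, 7, 83, 4],
    [0, 1, 29, 46],
]

def per_class_metrics(cm):
    """Return {class: (recall, precision)}; NaN where the denominator is zero."""
    n = len(cm)
    metrics = {}
    for c in range(n):
        true_pos = cm[c][c]
        actual = sum(cm[c])                          # row sum: samples of class c
        predicted = sum(cm[r][c] for r in range(n))  # column sum: predictions of class c
        recall = true_pos / actual if actual else float("nan")
        precision = true_pos / predicted if predicted else float("nan")
        metrics[c] = (recall, precision)
    return metrics

accuracy = sum(svm_cm[i][i] for i in range(4)) / sum(map(sum, svm_cm))
metrics = per_class_metrics(svm_cm)

print(f"accuracy = {accuracy:.2%}")  # 151 / 235 -> 64.26%
for c, (recall, precision) in metrics.items():
    print(f"Class {c}: recall = {recall:.2f}, precision = {precision:.2f}")
```

The reconstructed matrix reproduces the reported 64.26% accuracy. Class 0 precision comes out as NaN because, under this reconstruction, the model never predicts Class 0.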
The AUC values from ROC Curve Analysis collectively validate the ECOC-SVM framework's robust discriminatory capacity. However, the classification of boundary classes (e.g., between moderate and severe, or severe and critical) requires further optimization to reduce overlap and enhance predictive precision (Figure 6).
1. Class 0 (AUC = 0.82): This indicates a reasonable ability to rank mild cases, although the confusion matrix shows that none of them was assigned to Class 0.
2. Class 1 (AUC = 0.83): The slightly higher AUC reflects better performance in predicting moderate cases compared to mild ones.
3. Class 2 (AUC = 0.81): This score highlights the model's overall reliability in identifying severe cases, despite occasional misclassification with neighboring classes.
4. Class 3 (AUC = 0.88): The highest AUC value demonstrates the model's strength in recognizing critical cases, aligning with its ability to identify most ICU-level patients accurately.
The findings highlight the potential of the ECOC-SVM model in addressing critical challenges of severity classification in healthcare. While the high AUC scores confirm its applicability, the misclassification patterns suggest areas for refinement. Adjusting features, incorporating additional clinical parameters, or experimenting with alternative encoding schemes within the ECOC framework may further enhance performance.
These results underscore the importance of leveraging ML models in medical decision-making. By reliably predicting severity levels, this approach could support clinicians in prioritizing high-risk patients and allocating resources more effectively, contributing to better outcomes in a high-stakes clinical setting.
Concerning Discriminant Analysis, the model demonstrates strong performance for Class 2, evidenced by its high number of correct predictions and few misclassifications. There is noticeable confusion between neighboring classes (e.g., Class 0 misclassified as Class 1, Class 1 misclassified as Class 2, and Class 3 misclassified as Class 2). These misclassifications may stem from overlapping features between adjacent severity classes (Figure 7).
1. Class 0: The model misclassified all 14 instances: 7 as Class 1, 6 as Class 2, and 1 as Class 3, so no mild case was identified correctly.
2. Class 1: Among Class 1, the model correctly classified 25 cases, but misclassified 25 instances as Class 2 and 1 instance as Class 3. This even split between correct predictions and Class 2 errors highlights the difficulty in distinguishing Class 1 from Class 2.
3. Class 2: The model showed strong performance for Class 2, correctly classifying 82 instances. Only 9 cases were incorrectly labeled as Class 1, and 3 as Class 3. This consistency demonstrates the model's ability to identify Class 2 effectively.
4. Class 3: Class 3 had 46 instances correctly predicted. However, 1 instance was misclassified as Class 1, and 29 cases were labeled as Class 2. These errors highlight a difficulty in discerning Class 3 from Class 2.
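To quantify how much of this confusion falls between neighboring classes, the sketch below splits the off-diagonal counts by the distance between true and predicted class. The matrix is reconstructed from the counts listed above, with the Class 0 column inferred to be zero because each row already sums to its class size; the helper name is hypothetical.

```python
# Rows = true class, columns = predicted class, reconstructed from the text.
da_cm = [
    [0, 7, 6, 1],
    [0, 25, 25, 1],
    [0, 9, 82, 3],
    [0, 1, 29, 46],
]

def split_errors(cm):
    """Count errors between adjacent classes and errors two or more levels apart."""
    adjacent = distant = 0
    for true_class, row in enumerate(cm):
        for predicted_class, count in enumerate(row):
            gap = abs(true_class - predicted_class)
            if gap == 1:
                adjacent += count
            elif gap > 1:
                distant += count
    return adjacent, distant

adjacent, distant = split_errors(da_cm)
total = sum(map(sum, da_cm))
accuracy = (total - adjacent - distant) / total

print(f"accuracy = {accuracy:.2%}")  # 153 / 235 -> 65.11%
print(f"adjacent-class errors: {adjacent}, distant errors: {distant}")
print(f"share of errors between neighbors: {adjacent / (adjacent + distant):.0%}")
```

Under this reconstruction, 73 of the 82 errors (about 89%) are off by exactly one severity level, which is consistent with the overlap between adjacent classes noted in the text.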
Regarding the ROC Curve Analysis, the overall AUC values (ranging from 0.80 to 0.88) demonstrate that the ECOC model with Discriminant Analysis has reasonable discriminatory power across all classes. Class 3 is the best-predicted category based on its high AUC score, but its confusion with Class 2 in the matrix suggests further refinement is necessary (Figure 8).
1. Class 0: AUC = 0.84. This indicates the model has a good, though not perfect, ability to distinguish Class 0 from other classes.
2. Class 1: AUC = 0.84. Similar to Class 0, the model has decent discrimination for Class 1, despite the confusion noted in the confusion matrix.
3. Class 2: AUC = 0.80. Although Class 2 had strong results in the confusion matrix, its slightly lower AUC reflects challenges in consistently distinguishing it from other classes in certain scenarios.
4. Class 3: AUC = 0.88. The highest AUC among the four classes, indicating the model is most effective in identifying Class 3 compared to other classes.
The ECOC model with Discriminant Analysis performs best for Class 2 and, to a lesser extent, Class 3. While the overall accuracy is moderate (65.11%), the misclassifications, especially between neighboring classes, highlight areas for potential improvement. The relatively high AUC values across all classes indicate useful discriminatory power, but adjustments to better differentiate between closely related classes could further enhance performance.
Naive Bayes was the poorest-performing model, achieving an accuracy of only 40%. This outcome can be attributed to the model’s assumption of feature independence, which does not align with the complex, interdependent nature of the clinical, radiological, and demographic factors influencing disease severity in cancer patients with COVID-19. The poor performance highlights the limitations of overly simplistic approaches when applied to multifaceted datasets (Figure 9).
1. Class 0: All 14 instances were misclassified as Class 2, so no Class 0 case was identified.
2. Class 1: All 51 instances were misclassified as Class 2, repeating the confusion between these classes seen with the previous models.
3. Class 2: All 94 instances were correctly identified as Class 2, with no misclassification.
4. Class 3: All 76 instances were misclassified as Class 2.
The model heavily favors Class 2, with every instance of the other classes (0, 1, and 3) assigned to Class 2. Class 2's recall is therefore perfect, but its precision is only 40% (94 of 235 predictions), which suggests a collapse onto Class 2 or poorly defined decision boundaries for the other classes. The inability to predict Classes 0, 1, and 3 correctly (complete misclassification of all three) indicates weaknesses in feature representation or in the Naive Bayes assumptions for these classes.
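The effect of assigning every sample to Class 2 can be checked with a short calculation. The sketch below rebuilds the Naive Bayes confusion matrix from the class sizes given in the breakdown (14, 51, 94, 76) and computes recall, precision, and accuracy for Class 2; it is an illustration of the reported counts, not the authors' code.

```python
class_sizes = [14, 51, 94, 76]

# Every sample is predicted as Class 2, so only column 2 is non-zero.
nb_cm = [[0, 0, size, 0] for size in class_sizes]

total = sum(class_sizes)
true_pos_2 = nb_cm[2][2]
recall_2 = true_pos_2 / sum(nb_cm[2])                    # 94 / 94
precision_2 = true_pos_2 / sum(row[2] for row in nb_cm)  # 94 / 235
accuracy = sum(nb_cm[i][i] for i in range(4)) / total    # 94 / 235

print(f"Class 2 recall    = {recall_2:.2f}")
print(f"Class 2 precision = {precision_2:.2f}")
print(f"accuracy          = {accuracy:.2%}")
```

On these counts, recall for Class 2 is 1.00 while precision and overall accuracy both equal 94/235 = 40%, the same result a classifier that always predicts the largest class would obtain.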
The ROC curves appear as flat lines, and the AUC values are reported as NaN (Not a Number).
A possible explanation is that the Naive Bayes classifier did not generate meaningful probability estimates for each class, or that issues in the dataset, such as imbalanced classes, prevented the computation of a meaningful True Positive Rate and False Positive Rate (Figure 10).
In short, the confusion matrix shows that the Naive Bayes model predicts Class 2 regardless of the input data, so Classes 0, 1, and 3 are entirely misclassified as Class 2. This could point to a class imbalance in the data or to the inability of Naive Bayes to distinguish features for the other classes in the dataset. The flat ROC curve and NaN AUC values indicate that the model's predictions are not probabilistically calibrated or meaningful, likely due to poor feature separability under the Naive Bayes assumption. The ECOC with Naive Bayes approach appears inadequate in this case, especially for multi-class tasks, and calls for a re-evaluation of the underlying assumptions and preprocessing steps.
In summary, the study demonstrates the potential of advanced ML models, particularly KNN and ensemble bagging, to effectively predict disease severity in cancer patients with COVID-19. These results provide critical insights into the selection of ML tools for clinical applications, emphasizing the importance of choosing models capable of handling complex relationships to ensure accurate and reliable predictions.