Submitted: 20 November 2024
Posted: 22 November 2024
Abstract

Keywords:
1. Introduction
2. Materials and Methods
2.1. Data Acquisition and Preprocessing
2.2. Radiomic Feature Extraction
2.2.1. First Order Statistics
- Mean Intensity: The mean value of the voxels within the ROI, which largely reflects the average density of the tissue.
- Skewness: Characterizes the asymmetry of the voxel intensity distribution about its mean and may therefore reflect the heterogeneity of the tumor.
- Kurtosis: Describes the 'peakedness' of the voxel intensity distribution, that is, whether the intensities are concentrated near the mean or spread out into the tails.
- Energy: The sum of the squared voxel values, which may be related to the aggressiveness of the tumor or its metabolic activity.
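The four first-order statistics listed above can be illustrated with a minimal NumPy sketch. The function name `first_order_features`, the non-excess kurtosis convention, and the choice to return 0 for a perfectly flat ROI are assumptions made for illustration; they are not taken from the study's pipeline.

```python
import numpy as np


def first_order_features(image, mask):
    """Compute four first-order statistics over the voxels inside a binary ROI.

    `image` and `mask` are arrays of the same shape; non-zero mask entries
    mark voxels that belong to the ROI. (Hypothetical helper for illustration.)
    """
    voxels = np.asarray(image, dtype=float)[np.asarray(mask, dtype=bool)]
    if voxels.size == 0:
        raise ValueError("the ROI mask selects no voxels")

    mean = voxels.mean()
    centered = voxels - mean
    variance = np.mean(centered ** 2)

    if variance == 0:
        # Flat ROI: the standardized moments are undefined; this sketch
        # returns 0 by convention (an assumption).
        skewness = 0.0
        kurtosis = 0.0
    else:
        # Third and fourth standardized moments (kurtosis is non-excess,
        # so a normal distribution would give a value of 3).
        skewness = np.mean(centered ** 3) / variance ** 1.5
        kurtosis = np.mean(centered ** 4) / variance ** 2

    energy = np.sum(voxels ** 2)  # sum of squared voxel values
    return {
        "mean": float(mean),
        "skewness": float(skewness),
        "kurtosis": float(kurtosis),
        "energy": float(energy),
    }


if __name__ == "__main__":
    # Toy 2D "image"; the mask excludes the bottom-right voxel.
    toy_image = np.array([[1.0, 2.0], [3.0, 99.0]])
    toy_mask = np.array([[1, 1], [1, 0]])
    print(first_order_features(toy_image, toy_mask))
```

Because energy depends on the absolute intensity scale, its value changes if the image is shifted or rescaled before extraction.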
2.2.2. Shape-Based Features
- Sphericity: A measure of how spherical (round) the tumor is, with values approaching 1 indicating a nearly perfect sphere. Lower sphericity values are more characteristic of highly aggressive and invasive tumors.
- Compactness: Indicates how closely the tumor shape approaches a sphere rather than an elongated or irregular form, and can thus be an indication of its invasive potential.
- Surface Area to Volume Ratio: Compares the surface area of the tumor with its volume. A higher ratio suggests a more irregular surface and growth pattern, which is often associated with malignancy.
- Elongation: Measures how stretched the tumor shape is relative to a perfect sphere and may give clues to its infiltration into surrounding tissues.
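As a rough illustration of the shape descriptors above, the following sketch uses commonly cited closed-form definitions. The function names are hypothetical, and the formulas are assumptions about typical radiomics conventions rather than a statement of what the study computed: sphericity as π^(1/3)(6V)^(2/3)/A, compactness as 36πV²/A³, and elongation as the square root of the ratio of the two largest principal-axis variances.

```python
import numpy as np


def shape_features(volume, surface_area):
    """Sphericity, compactness and surface-area-to-volume ratio of an ROI.

    `volume` and `surface_area` are assumed to be precomputed (for example
    from a surface mesh of the segmentation), in consistent units.
    """
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface area must be positive")
    sphericity = np.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / surface_area
    compactness = 36.0 * np.pi * volume ** 2 / surface_area ** 3
    return {
        "sphericity": float(sphericity),  # 1 for a perfect sphere
        "compactness": float(compactness),  # 1 for a perfect sphere
        "surface_to_volume": float(surface_area / volume),
    }


def elongation(mask):
    """Elongation from the principal-axis variances of the ROI voxel coordinates.

    Returns a value in [0, 1]; values near 1 mean the two largest principal
    axes have similar extent, lower values mean a more stretched ROI.
    """
    coords = np.argwhere(np.asarray(mask, dtype=bool)).astype(float)
    if coords.shape[0] < 2:
        raise ValueError("need at least two ROI voxels")
    eigenvalues = np.sort(np.linalg.eigvalsh(np.cov(coords, rowvar=False)))[::-1]
    if eigenvalues[0] <= 0:
        raise ValueError("degenerate ROI")
    # Clip tiny negative values caused by floating-point round-off.
    return float(np.sqrt(max(eigenvalues[1], 0.0) / eigenvalues[0]))


if __name__ == "__main__":
    radius = 2.0
    sphere = shape_features(4.0 / 3.0 * np.pi * radius ** 3, 4.0 * np.pi * radius ** 2)
    cube = shape_features(1.0, 6.0)
    print(sphere, cube)
    print(elongation(np.ones((8, 4, 2))))
```

With these definitions an ideal sphere scores 1 for both sphericity and compactness, while a cube scores lower, which matches the interpretation given in the list.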
2.2.3. Texture-Based Features
- GLCM: Gray-Level Co-occurrence Matrix (GLCM) features describe how often pairs of voxel intensities occur at a predefined spatial offset. GLCM entropy, for example, quantifies the complexity of the variation in voxel intensities; higher values indicate greater heterogeneity, which is usually associated with malignancy. Other significant GLCM features include contrast, which describes the difference between high and low intensities, and correlation, which measures the linear dependence between the intensities of voxels.
- GLRLM: The Gray-Level Run Length Matrix (GLRLM) records the lengths of runs of consecutive voxels that share the same intensity value along a given direction. Related features include Short Run Emphasis, which reflects the presence of small homogeneous regions inside the tumor, and Long Run Emphasis, which reflects extended homogeneous regions. These features may be relevant for identifying fibrotic or necrotic regions inside the tumor.
- GLSZM: The Gray-Level Size Zone Matrix (GLSZM) is very similar to the GLRLM in that it quantifies regions of identical intensity, but it takes no directional information into account. Important features derived from this matrix include Zone Size Non-Uniformity and Large Zone Emphasis, which may help detect large homogeneous areas, generally indicative of late disease stages.
- Wavelet features: Features extracted after applying wavelet filters to the images in order to capture texture at multiple resolutions. This multi-scale analysis helps detect subtle patterns at both fine and coarse levels of detail, offering a more nuanced understanding of tumor heterogeneity.
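To make the GLCM description above concrete, here is a minimal sketch that builds a symmetric, normalized co-occurrence matrix for a single offset on a 2D image and derives entropy and contrast from it. It is a simplification for illustration only: it assumes the image is already discretized to integer gray levels `0..levels-1`, ignores the ROI mask, and uses one direction, whereas radiomics toolkits typically aggregate several directions in 3D. The function names are hypothetical.

```python
import numpy as np


def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset.

    `image` is a 2D integer array with values in [0, levels); `offset` is the
    (row, column) displacement between the two voxels of each pair.
    """
    img = np.asarray(image)
    rows, cols = img.shape
    d_row, d_col = offset
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = img[r, c], img[r2, c2]
                counts[i, j] += 1
                counts[j, i] += 1  # count the pair in both orders (symmetric GLCM)
    total = counts.sum()
    if total == 0:
        raise ValueError("no voxel pairs exist for this offset")
    return counts / total


def glcm_entropy(p):
    """Shannon entropy (in bits) of the co-occurrence probabilities."""
    nonzero = p[p > 0]
    return float(-np.sum(nonzero * np.log2(nonzero)))


def glcm_contrast(p):
    """Contrast: squared gray-level difference weighted by co-occurrence probability."""
    i, j = np.indices(p.shape)
    return float(np.sum((i - j) ** 2 * p))


if __name__ == "__main__":
    uniform = np.zeros((3, 3), dtype=int)
    checker = np.array([[0, 1], [1, 0]])
    for name, img in (("uniform", uniform), ("checkerboard", checker)):
        p = glcm(img, levels=2)
        print(name, glcm_entropy(p), glcm_contrast(p))
```

A perfectly uniform patch gives zero entropy and zero contrast, while the checkerboard patch, where every horizontal neighbor pair differs, gives higher values for both, in line with the link between entropy and heterogeneity described above.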
2.2.4. Multi-Scale and Multi-Dimensional Feature Extraction
2.2.5. Innovative Aspects of Feature Extraction
- Manual and Semi-Automated Segmentation: The tumor ROIs were segmented through careful manual delineation by expert radiologists combined with semi-automated algorithms, to maximize precision while reducing observer variability.
- Standardization and Reproducibility: All extraction strictly followed the guidelines provided by the IBSI, with the aim of making all features reproducible across different imaging systems and acquisition protocols.
- High-Throughput Feature Extraction: PyRadiomics enables the efficient extraction of hundreds of features from a single image. High-throughput feature extraction is important for machine learning applications, especially when the volume of imaging data is large.
- Selection of the Most Informative Features: Of the 350 features extracted, those that were redundant or not relevant were removed using forward stepwise correlation analysis and test-retest variability analysis. Focusing the analysis on the most stable and informative features allowed us to minimize overfitting without reducing model accuracy.
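The redundancy-removal step in the last item can be sketched as a simple greedy correlation filter. This is only an illustration of the general idea and not necessarily the forward stepwise procedure used in the study; the function name `drop_correlated_features`, the 0.9 threshold, and the toy feature names are all assumptions.

```python
import numpy as np


def drop_correlated_features(X, names, threshold=0.9):
    """Greedy redundancy filter based on absolute Pearson correlation.

    `X` has shape (n_samples, n_features) and `names` lists the feature names
    in column order. Features are visited in order; a feature is kept only if
    its absolute correlation with every previously kept feature is below
    `threshold`. Assumes no feature is constant across samples.
    """
    X = np.asarray(X, dtype=float)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for idx in range(X.shape[1]):
        if all(corr[idx, k] < threshold for k in kept):
            kept.append(idx)
    return [names[k] for k in kept]


if __name__ == "__main__":
    # Toy data: the second column is an exact linear function of the first.
    a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    b = 2.0 * a + 1.0
    c = np.array([2.0, -1.0, 0.0, 1.0, -2.0])
    X = np.column_stack([a, b, c])
    print(drop_correlated_features(X, ["feat_a", "feat_b", "feat_c"]))
```

Because the filter is order dependent, features would normally be sorted beforehand, for example by a stability measure from the test-retest analysis, so that the more reproducible member of a correlated pair is the one retained.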
2.3. Feature Selection and Stability Analysis
2.3.1. Feature Stability and Reproducibility
2.3.2. Correlation and Dimensionality Reduction
2.3.3. Feature Selection Techniques
2.4. Machine Learning Model Development
2.4.1. Model Architecture and Choice
2.4.2. Training Process and Cross-Validation
2.4.3. Model Optimization and Loss Functions
2.4.4. Performance Metrics
2.5. Feature Importance and Interpretability
2.6. Tumor-Specific Features Relevance
3. Results
3.1. Model Performance
3.2. Radiomic Feature Importance and Model Insights
3.3. Tumor Size and Feature Relevance
3.3.1. Radiomic Feature Relevance for Small Tumors (<2 cm)
3.3.2. Radiomic Feature Relevance for Medium-Sized Tumors (2–4 cm)
3.3.3. Radiomic Feature Relevance for Large Tumors (>4 cm)
3.3.4. Comparative Analysis of Feature Importance by Tumor Size
3.4. Comparative Analysis Between Models
4. Discussion
4.1. Comparison of Model Performance with and Without Texture Features
4.2. Shape-Based Features and Tumor Progression
4.2.1. Comparative Analysis of Shape Features in Different Tumor Stages
4.2.2. Clinical Implications of Shape-Based Features in Tumor Progression
4.2.3. Stability of Selected Features
4.2.4. Impact of Feature Selection on Model Performance
4.3. Clinical Implications of Radiomic Feature Selection
4.4. Limitations and Future Directions
4.4.1. Limitations
4.4.1.1. Data Diversity and Generalizability
4.4.1.2. Feature Robustness and Reproducibility
4.4.1.3. Model Interpretability and Clinical Integration
4.4.2. Future Directions
4.4.2.1. Standardization of Imaging Protocols
4.4.2.2. Feature Robustness Across Diverse Cohorts
4.4.2.3. Development of Explainable AI (XAI) Models
4.4.2.4. Integration with Genomic and Clinical Data
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- World Health Organization. Available online: https://www.who.int/news-room/fact-sheets/detail/cancer.
- Scapicchio, C.; Gabelloni, M.; Barucci, A.; Cioni, D.; Saba, L.; Neri, E. A deep look into radiomics. Radiol. Med. 2021, 126, 1296–1311.
- Wu, L.; Lou, X.; Kong, N.; Xu, M.; Gao, C. Can quantitative peritumoral CT radiomics features predict the prognosis of patients with non-small cell lung cancer? A systematic review. Eur. Radiol. 2022, 33, 2105–2117.
- Raptis, S.; Ilioudis, C.; Theodorou, K. From pixels to prognosis: unveiling radiomics models with SHAP and LIME for enhanced interpretability. Biomed. Phys. Eng. Express 2024, 10, 035016.
- Marcilio, W.E.; Eler, D.M. From explanations to feature selection: assessing SHAP values as feature selection mechanism. In Proceedings of the 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI); IEEE: Recife/Porto de Galinhas, Brazil, 2020; pp. 340–347.
- Kalendralis, P.; et al. FAIR-compliant clinical, radiomics and DICOM metadata of RIDER, interobserver, Lung1 and head-Neck1 TCIA collections. Med. Phys. 2020, 47, 5931–5940.
- Wee, L.; Aerts, H.J.L.; Kalendralis, P.; Dekker, A. Data from NSCLC-Radiomics-Interobserver1. The Cancer Imaging Archive, 2019.
- IBSI. Available online: https://theibsi.github.io/.
- Van Griethuysen, J.J.M.; Fedorov, A.; Parmar, C.; Hosny, A.; Aucoin, N.; Narayan, V.; Beets-Tan, R.G.H.; Fillion-Robin, J.-C.; Pieper, S.; Aerts, H.J.W.L. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res. 2017, 77, e104–e107.
- Dhawan, A.P. Medical Image Analysis, 2nd ed.; IEEE Press Series on Biomedical Engineering; Wiley, IEEE Press: Hoboken, NJ, USA, 2011; ISBN 978-0-470-92289-7.
- Mall, P.K.; Singh, P.K.; Yadav, D. GLCM Based Feature Extraction and Medical X-RAY Image Classification using Machine Learning Techniques. In Proceedings of the 2019 IEEE Conference on Information and Communication Technology; IEEE: Allahabad, India, 2019; pp. 1–6.
- Koo, T.K.; Li, M.Y. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J. Chiropr. Med. 2016, 15, 155–163.
- Lambin, P.; Leijenaar, R.T.H.; Deist, T.M.; Peerlings, J.; De Jong, E.E.C.; Van Timmeren, J.; Sanduleanu, S.; Larue, R.T.H.M.; Even, A.J.G.; Jochems, A.; et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat. Rev. Clin. Oncol. 2017, 14, 749–762.
- Parmar, C.; Leijenaar, R.T.H.; Grossmann, P.; Rios Velazquez, E.; Bussink, J.; Rietveld, D.; Rietbergen, M.M.; Haibe-Kains, B.; Lambin, P.; Aerts, H.J.W.L. Radiomic feature clusters and Prognostic Signatures specific for Lung and Head & Neck cancer. Sci. Rep. 2015, 5, 11044.
- Benesty, J.; Chen, J.; Huang, Y.; Cohen, I. Pearson Correlation Coefficient. In Noise Reduction in Speech Processing; Springer Topics in Signal Processing; Springer Berlin Heidelberg: Berlin, Heidelberg, 2009; pp. 1–4. ISBN 978-3-642-00295-3.
- Vasquez, M.M.; Hu, C.; Roe, D.J.; Chen, Z.; Halonen, M.; Guerra, S. Least absolute shrinkage and selection operator type methods for the identification of serum biomarkers of overweight and obesity: simulation and application. BMC Med. Res. Methodol. 2016, 16, 154.
- Chen, X.; Jeong, J.C. Enhanced recursive feature elimination. In Proceedings of the Sixth International Conference on Machine Learning and Applications (ICMLA 2007); IEEE: Cincinnati, OH, USA, 2007; pp. 429–435.
- Nohara, Y.; Matsumoto, K.; Soejima, H.; Nakashima, N. Explanation of machine learning models using shapley additive explanation and application for real data in hospital. Comput. Methods Programs Biomed. 2022, 214, 106584.
- Raptis, S.; Softa, V.; Angelidis, G.; Ilioudis, C.; Theodorou, K. Automation Radiomics in Predicting Radiation Pneumonitis (RP). Automation 2023, 4, 191–209.
- Guo, W.; Xu, Z.; Zhang, H. Interstitial lung disease classification using improved DenseNet. Multimed. Tools Appl. 2019, 78, 30615–30626.
- Iranzad, R.; Liu, X.; Chaovalitwongse, W.A.; Hippe, D.; Wang, S.; Han, J.; Thammasorn, P.; Duan, C.; Zeng, J.; Bowen, S. Gradient boosted trees for spatial data and its application to medical imaging data. IISE Trans. Healthc. Syst. Eng. 2022, 12, 165–179.
- Raptis, S.; Tsougos, I.; Theodorou, K.; Ilioudis, C. Harmonizing Radiomics and Interpretable AI: Precision and Transparency in Oncological Prognostication. In Proceedings of the 2024 IEEE International Symposium on Biomedical Imaging (ISBI); IEEE: Athens, Greece, 2024; pp. 1–4.
- Lim, W.; Ridge, C.A.; Nicholson, A.G.; Mirsadraee, S. The 8th lung cancer TNM classification and clinical staging system: review of the changes and clinical implications. Quant. Imaging Med. Surg. 2018, 8, 709–718.
- Demirjian, N.L.; Varghese, B.A.; Cen, S.Y.; Hwang, D.H.; Aron, M.; Siddiqui, I.; Fields, B.K.K.; Lei, X.; Yap, F.Y.; Rivas, M.; et al. CT-based radiomics stratification of tumor grade and TNM stage of clear cell renal cell carcinoma. Eur. Radiol. 2022, 32, 2552–2563.
- Dwivedi, R.; Dave, D.; Naik, H.; Singhal, S.; Omer, R.; Patel, P.; Qian, B.; Wen, Z.; Shah, T.; Morgan, G.; et al. Explainable AI (XAI): Core Ideas, Techniques, and Solutions. ACM Comput. Surv. 2023, 55, 1–33.
- Marvin, G.; Jjingo, D.; Nakatumba-Nabende, J.; Alam, Md.G.R. Local Interpretable Model-Agnostic Explanations for Online Maternal Healthcare. In Proceedings of the 2023 2nd International Conference on Smart Technologies and Systems for Next Generation Computing (ICSTSN); IEEE: Villupuram, India, 2023; pp. 1–6.
- Suara, S.; Jha, A.; Sinha, P.; Sekh, A.A. Is Grad-CAM Explainable in Medical Images. In Computer Vision and Image Processing; Communications in Computer and Information Science; Kaur, H., Jakhetiya, V., Goyal, P., Khanna, P., Raman, B., Kumar, S., Eds.; Springer Nature: Cham, Switzerland, 2024; Volume 2009, pp. 124–135. ISBN 978-3-031-58180-9.
- Kierner, S.; Kucharski, J.; Kierner, Z. Taxonomy of hybrid architectures involving rule-based reasoning and machine learning in clinical decision systems: A scoping review. J. Biomed. Inform. 2023, 144, 104428.
- Saxena, S.; Jena, B.; Gupta, N.; Das, S.; Sarmah, D.; Bhattacharya, P.; Nath, T.; Paul, S.; Fouda, M.M.; Kalra, M.; et al. Role of Artificial Intelligence in Radiogenomics for Cancers in the Era of Precision Medicine. Cancers 2022, 14, 2860.








| Model | Accuracy | Sensitivity | Specificity | AUC-ROC |
|---|---|---|---|---|
| DenseNet-201 (CNN) | 92.4% | 91.6% | 93.2% | 0.94 |
| XGBoost (Radiomics) | 89.7% | 88.4% | 90.5% | 0.90 |

| Radiomic Feature | Tumor Size | Importance Score |
|---|---|---|
| GLCM Entropy | Small | 0.72 |
| GLCM Entropy | Medium | 0.81 |
| GLCM Entropy | Large | 0.88 |
| Shape Compactness | Small | 0.65 |
| Shape Compactness | Medium | 0.75 |
| Shape Compactness | Large | 0.84 |
| Surface Area to Volume Ratio | Small | 0.68 |
| Surface Area to Volume Ratio | Medium | 0.78 |
| Surface Area to Volume Ratio | Large | 0.83 |
| First-Order Mean Intensity | Small | 0.55 |
| First-Order Mean Intensity | Medium | 0.63 |
| First-Order Mean Intensity | Large | 0.71 |
| Skewness | Small | 0.49 |
| Skewness | Medium | 0.56 |

| Tumor Size | Dominant Features | Key Insights |
|---|---|---|
| Small (<2 cm) | Texture (GLCM Entropy, GLRLM Short Run Emphasis) | Texture features capture subtle heterogeneity, critical for early-stage cancer detection. Shape features are less relevant due to uniform shape. |
| Medium (2–4 cm) | Balanced (Shape Compactness, GLCM Correlation) | Both texture and shape features contribute equally. Shape irregularities begin to appear, while texture heterogeneity remains significant. |
| Large (>4 cm) | Shape (Surface Area to Volume Ratio, Shape Elongation) | Shape features dominate, capturing the irregular, invasive morphology of advanced tumors. Texture features still provide insights into heterogeneity. |

| Radiomic Feature | Mean SHAP Value (DenseNet-201) | Mean SHAP Value (XGBoost) | Permutation Importance (DenseNet-201) | Permutation Importance (XGBoost) |
|---|---|---|---|---|
| GLCM Entropy | 0.47 | 0.55 | 0.63 | 0.66 |
| Shape Compactness | 0.35 | 0.41 | 0.52 | 0.58 |
| Surface Area to Volume Ratio | 0.29 | 0.49 | 0.48 | 0.60 |
| GLRLM Run Length Non-Uniformity | 0.42 | 0.37 | 0.58 | 0.54 |

| Model | Accuracy | Sensitivity (Recall) | Specificity | AUC-ROC |
|---|---|---|---|---|
| DenseNet-201 (with texture) | 92.4% | 91.6% | 93.2% | 0.94 |
| DenseNet-201 (without texture) | 85.7% | 82.3% | 87.5% | 0.88 |
| XGBoost (with texture) | 89.7% | 88.4% | 90.5% | 0.90 |
| XGBoost (without texture) | 83.2% | 80.6% | 85.0% | 0.84 |

| Tumor Stage | Dominant Shape Features | Key Insights |
|---|---|---|
| Early Stage (I-II) | Shape Compactness, Elongation | Tumors exhibit more regular shapes; shape features less relevant but still important in certain cases. |
| Late Stage (III-IV) | Surface Area to Volume Ratio, Shape Compactness, Elongation | Tumors show significant morphological irregularities; shape features critical for detecting invasiveness. |

| Feature | Average Rank (Fold 1) | Average Rank (Fold 2) | Average Rank (Fold 3) | Stability |
|---|---|---|---|---|
| GLCM Entropy | 1 | 1 | 1 | High |
| Shape Compactness | 2 | 2 | 2 | High |
| Surface Area to Volume Ratio | 3 | 3 | 3 | High |
| GLRLM Run Length Non-Uniformity | 4 | 4 | 4 | High |
| First-Order Mean Intensity | 5 | 5 | 5 | Medium |

| Feature | Bootstrap Iteration 1 | Bootstrap Iteration 2 | Bootstrap Iteration 3 | Stability |
|---|---|---|---|---|
| GLCM Entropy | 1 | 1 | 1 | High |
| Shape Compactness | 2 | 2 | 2 | High |
| Surface Area to Volume Ratio | 3 | 3 | 3 | High |
| GLRLM Run Length Non-Uniformity | 4 | 4 | 4 | High |
| First-Order Mean Intensity | 5 | 5 | 5 | Medium |

| Model | Accuracy (Before Feature Selection) | Accuracy (After Feature Selection) | AUC-ROC (Before Feature Selection) | AUC-ROC (After Feature Selection) |
|---|---|---|---|---|
| DenseNet-201 (CNN) | 85.7% | 92.4% | 0.88 | 0.94 |
| XGBoost | 83.2% | 89.7% | 0.84 | 0.90 |

| Imaging Parameter | Impact on Feature Extraction | Impact on Model Performance |
|---|---|---|
| Slice Thickness | Affects texture-based feature consistency | Reduces model generalizability |
| Reconstruction Algorithm | Alters intensity and shape features | Increases risk of overfitting |
| Scanner Type | Introduces variability in intensity values | Decreases reproducibility |

| XAI Technique | Description | Applicability |
|---|---|---|
| SHAP Analysis | Explains feature importance for individual predictions | Useful for feature-level interpretation |
| Local Interpretable Model-Agnostic Explanations (LIME) [26] | Explains model predictions in a localized context | Helps in model transparency for clinicians |
| Grad-CAM (Gradient-weighted Class Activation Mapping) [27] | Visualizes areas of the image that influence model predictions | Suitable for deep learning interpretability |
| Decision Trees / Rule-Based Models [28] | Models that produce rules or trees, offering straightforward interpretability. | May be used as baseline models for radiomic features, offering clear, interpretable rules though potentially less accuracy than complex models. |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).