Submitted: 10 November 2024
Posted: 11 November 2024
Abstract
Keywords:
Introduction
| Component | Value |
|---|---|
| Theoretical Peak Floating-Point Performance (Total) | 838 TFLOPS (peak) |
| Base Specifications (Compute Nodes) | 2 × Intel Xeon Cascade Lake 8268 processors per node (24 cores, 2.9 GHz each), 192 GB memory, 480 GB SSD |
| Master/Service/Login Nodes | 10 nodes |
| CPU-only Compute Nodes (Memory) | 107 nodes (192 GB each) |
| GPU Compute Nodes (Memory) | 10 nodes (192 GB each) |
| High-Memory Compute Nodes (Memory) | 39 nodes (768 GB each) |
| Total Memory | 52.416 TB |
| Interconnect | Primary: 100 Gbps Mellanox InfiniBand, 100% non-blocking, fat-tree topology. Secondary: 10G/1G Ethernet network. Management: 1G Ethernet network |
| Storage | 1 PB parallel file system (PFS) based storage |
Literature Review
Methodology
Data Collection and Preparation
Model Architecture
Input: Set of X-ray images.
Output: Prediction of whether an X-ray image indicates the presence of kidney cancer (binary output: cancerous or non-cancerous).

- Step 1: Data Preparation
- Step 2: Model Architecture (Layer Setup)
- Step 3: Compilation
- Step 4: Training
- Step 5: Model Evaluation and Prediction
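As a concrete illustration of Step 1, the snippet below sketches one way the image set could be loaded and normalized with TensorFlow/Keras. It is a minimal sketch, not the authors' released code: the directory layout (`kidney_xray/train`, `kidney_xray/val`) and the 224×224 input resolution are assumptions made for illustration.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)   # assumed input resolution; not stated in the text
BATCH_SIZE = 32         # matches the batch size reported under Model Training

# Assumed layout: one sub-folder per class (cancerous / non-cancerous),
# from which Keras infers binary labels.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "kidney_xray/train",            # hypothetical path
    labels="inferred",
    label_mode="binary",
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "kidney_xray/val",              # hypothetical path
    labels="inferred",
    label_mode="binary",
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
)

# Rescale pixel intensities from [0, 255] to [0, 1].
rescale = tf.keras.layers.Rescaling(1.0 / 255)
train_ds = train_ds.map(lambda x, y: (rescale(x), y))
val_ds = val_ds.map(lambda x, y: (rescale(x), y))
```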
Model Training
- I. Conv2D Layers: The first Conv2D layer has 32 filters of size 3×3 and uses ReLU (rectified linear unit) activation; it also specifies the input shape of the image. Subsequent Conv2D layers have 64 and 128 filters respectively, also using 3×3 filters and ReLU activation. These layers increase the depth of the feature maps as we progress deeper into the network.
- II. MaxPooling2D Layers: These layers reduce the spatial dimensions (width and height) of the input volume for the next convolution layer. They perform down-sampling by dividing the input into rectangular pooling regions and outputting the maximum of each region; we used a pool size of 2×2. This reduces computation and makes the feature detectors more invariant to small shifts in position.
- III. Flatten Layer: After the last MaxPooling layer, a Flatten layer converts the 3D feature maps into a 1D feature vector, putting the spatially arranged data into a form suitable for classification.
- IV. Dense Layers: These fully connected layers follow the Flatten layer and perform classification based on the extracted, down-sampled features. A Dense layer with 512 units and ReLU activation learns non-linear combinations of the high-level features represented by the flattened vector from the previous layer. The final Dense layer has 1 unit with a sigmoid activation function, as is typical for binary classification tasks; it outputs a probability indicating the presence or absence of kidney cancer.
- Epochs: PreKiCan was trained for 250 epochs, allowing sufficient time to learn from the data thoroughly. An epoch represents one complete pass of the training dataset through the algorithm.
- Batch Size: A batch size of 32 was chosen, balancing training speed against model performance and providing stable gradient updates during training.
- Optimizer: The Adam optimizer was employed for its efficiency in handling sparse gradients and its adaptive learning rate, both of which help convergence to good weights.
- Loss Function: Binary cross-entropy, the standard loss for binary classification, was used; it measures the performance of a model whose output is a probability between 0 and 1. The layer stack and training configuration are sketched in code below.
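Putting the layer descriptions and the training settings above together, a minimal Keras sketch of PreKiCan might look as follows. This is a sketch consistent with the description, not the authors' released code: the 224×224×3 input shape and the placement of one pooling layer after each convolution are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Layer setup as described: three Conv2D blocks (32 -> 64 -> 128 filters,
# 3x3 kernels, ReLU), each followed by 2x2 max pooling, then Flatten,
# Dense(512, ReLU), and a single sigmoid output unit.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu",
                  input_shape=(224, 224, 3)),   # input shape is an assumption
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dense(1, activation="sigmoid"),      # probability of "cancerous"
])

# Compilation and training settings as reported: Adam optimizer,
# binary cross-entropy loss, 250 epochs, batch size 32.
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

history = model.fit(train_ds,                   # batched (32) in the data step
                    validation_data=val_ds,
                    epochs=250)
```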
Validation and Testing
Computational Environment
Results & Discussion
| Measure | Value | Derivation |
|---|---|---|
| Recall (Sensitivity) | 0.999 | TP / (TP + FN) |
| Specificity | 0.997 | TN / (TN + FP) |
| Precision | 0.997 | TP / (TP + FP) |
| Negative Predictive Value | 0.999 | TN / (TN + FN) |
| False Positive Rate | 0.003 | FP / (FP + TN) |
| False Discovery Rate | 0.003 | FP / (FP + TP) |
| False Negative Rate | 0.001 | FN / (FN + TP) |
| Accuracy | 0.998 | (TP + TN) / (TP + TN + FP + FN) |
| F1 Score | 0.998 | 2TP / (2TP + FP + FN) |
| Matthews Correlation Coefficient | 0.996 | (TP×TN − FP×FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)) |
| Macro-F1 | 0.998 | Mean of per-class F1 scores |
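For reference, every measure in the table is derived from the four confusion-matrix counts (TP, TN, FP, FN). The helper below computes them; the example counts are made up for illustration and are not the paper's actual test-set tallies.

```python
import math

def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the evaluation measures above from confusion-matrix counts."""
    return {
        "recall":      tp / (tp + fn),                  # sensitivity
        "specificity": tn / (tn + fp),
        "precision":   tp / (tp + fp),
        "npv":         tn / (tn + fn),                  # negative predictive value
        "fpr":         fp / (fp + tn),                  # false positive rate
        "fdr":         fp / (fp + tp),                  # false discovery rate
        "fnr":         fn / (fn + tp),                  # false negative rate
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "f1":          2 * tp / (2 * tp + fp + fn),
        "mcc":         (tp * tn - fp * fn) / math.sqrt(
                           (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Illustrative counts only (chosen to roughly reproduce the reported values,
# not taken from the paper):
print(binary_metrics(tp=999, tn=997, fp=3, fn=1))
```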


Conclusion
Acknowledgements
Statements and Declarations
References
- Vogelzang, N. J., & Stadler, W. M. (1998). Kidney cancer. The Lancet, 352, 1691–1696. [CrossRef]
- Chow, W.-H., Dong, L. M., & Devesa, S. S. (2010). Epidemiology and risk factors for kidney cancer. Nature Reviews Urology, 7, 245–257. [CrossRef]
- Ganti, S., & Weiss, R. H. (2011). Urine metabolomics for kidney cancer detection and biomarker discovery. Urologic Oncology: Seminars and Original Investigations, 29(5), 551–557. [CrossRef]
- Kim, K., Aronov, P., Zakharkin, S. O., Anderson, D., Perroud, B., Thompson, I. M., & Weiss, R. H. (2008). Urine Metabolomics Analysis for Kidney Cancer Detection and Biomarker Discovery. Molecular & Cellular Proteomics, 8(3), 558–570. [CrossRef]
- Kind, T., Tolstikov, V., Fiehn, O., & Weiss, R. H. (2007). A comprehensive urinary metabolomic approach for identifying kidney cancer. Analytical Biochemistry, 363, 185–195. [CrossRef]
- Morrissey, J. J., London, A. N., Luo, J., & Kharasch, E. D. (2010). Urinary Biomarkers for the Early Diagnosis of Kidney Cancer. Mayo Clinic Proceedings, 85, 413–421. [CrossRef]
- Kim, K., Taylor, S. L., Ganti, S., Guo, L., Osier, M. V., & Weiss, R. H. (2011). Urine Metabolomic Analysis Identifies Potential Biomarkers and Pathogenic Pathways in Kidney Cancer. OMICS: A Journal of Integrative Biology, 15(5), 293–303. [CrossRef]
- Tuncer, S. A., & Alkan, A. (2018). A decision support system for detection of the renal cell cancer in the kidney. Measurement, 123, 298–303. [CrossRef]
- Hadjiyski, N. (2020). Kidney Cancer Staging: Deep Learning Neural Network Based Approach. 2020 International Conference on e-Health and Bioengineering (EHB). [CrossRef]
- Da Cruz, L. B., Araújo, J. D. L., Ferreira, J. L., Diniz, J. O. B., Silva, A. C., de Almeida, J. D. S., & Gattass, M. (2020). Kidney segmentation from computed tomography images using deep neural network. Computers in Biology and Medicine, 103906. [CrossRef]
- Kim, D.-Y., & Park, J.-W. (2004). Computer-aided detection of kidney tumor on abdominal computed tomography scans. Acta Radiologica, 45(7), 791–795. [CrossRef]
- Zhang, J., Lefkowitz, R. A., & Bach, A. (2007). Imaging of Kidney Cancer. Radiologic Clinics of North America, 45(1), 119–147. [CrossRef]
- Asha, V., Sreeja, S. P., Prasad, A., Kathari, G. V., Karthik, A. S., & Kathyayini (2024). Classification of Images Related to Kidney Cancer using Hybrid Deep Learning. 2024 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), Bangalore, India, 1–6. [CrossRef]
- Uhm, K.-H., Jung, S.-W., Choi, M. H., et al. (2021). Deep learning for end-to-end kidney cancer diagnosis on multi-phase abdominal computed tomography. npj Precision Oncology, 5(1). [CrossRef]
- Türk, F., Lüy, M., & Barışçı, N. (2019). Machine Learning of Kidney Tumors and Diagnosis and Classification by Deep Learning Methods. International Journal of Engineering Research and Development, 11(3), 802–812. [CrossRef]
- Basandrai, A., et al. (2023). Medical Scan Classification Dataset. Available online: https://www.kaggle.com/datasets/arjunbasandrai/medical-scan-classification-dataset (accessed March 2023).
- Center for Development of Advanced Computing. PARAM Utkarsh Supercomputer. https://paramutkarsh.cdac.in/.




| Class Name | Precision | 1-Precision | Recall | 1-Recall | F1-score |
|---|---|---|---|---|---|
| Normal | 0.997 | 0.003 | 0.999 | 0.001 | 0.998 |
| Cancerous | 0.999 | 0.001 | 0.997 | 0.003 | 0.998 |
| Accuracy | 0.998 | | | | |
| Miss Rate | 0.002 | | | | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).