Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Improving Clinical Prediction of Later Occurrence of Breast Cancer Metastasis Using Deep Learning and Machine Learning with Grid Search

Version 1: Received: 23 June 2022 / Approved: 29 June 2022 / Online: 29 June 2022 (04:06:16 CEST)

How to cite: Jiang, X.; Xu, C. Improving Clinical Prediction of Later Occurrence of Breast Cancer Metastasis Using Deep Learning and Machine Learning with Grid Search. Preprints 2022, 2022060394 (doi: 10.20944/preprints202206.0394.v1).

Abstract

Background: It is important to predict, for each individual patient, the likelihood of later metastatic occurrence, because the prediction can guide treatment plans tailored to that patient to prevent metastasis and help avoid under- or over-treatment. Deep Neural Network (DNN) learning, commonly referred to as deep learning, has become popular due to its success in image detection and prediction, but whether deep learning outperforms other machine learning methods on non-image clinical data remains an open question. Grid search has been introduced to deep learning hyperparameter tuning to improve prediction performance, but its effect on other machine learning methods is under-studied. In this research, we take an empirical approach to studying the performance of deep learning and other machine learning methods when using non-image clinical data to predict the occurrence of breast cancer metastasis (BCM) 5, 10, or 15 years after the initial treatment. We developed DNN models as well as models using 9 other machine learning methods: Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), LASSO, Decision Tree (DT), k-Nearest Neighbors (KNN), Random Forest (RF), AdaBoost (ADB), and XGBoost (XGB). We used grid search to tune hyperparameters for all methods. We then compared the deep learning models to the models trained using the 9 other machine learning methods.

Results: Based on the mean test AUC, DNN ranks 6th, 4th, and 3rd out of 10 machine learning methods when predicting 5-year, 10-year, and 15-year BCM, respectively. The top-performing methods in predicting 5-year BCM are XGB (1st), RF (2nd), and KNN (3rd). For predicting 10-year BCM, the top performers are XGB (1st), RF (2nd), and NB (3rd). Finally, for 15-year BCM, the top performers are SVM (1st), LR and LASSO (tied for 2nd), and DNN (3rd). The ensemble methods RF and XGB outperform other methods when the data are less balanced, while SVM, LR, LASSO, and DNN outperform other methods when the data are more balanced. Our statistical testing results show that, at a significance level of 0.05, DNN overall performs no worse than the other machine learning methods when predicting 5-year, 10-year, and 15-year BCM.

Conclusions: Our results show that deep learning with grid search overall performs at least as well as other machine learning methods when using non-image clinical data. Some of the other machine learning methods, such as XGB, RF, and SVM, are very strong competitors of DNN when grid search is incorporated. The computation time required for grid search with DNN is also far greater than that required for grid search with the other 9 machine learning methods.
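
The tuning procedure summarized above (try every hyperparameter combination, score each by mean test AUC, keep the best) can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the abstract does not state the software, the grids, or the validation scheme used, so the k-NN scorer, the grid values, the 5-fold split, the synthetic data, and all function names below are assumptions made for the example.

```python
import itertools
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def knn_score(train_X, train_y, x, k, p):
    """Share of positives among the k nearest training points.

    Uses the p-th power of the Minkowski distance, which gives the same
    neighbour ordering as the distance itself.
    """
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


def cross_val_auc(X, y, params, n_folds=5):
    """Mean held-out AUC over interleaved folds for one hyperparameter setting."""
    fold_aucs = []
    for f in range(n_folds):
        test_idx = list(range(f, len(X), n_folds))
        held = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        scores = [knn_score(train_X, train_y, X[i], **params) for i in test_idx]
        fold_aucs.append(auc([y[i] for i in test_idx], scores))
    return sum(fold_aucs) / n_folds


def grid_search(X, y, grid, n_folds=5):
    """Score every combination in the grid; return (best, all_results)."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((cross_val_auc(X, y, params, n_folds), params))
    best = max(results, key=lambda r: r[0])
    return best, results


# Synthetic, imbalanced data (1 positive in 4); NOT the study's data.
rng = random.Random(0)
y = [1 if i % 4 == 0 else 0 for i in range(200)]
X = [[rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]

grid = {"k": [3, 7, 15], "p": [1, 2]}  # hypothetical grid, chosen for illustration
(best_auc, best_params), results = grid_search(X, y, grid)
print(f"best params: {best_params}, mean test AUC: {best_auc:.3f}")
```

The same loop applies to any of the ten methods by swapping `knn_score` for another model's scoring function. The cost grows with the product of the grid sizes, which is consistent with the observation that grid search is much more expensive for DNNs, where each combination requires training a network. In practice, a library routine such as scikit-learn's `GridSearchCV` with `scoring="roc_auc"` would likely replace the hand-written loop.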

Keywords

deep learning; DNN; machine learning; breast cancer; metastasis; metastatic breast cancer; distant recurrence of breast cancer metastasis; prediction; clinical; EHR

Subject

MEDICINE & PHARMACOLOGY, Oncology & Oncogenics
