Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Machine-Learning to Discern Interactive Clusters of Risk Factors for Late Recurrence of Metastatic Breast Cancer

Version 1 : Received: 29 November 2021 / Approved: 1 December 2021 / Online: 1 December 2021 (13:40:33 CET)

A peer-reviewed article of this Preprint also exists.

Gomez Marti, J.L.; Brufsky, A.; Wells, A.; Jiang, X. Machine Learning to Discern Interactive Clusters of Risk Factors for Late Recurrence of Metastatic Breast Cancer. Cancers 2022, 14, 253.

Journal reference: Cancers 2022, 14, 253
DOI: 10.3390/cancers14010253

Abstract

Background: The risk of metastatic recurrence of breast cancer after initial diagnosis and treatment depends on a number of risk factors. Although most univariate risk factors have been identified using classical methods, machine-learning methods are also being applied to tease out non-obvious contributors to a patient’s individual risk of developing late distant metastasis. Bayesian-network algorithms may identify not only risk factors but also interactions among them that lead to metastatic breast cancer. We set out to apply a previously developed machine-learning method to identify risk factors for 5-, 10-, and 15-year metastasis. Methods: We applied a previously validated algorithm, the Markov Blanket and Interactive risk factor Learner (MBIL), to the electronic health record (EHR)-based Lynn Sage database (LSDB) from the Lynn Sage Comprehensive Breast Center at Northwestern Memorial Hospital. The algorithm output both single and interactive risk factors for 5-, 10-, and 15-year metastasis from the LSDB. We individually examined and interpreted the clinical relevance of these interactions based on years to metastasis and on the degree of reliance on interactivity between risk factors. Results: With lower alpha values (low interactivity score), variables with an independent influence on long-term metastasis were more prevalent (e.g., HER2, TNEG). As the value of alpha increased to 480, stronger interactions were needed to define clusters of factors that increased the risk of metastasis (e.g., ER, smoking, race, alcohol usage). Conclusion: MBIL identified single and interacting risk factors for metastatic breast cancer, many of which were supported by clinical evidence. These results support further large-data studies on different databases to validate the degree to which some of these variables affect metastatic breast cancer in the long term.
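
The distinction the abstract draws between single and interactive risk factors can be illustrated with a small sketch. This is not MBIL and not the Bayesian-network scoring used by the authors; it substitutes empirical mutual information as a simple stand-in measure, and the data are synthetic toy records, not drawn from the LSDB. The names `mutual_information`, `interaction_gain`, `factor_a`, `factor_b`, and `outcome` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits from paired samples."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def interaction_gain(a, b, y):
    """Information the pair (A, B) carries about Y beyond A and B taken alone."""
    joint = list(zip(a, b))
    return (
        mutual_information(joint, y)
        - mutual_information(a, y)
        - mutual_information(b, y)
    )


# Synthetic toy records (assumption: purely illustrative, no clinical meaning).
# The outcome depends on the two factors only in combination (exclusive-or).
factor_a = [0, 0, 1, 1] * 25
factor_b = [0, 1, 0, 1] * 25
outcome = [a ^ b for a, b in zip(factor_a, factor_b)]

single_a = mutual_information(factor_a, outcome)
single_b = mutual_information(factor_b, outcome)
gain = interaction_gain(factor_a, factor_b, outcome)
```

In this toy setting each factor alone is uninformative about the outcome, while the pair determines it completely, which is the kind of pattern a univariate analysis would miss and an interaction-aware method is designed to surface.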

Keywords

metastatic breast cancer; metastasis; causal learning; machine learning

Subject

MEDICINE & PHARMACOLOGY, Oncology & Oncogenics
