ARTICLE | doi:10.20944/preprints202105.0670.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory
Keywords: melanoma; biomarker; transfer learning; ensemble model; bias; machine learning
Online: 27 May 2021 (13:20:55 CEST)
Melanoma is considered the most serious and aggressive type of skin cancer, and metastasis appears to be the most important factor in its prognosis. With the emergence of new therapeutic strategies for metastatic melanoma that have improved patient survival, we developed a transfer learning-based biomarker discovery model that could help in the diagnosis and prognosis of this disease. Applied to an ensemble machine learning model, the approach identified genes that are consistent with those found by other methodologies previously applied to the same TCGA (The Cancer Genome Atlas) data set, and it identified novel biomarker genes as well. Our ensemble model achieved an area under the receiver operating characteristic curve (AUC) of 0.9861, an accuracy of 91.05%, and an F1 score of 90.60% on an independent validation data set. The study identified potential genes for diagnostic classification (C7 and GRIK5) and potential diagnostic and prognostic biomarkers (S100A7, S100A7, KRT14, KRT17, KRT6B, KRTDAP, SERPINB4, TSHR, PVRL4, WFDC5, IL20RB). We also assessed potential sources of bias in our model and confirmed some of them through the model's performance.
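
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how AUC, accuracy, and F1 score can be computed from an ensemble's predictions. It is not the authors' pipeline: the abstract does not say how the ensemble combines its members, so soft voting (averaging predicted probabilities) and the 0.5 decision threshold are assumptions made here, and the labels and probabilities are hypothetical toy values rather than TCGA data. The function names are likewise illustrative.

```python
from statistics import mean


def soft_vote(prob_lists):
    """Average each sample's positive-class probability across models (assumed combination rule)."""
    return [mean(ps) for ps in zip(*prob_lists)]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.3, 0.7, 0.1]  # hypothetical probabilities from one ensemble member
model_b = [0.7, 0.6, 0.5, 0.2, 0.5, 0.2]  # hypothetical probabilities from another member

scores = soft_vote([model_a, model_b])
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)

# AUC is reported on a 0-1 scale; accuracy and F1 are shown here as percentages.
print(f"AUC = {auc:.4f}, accuracy = {100 * acc:.2f}%, F1 = {100 * f1:.2f}%")
```

AUC is computed from the continuous scores and does not depend on a threshold, whereas accuracy and F1 are computed from thresholded labels. This is why a model can show a high AUC alongside somewhat lower accuracy and F1 values.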