Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

ML Classification of Cancer Types Using High Dimensional Gene Expression Microarray Data

Version 1 : Received: 29 January 2024 / Approved: 30 January 2024 / Online: 31 January 2024 (01:49:19 CET)
Version 2 : Received: 17 April 2024 / Approved: 18 April 2024 / Online: 18 April 2024 (09:55:50 CEST)

How to cite: Mukhopadhyay, D.; Phanord, D.D.; Dalpatadu, R.J.; Gewali, L.P.; Singh, A.K. ML Classification of Cancer Types Using High Dimensional Gene Expression Microarray Data. Preprints 2024, 2024012067. https://doi.org/10.20944/preprints202401.2067.v1

Abstract

Machine learning (ML) classifiers are applied to a very wide dataset containing gene expression microarray data from patients with five types of cancer (breast, kidney, colon, lung, and prostate). Because the dataset has a very large number of columns, running the classifiers on the raw data produced stack overflow errors, so we used Principal Component Analysis (PCA) for dimensionality reduction and classified on the principal component (PC) scores of the raw data. PCA was run using a fast algorithm that can compute PC scores for very large datasets. High classification accuracies were obtained using just the first two PC scores. Two ML classifiers, Linear Discriminant Analysis (LDA) and Random Forest (RF), were used; RF achieved higher accuracy than LDA. These results should be helpful to researchers working with large numbers of genes in microarray data.
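The workflow summarized in the abstract (reduce a very wide matrix to two PC scores, then classify on those scores) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the abstract does not name the fast PCA algorithm, so the Gram-matrix approach used here is an assumption; the data are synthetic; the function names `pca_scores_wide`, `lda_fit`, and `lda_predict` are hypothetical; and LDA is hand-written in NumPy to keep the example self-contained. The Random Forest step is omitted.

```python
import numpy as np

def pca_scores_wide(X, n_components=2):
    """PC scores for an n x p matrix with p >> n, via the n x n Gram matrix.

    Assumption: one common way to make PCA cheap on wide data; the
    abstract does not say which fast algorithm was actually used.
    """
    Xc = X - X.mean(axis=0)                   # center each column (gene)
    gram = Xc @ Xc.T                          # n x n instead of p x p
    eigvals, eigvecs = np.linalg.eigh(gram)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    # gram = U S^2 U^T, so the PC scores are U * S
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))

def lda_fit(Z, y):
    """Fit LDA with a pooled within-class covariance matrix."""
    classes = np.unique(y)
    means = np.array([Z[y == k].mean(axis=0) for k in classes])
    priors = np.array([(y == k).mean() for k in classes])
    resid = Z - means[np.searchsorted(classes, y)]
    cov = resid.T @ resid / (len(y) - len(classes))
    return classes, means, priors, np.linalg.inv(cov)

def lda_predict(Z, model):
    """Assign each row of Z to the class with the largest linear discriminant."""
    classes, means, priors, prec = model
    linear = Z @ prec @ means.T
    const = -0.5 * np.einsum("kd,de,ke->k", means, prec, means) + np.log(priors)
    return classes[np.argmax(linear + const, axis=1)]

# Synthetic stand-in for the microarray data (NOT the paper's dataset):
# 150 samples, 5000 "genes", 5 classes with different mean profiles.
rng = np.random.default_rng(0)
n, p, n_classes = 150, 5000, 5
y = np.arange(n) % n_classes
class_means = rng.normal(size=(n_classes, p))
X = rng.normal(size=(n, p)) + class_means[y]

scores = pca_scores_wide(X, n_components=2)   # shape (150, 2)

# Simple hold-out: every third sample is used for evaluation only.
test_mask = np.arange(n) % 3 == 0
model = lda_fit(scores[~test_mask], y[~test_mask])
pred = lda_predict(scores[test_mask], model)
acc = float((pred == y[test_mask]).mean())
print(f"hold-out accuracy on 2 PC scores: {acc:.2f}")
```

Working with the n x n Gram matrix avoids ever forming the p x p covariance matrix, which is what makes the decomposition tractable when the number of genes far exceeds the number of patients. In this sketch the PCA is fitted on all samples before the split for brevity; a stricter evaluation would fit it on the training samples only.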

Keywords

Linear Discriminant Analysis; Random Forest; Precision; Recall; F1; AUC; macro-averaged AUC; micro-averaged AUC

Subject

Biology and Life Sciences, Other
