Preprint
Review

Machine Learning of Raman Spectroscopy Data for Classifying Cancers: a Review of the Recent Literature

Submitted: 29 April 2022

Posted: 13 May 2022


A peer-reviewed article of this preprint also exists.

Abstract
Raman spectroscopy has long been anticipated to augment clinical decision making, for example in classifying oncological samples. Unfortunately, the complexity of Raman data has thus far inhibited its routine use in clinical settings. Traditional machine learning models have been used to help exploit this information, and recent advances in deep learning have the potential to improve the field further. However, both traditional and deep learning models carry a number of potential pitfalls. We conduct a literature review to ascertain the recent machine learning methods used to classify cancers from Raman spectral data. We find that while deep learning models are popular, and ostensibly outperform traditional learning models, many methodological considerations may be leading to an over-estimation of performance: primarily, small sample sizes, which compound sub-optimal choices regarding sampling and validation strategies. Amongst several recommendations is a call to collate large benchmark Raman datasets, similar to those that have helped transform digital pathology, which researchers can use to develop and refine deep learning models.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
