Interpretable Spectral Transformer for Raman-Based Bacterial Identification Across Species and Strains

Yijian Meng; Jesper B. Christensen; Carsten Thirstrup; Lucia Ronda Rute; Konstantinos Stergiou; Danylo Komisar; Oleksii Ilchenko; Ditte Rask Tornby; Thomas Emil Andersen; Hüsnü Aslan; Mikael Lassen

doi:10.20944/preprints202606.0751.v1

Submitted: 08 June 2026

Posted: 09 June 2026

Abstract

Raman spectroscopy combined with machine learning offers a rapid, label-free approach for bacterial identification, but robust translation remains challenged by spectral variability, biological heterogeneity, and limited model interpretability. Here, we present an integrated evaluation of an optimized Spectral Transformer (ST) framework for Raman-based bacterial classification benchmarked against a systematically optimized one-dimensional convolutional neural network (1D-CNN). The comparison was performed using a curated 36-class dataset comprising 15 Gram-negative bacterial entries, 15 Gram-positive bacterial entries, one non-bacterial microorganism, and five background/reference classes, enabling evaluation of both species-level and fine-grained bacterial classification. Under 15 dB noise-augmented evaluation, the ST achieved 80.6% ± 0.3% accuracy and a Matthews correlation coefficient (MCC) of 0.801 ± 0.003, outperforming the 1D-CNN baseline with 72.9% ± 0.3% accuracy and an MCC of 0.721 ± 0.003. Integrated Gradients analysis combined with attention map visualization enabled multi-level model interpretation, revealing that the ST’s improved robustness correlates with more bounded attribution patterns during misclassification, whereas the 1D-CNN’s feature attribution becomes scattered under noise perturbation. Importantly, this interpretability-driven analysis identified model-specific failure modes in the baseline architecture, including an over-reliance on non-specific spectral regions under noise, which can inform future data collection strategies and guide refinements to experimental protocols. These results demonstrate that attention-based spectral modeling improves Raman-based bacterial classification under noise-perturbed conditions while enabling multi-level interpretability that bridges model understanding with actionable feedback on experimental design and data quality requirements.
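To make the two evaluation quantities named in the abstract concrete, the following is a minimal sketch of how a spectrum might be perturbed to a 15 dB signal-to-noise ratio and how a multiclass MCC can be computed from predicted labels. The abstract does not specify the noise model or the SNR definition, so additive white Gaussian noise scaled against mean signal power is an assumption here, and the function names `add_noise_at_snr` and `multiclass_mcc` are hypothetical, not taken from the authors' code.

```python
import numpy as np


def add_noise_at_snr(spectrum, snr_db, rng):
    """Add white Gaussian noise at a target SNR in dB.

    Assumption: SNR = mean signal power / noise variance. The paper's
    actual noise model may differ.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    signal_power = np.mean(spectrum ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=spectrum.shape)
    return spectrum + noise


def multiclass_mcc(y_true, y_pred, n_classes):
    """Multiclass Matthews correlation coefficient from a confusion matrix."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    confusion = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(confusion, (y_true, y_pred), 1)  # rows: true, cols: predicted

    true_counts = confusion.sum(axis=1)
    pred_counts = confusion.sum(axis=0)
    n_correct = np.trace(confusion)
    n_total = confusion.sum()

    numerator = n_correct * n_total - np.dot(true_counts, pred_counts)
    denominator = np.sqrt(
        float(n_total ** 2 - np.dot(pred_counts, pred_counts))
        * float(n_total ** 2 - np.dot(true_counts, true_counts))
    )
    # Return 0.0 when the coefficient is undefined (e.g. one class only).
    return float(numerator / denominator) if denominator > 0 else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a Raman spectrum (not real data).
    clean = np.sin(np.linspace(0.0, 20.0, 1000)) + 2.0
    noisy = add_noise_at_snr(clean, snr_db=15.0, rng=rng)
    print(noisy.shape)
    print(multiclass_mcc([0, 1, 2, 0, 1, 2], [0, 1, 2, 0, 1, 2], n_classes=3))
```

Unlike plain accuracy, the MCC accounts for the full confusion matrix, which is why it is often reported alongside accuracy for many-class problems such as the 36-class setting described above.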

Keywords: Raman spectroscopy; bacterial identification; spectral transformer; machine learning; deep learning; chemometrics; chemical fingerprinting; integrated gradients; attention maps; model interpretability; 1D-CNN; noise robustness; microbial classification; spectral preprocessing; Raman-based diagnostics

Subject: Physical Sciences - Applied Physics

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
