ARTICLE | doi:10.20944/preprints202002.0113.v1
Subject: Chemistry, Other Keywords: Blind Source Separation; Component Analysis; Chemometrics; Unsupervised Machine Learning; Endmember Extraction; Spectral Unmixing; NMR
Online: 9 February 2020 (17:18:38 CET)
NMR spectral datasets, especially in systems with limited samples, can be difficult to interpret if they contain multiple chemical components (phases, polymorphs, molecules, crystals, glasses, etc.) with potentially overlapping resonances. In this paper, we benchmark several blind source separation techniques for the analysis of NMR spectral datasets containing negative intensity. For benchmarking purposes, we generated a large synthetic database of quadrupolar solid-state NMR-like spectra that model spin-lattice T1 relaxation or nutation tip/flip angle experiments. Our benchmarking approach focused exclusively on the ability of blind source separation techniques to reproduce the spectra of the underlying pure components. In general, we find that FastICA (Fast Independent Component Analysis), SIMPLISMA (SIMPLe-to-use-Interactive Self-modeling Mixture Analysis), and NNMF (Non-Negative Matrix Factorization) are the top-performing techniques. We demonstrate that normalizing the dataset prior to blind source separation does not considerably improve outcomes. Within the range of noise levels studied, we did not find drastic changes to the ranking of techniques. The accuracy of FastICA and SIMPLISMA degrades quickly if excess (unreal) pure components are predicted. Our results indicate poor performance of SVD (Singular Value Decomposition) methods, and we propose alternative techniques for matrix initialization. The benchmarked techniques are also applied to real solid-state NMR datasets. In general, the recommendations from the synthetic datasets agree with the recommendations and results from the real data analysis. The discussion provides some additional recommendations for spectroscopists applying blind source separation to NMR datasets, and for future benchmark studies.
Applications of blind source separation to NMR datasets containing negative intensity may be especially useful for understanding complex and disordered systems with limited samples and mixtures of chemical components.
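To illustrate how a relaxation-type dataset comes to contain negative intensity, the following is a minimal sketch of a synthetic two-component mixture. It is not the generator used in the paper: the Gaussian lineshapes, the ideal inversion-recovery weighting 1 - 2 exp(-t/T1), the T1 values, the delays, and all function names are assumptions chosen only for illustration.

```python
import math

def gaussian(x, center, width):
    """Assumed lineshape: a unit-height Gaussian (real quadrupolar lineshapes differ)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def inversion_recovery(t, t1):
    """Assumed ideal inversion recovery: -1 at t = 0, approaching +1 at long delays."""
    return 1.0 - 2.0 * math.exp(-t / t1)

def make_dataset(n_points=64, delays=(0.01, 0.1, 0.5, 1.0, 5.0)):
    # Hypothetical normalized frequency axis from 0 to 1.
    axis = [i / (n_points - 1) for i in range(n_points)]
    # Two overlapping pure-component spectra (hypothetical positions and widths).
    components = [
        [gaussian(x, 0.40, 0.08) for x in axis],
        [gaussian(x, 0.55, 0.08) for x in axis],
    ]
    t1_values = (0.2, 2.0)  # hypothetical T1 per component, same units as delays
    dataset = []
    for t in delays:
        # Each spectrum is a weighted sum of the pure components;
        # the weights are negative at short recovery delays.
        weights = [inversion_recovery(t, t1) for t1 in t1_values]
        row = [
            sum(w * comp[i] for w, comp in zip(weights, components))
            for i in range(n_points)
        ]
        dataset.append(row)
    return axis, components, dataset

axis, components, dataset = make_dataset()
```

Each row of `dataset` is one mixed spectrum; a blind source separation technique would receive only these rows and attempt to recover the two entries of `components`. Because the mixing weights change sign with delay, the data cannot be treated as strictly non-negative without some preprocessing.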