Submitted:
01 September 2025
Posted:
03 September 2025
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Dataset and Preprocessing
2.2. Feature Extraction
Pitch deviation (cents)
Jitter
Shimmer
Loudness (LUFS) and RMS energy
Tone-to-Noise Ratio (TNR)
MFCCs
Zero-Crossing Rate (ZCR)
Spectral centroid
Spectral bandwidth
Spectral flatness
Formants F1–F3
Vibrato extent and rate
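
Three of the listed features have simple textbook definitions, which the following sketch illustrates. It is not the authors' extraction pipeline: the function names, the reference frequency, and the choice of plain Python lists as input are assumptions made for illustration, and the remaining features (jitter, shimmer, LUFS, TNR, MFCCs, formants, vibrato) would normally come from a dedicated audio library.

```python
import math


def cents_deviation(f_hz, ref_hz):
    """Signed deviation of f_hz from ref_hz in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f_hz / ref_hz)


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def rms_energy(samples):
    """Root-mean-square amplitude of a frame of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


# Hypothetical inputs, chosen only to make the arithmetic easy to check.
frame = [3.0, -3.0, 3.0, -3.0]
print(cents_deviation(880.0, 440.0))  # one octave above the reference
print(zero_crossing_rate(frame))
print(rms_energy(frame))
```

In practice these quantities are computed per frame and then summarized (mean, standard deviation) over a recording, which matches the "average" and "std. dev." wording used for the features in Section 2.3.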
2.3. Angle Scaling
| Feature | Rotation Group | Min | Max |
|---|---|---|---|
| Average pitch deviation (cents) | (pitch stability) | 0 | 1431.7 |
| Average jitter | (pitch stability) | 0 | 0.3278 |
| Std. dev. tempo (BPM) | (rhythm) | 30 | 180 |
| Average shimmer | (dynamics) | 0 | 1.1735 |
| Mean LUFS energy (dB) | (dynamics) | | |
| Std. dev. LUFS energy | (expression) | 1 | 12 |
| Std. dev. MFCC (timbre) | (timbre) | 0 | 0.25 |
| Zero-crossing rate | (clarity) | 0.01 | 0.12 |
| Mean tone-to-noise ratio (TNR, dB) | (clarity) | 5 | 30 |
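
One plausible reading of the Min and Max columns is a clipped min-max scaling of each feature onto a rotation angle. The sketch below assumes a target interval of [0, π] and clipping of out-of-range values; both choices, along with the dictionary keys and the function name `to_angle`, are illustrative assumptions rather than details confirmed by the table. Mean LUFS energy is omitted because the table lists no range for it.

```python
import math

# (min, max) pairs copied from the table; the short keys are hypothetical names.
FEATURE_RANGES = {
    "pitch_deviation_cents": (0.0, 1431.7),
    "jitter": (0.0, 0.3278),
    "tempo_std_bpm": (30.0, 180.0),
    "shimmer": (0.0, 1.1735),
    "lufs_std": (1.0, 12.0),
    "mfcc_std": (0.0, 0.25),
    "zcr": (0.01, 0.12),
    "tnr_db": (5.0, 30.0),
}


def to_angle(name, value, max_angle=math.pi):
    """Clip value to the feature's range, then scale linearly to [0, max_angle].

    The [0, pi] target interval is an assumption, not taken from the source.
    """
    lo, hi = FEATURE_RANGES[name]
    clipped = min(max(value, lo), hi)
    return max_angle * (clipped - lo) / (hi - lo)


# Hypothetical feature values for one recording.
print(to_angle("tnr_db", 17.5))  # midpoint of the 5-30 dB range
print(to_angle("zcr", 1.0))      # above the listed maximum, so it is clipped
```

Clipping keeps every angle inside the target interval even when a recording falls outside the listed range, at the cost of making all out-of-range values indistinguishable from the boundary value.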
2.4. Quantum Circuit Architecture
2.5. Hybrid Neural Network
2.6. Comparison Metrics and Environment
2.7. Ethics and Reproducibility
3. Results and Discussion
3.1. Improvement over Classical Methods
3.2. Student vs. Master Comparison


4. Conclusion



Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).