Submitted: 30 August 2024
Posted: 03 September 2024
Abstract

Keywords:
1. Introduction
1.1. Motivation
1.2. Problem Statement
1.3. A Practical Approach to Harmonic Signal Processing
- A discrete-time version of the signal represented by Equation (1) is first obtained using a convenient sampling frequency ($F_S$), i.e., $x[n] = x(t)\big|_{t = nT_S}$, where $T_S$ represents the sampling period and corresponds to the reciprocal of the sampling frequency ($T_S = 1/F_S$).
- A T-F transformation (using, e.g., the DFT) is computed on a windowed region of the discrete-time signal containing $N$ samples. We assume that $N$ is a power of 2. If the window, which we assume is symmetric, is represented by $h[n]$, this means that, in the case of the DFT, we compute $X[k] = \sum_{n=0}^{N-1} h[n]\, x[n]\, e^{-j\frac{2\pi}{N}kn}$, for $k = 0, 1, \ldots, N-1$.
- Finally, a suitable estimation procedure is used that takes as input the spectral coefficients, $X[k]$, and delivers robust estimates of all the harmonic frequencies, magnitudes, and phases.
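The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the estimation procedure developed in the paper: the function names (`sample_harmonic`, `windowed_dft`, `coarse_estimates`), the sine-phase signal convention, the rectangular window, and the test values are all hypothetical choices made for the example. The harmonic frequencies are deliberately placed on exact DFT bin centres, so that simply reading the peak bins is enough and no interpolation is required.

```python
import cmath
import math


def sample_harmonic(f0, amps, phases, fs, N):
    """Step 1: x[n] = x(n*Ts) for a sum of harmonic sinusoids, with Ts = 1/fs."""
    Ts = 1.0 / fs
    return [
        sum(a * math.sin(2 * math.pi * (l + 1) * f0 * n * Ts + p)
            for l, (a, p) in enumerate(zip(amps, phases)))
        for n in range(N)
    ]


def windowed_dft(x, h):
    """Step 2: direct (O(N^2)) DFT of the windowed frame h[n] * x[n]."""
    N = len(x)
    return [
        sum(h[n] * x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def coarse_estimates(X, fs, count):
    """Step 3 (naive stand-in): read parameters off the `count` largest peaks."""
    N = len(X)
    mags = [abs(c) for c in X]
    peaks = [k for k in range(1, N // 2)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    strongest = sorted(sorted(peaks, key=lambda k: mags[k], reverse=True)[:count])
    results = []
    for k in strongest:
        freq = k * fs / N                      # bin-centre frequency in Hz
        mag = 2.0 * mags[k] / N                # valid for a rectangular window
        ph = cmath.phase(X[k]) + math.pi / 2   # assumed sine-phase convention
        ph = math.atan2(math.sin(ph), math.cos(ph))  # wrap to (-pi, pi]
        results.append((freq, mag, ph))
    return results


fs, N = 8000.0, 256                 # N is a power of 2
f0 = 8 * fs / N                     # 250 Hz, chosen to fall exactly on bin 8
x = sample_harmonic(f0, [1.0, 0.5, 0.25], [0.3, -1.0, 2.0], fs, N)
h = [1.0] * N                       # rectangular window (symmetric)
estimates = coarse_estimates(windowed_dft(x, h), fs, count=3)
for freq, mag, ph in estimates:
    print(f"{freq:7.1f} Hz  magnitude {mag:.3f}  phase {ph:+.3f} rad")
```

When a sinusoid does not fall on a bin centre, or when a non-rectangular window is used, the peak-bin readings in `coarse_estimates` become biased; that is the point at which a robust estimator would replace this naive step.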
1.4. Paper Structure
2. Robust DFT-Based Phase Estimation of Individual Sinusoids
2.1. DFT and the Rectangular Window
2.2. DFT and the Sine and Shifted Hanning Windows
2.3. ODFT and the Rectangular Window
2.4. ODFT and the Sine and Shifted Hanning Windows
2.5. Bias and Variance of the Phase Estimation Error
3. An Interpretable Time-Shift Invariant Harmonic Phase Model
3.1. NRD Estimation Example Based on the Sawtooth Wave
3.2. NRD Estimation Examples Based on the Sawtooth Wave Derivative
3.3. NRD Estimation Examples Using Natural Voice Signals
3.4. Demystifying the Phasegram
4. Conclusion
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
| CRLB | Cramér-Rao Lower Bound |
| DFT | Discrete Fourier Transform |
| ECG | Electrocardiogram |
| FFT | Fast Fourier Transform |
| FM | Frequency Modulation |
| HPM | Harmonic Phase Model |
| MM | Magnitude Model |
| NRD | Normalized Relative Delay |
| ODFT | Odd-frequency Discrete Fourier Transform |
| PM | Phase Model |
| SNR | Signal-to-Noise Ratio |
| T-F | Time-to-Frequency |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).