Preprint
Article

This version is not peer-reviewed.

Individual Passaggio Identification Based on Laryngeal Surface Vibration Ratios Measured by Laser Doppler Vibrometer

Submitted:

12 February 2026

Posted:

13 February 2026


Abstract
Passaggio is a natural physiological phenomenon during vocal register transitions in singing, with its pitch location varying across individuals. Conventional identification methods rely on auditory judgment or voice type classification, which are inaccurate due to individual differences. In this study, a laser Doppler vibrometer (LDV) and an acoustic microphone set were used to synchronously measure laryngeal surface vibration and singing voice in order to systematically investigate singing passaggio behavior. The data indicate a stable fundamental frequency correspondence between the laryngeal vibration signal and the acoustic signal, which supports the use of amplitude ratios of low-order harmonic peaks in the laryngeal vibration spectrum as relative indicators of structural changes in laryngeal vibration. The results show that male and female singers exhibit distinct patterns of structural change in laryngeal vibration during passaggio, while consistent patterns are observed within the same sex. For individuals, clear structural transitions in laryngeal vibration are observed at the pitch of passaggio, providing a basis for accurate identification of individual singing passaggio.
Subject: 
Physical Sciences  -   Acoustics

1. Introduction

critical for singing development. As noted by Reda Elbarougy (2019), the proficiency of singers is measured by how smoothly they cross from one register to another: the smoother the transition, the better the singer[1]. However, identification of the passaggio in vocal register transitions has traditionally relied on teachers’ experience, which lacks objectivity and often results in limited accuracy. Sometimes a singer has tenor vocal folds within a baritone body, which makes it challenging for voice teachers to identify the passaggio of such singers. To address this challenge, there is a clear need for an objective method capable of accurately identifying individual passaggio, providing singers and vocal educators with a reliable and explicit basis for individual passaggio determination.
Previous studies have demonstrated that vocal fold vibration patterns differ across vocal registers[2,3,4,5], and it is widely acknowledged that mixed voice mechanisms play a particularly important role in vocal register transitions within laryngeal functional control[6,7,8]. Research by Matthias Echternach has shown that vocal fold vibration patterns undergo obvious changes during register transitions, suggesting two possible explanations. First, the register transition may be driven by changes in the biomechanical properties of the larynx at the sound source level, involving laryngeal muscular activity. Second, vocal tract resonance may exert a strong effect on the sound source. Echternach emphasized that further investigation is required to determine which of these mechanisms predominates during vocal register transitions[9].
From the two possibilities above, it can be inferred that changes in vocal fold vibration patterns observed during vocal register transitions cannot be attributed to a single mechanism. Both biomechanical adjustments of the larynx at the sound source level and filtering effects arising from changes in vocal tract resonance conditions may coexist and interact during singing. This mechanistic uncertainty poses a challenge for precisely determining the underlying causes of vocal register transitions at the present stage.
However, from an applied acoustics perspective, vocal register transition can be regarded as an event that is observable and locatable across different pitch levels. In other words, even if the dominant physiological mechanisms underlying vocal register transition have not yet been definitively established, objective identification remains feasible if sound source related vibratory or acoustic signals exhibit stable structural changes during the transition. Thus, finding a method to observe the structure of sound source vibration is the first step of this study.
Based on existing studies, observations of the singing sound source have primarily relied on methods such as laryngoscopy, magnetic resonance imaging (MRI), electroglottography (EGG), and acoustic audio recording.
Among these approaches, laryngoscopy is considered the most direct method for investigating the laryngeal sound source. A variety of techniques are currently available, the most widely used being videostroboscopy, videokymography, and high-speed video (HSV) endoscopy[10,11,12,13]. However, these techniques are invasive, and only approximately 90–95% of individuals can tolerate such procedures, implying that the measurement process may interfere with singing behavior. Although previous studies have shown that advanced stroboscopic techniques provide relatively reliable assessments of static vocal fold structures, their reliability in evaluating dynamic features, such as vocal fold vibration amplitude, remains limited.
Compared with laryngoscopy, Magnetic Resonance Imaging (MRI) offers the advantage of non-invasiveness; however, it presents methodological limitations in the dynamic observation of phonation. Conventional MRI has difficulty achieving stable imaging of rapid periodic motions during voice production, as fast movements introduce pronounced motion artifacts and result in image blurring. For high-speed phenomena such as vocal fold vibration, traditional MRI is constrained by imaging speed, making direct capture of dynamic vibratory processes challenging.
Although real-time MRI (rtMRI) has emerged in recent years and has been applied to continuous observation of vocal tract configurations during speech and singing tasks, its implementation typically involves a trade-off among temporal resolution, spatial resolution, and signal-to-noise ratio. Research-oriented rtMRI protocols often achieve higher frame rates at the expense of spatial detail, making them more suitable for capturing macroscopic configurational trends than for localizing sound source–related events that occur on shorter time scales[14]. In addition, while rtMRI has been demonstrated to be feasible for investigating vocal tract dynamics and has led to recommended parameter frameworks[15,16], its acquisition platforms remain predominantly medical in nature. As a result, rtMRI is difficult to translate into a practical tool for routine identification of passaggio under natural singing conditions.
In contrast to imaging-based techniques, electroglottography (EGG) is widely regarded as a non-invasive measurement method closely related to sound source activity[17,18,19]. By recording temporal variations in vocal fold contact area, EGG can reflect differences in glottal contact patterns across phonatory conditions[20]. Consequently, EGG has been extensively used in speech and singing research to investigate vocal register characteristics, modes of glottal closure, and phonatory efficiency.
However, from the perspective of passaggio identification, the application of EGG also presents inherent limitations. During passaggio, the vocal folds and the ventricular folds (false vocal folds) operate in a coupled mode[21], with vocal fold vibration and ventricular fold vibration oscillating at the same frequency but in opposite phases. This out-of-phase oscillation behavior is considered essential for maintaining periodic laryngeal vibration[22]. Consequently, passaggio affects the overall vibratory pattern of the larynx. The single-source observation provided by EGG, which primarily reflects vocal fold contact behavior, is therefore insufficient to capture the integrated laryngeal vibration state involving multiple interacting vibratory structures.
A large body of research on singing behavior has relied on conventional acoustic audio recordings. However, according to Fant’s source–filter theory, the vocal tract strongly influences the acoustic output and consequently affects experimental outcomes[23]. As a result, acoustic signals are more appropriately treated as reference data rather than as objective criteria for passaggio identification.
Based on the considerations outlined above, this study primarily employs a non-contact laser Doppler vibrometer (LDV) to acquire laryngeal vibration signals. This technique has been widely used in engineering fields involving non-contact vibration and acoustic measurements and has also been applied in a few biomedical contexts[24,25]. However, applications of LDV in phonation and singing research remain scarce. A few studies have applied LDV to speech-related measurements[26] and to the assessment of facial vibrations during singing[27], but LDV has not been explicitly employed to investigate laryngeal sound source–related vibrations in singing contexts. Notably, no previous study has systematically compared the laryngeal surface vibration velocity spectrum obtained by LDV with the simultaneously acquired acoustic spectrum during singing. As a result, the relationship between sound source–side vibratory responses and radiated acoustic output has yet to be examined at the level of vibration structure.
Accordingly, this study combines an LDV with a 1/2-inch pre-polarized free-field microphone (B&K Type 4189) to synchronously acquire laryngeal surface vibration signals and singing acoustic signals while singers perform at different pitch levels in an anechoic room. This approach is motivated by the view that vocal register transition arises from changes in laryngeal biomechanical properties at the sound source level, including the involvement of laryngeal musculature[10]. From this perspective, the larynx can be regarded as an integrated vibratory system composed of the vocal folds, cartilaginous structures, and surrounding soft tissues, whose vibration reflects structural characteristics of the sound source. Thus, this study does not aim to distinguish specific physiological driving mechanisms underlying passaggio. Instead, passaggio is treated as an event reflected as a transition in vibratory patterns along ascending pitch, and identification of the passaggio is achieved through analysis of structural characteristics in laryngeal surface vibration.
Meanwhile, from an experimental perspective, LDV-based measurement of laryngeal surface vibration offers clear advantages: it is non-invasive and non-contact, and it does not rely on complex medical imaging platforms or impose constraints on phonatory posture. Participants are only required to perform brief singing tasks in a natural standing position to obtain continuous and stable vibratory and acoustic data. This high level of operational feasibility indicates strong potential for broader application, particularly in passaggio research oriented toward practical singing and vocal pedagogy.
At the methodological level, this study addresses several existing research gaps. First, laryngeal surface vibration signals and singing acoustic signals are synchronously acquired under singing conditions for the first time. Second, focusing on laryngeal surface vibration, this study proposes a research framework based on vibration structural characteristics. Third, by comparing laryngeal surface vibration features across sex and voice type groups, this study provides new experimental evidence for understanding passaggio-related vibratory patterns in male and female phonatory systems. Overall, these methodological designs establish a passaggio identification approach that is operationally feasible under natural singing conditions and supports cross-individual comparison.

2. Materials and Methods

2.1. Experimental Setup

A schematic of the experimental setup is shown in Figure 1. The input devices included a laser Doppler vibrometer (Polytec PSV-500 scanning head), a vibrometer front-end (Polytec PSV-500 FRONT-END), a vibrometer data management system (Polytec DMS), and a 1/2-inch pre-polarized free-field microphone (B&K Type 4189). The voice signal and the laryngeal vibration signal were transmitted to an input module (B&K Type 3050-A-060), which provided sound pressure and velocity data that were subsequently recorded and analyzed using a computer equipped with B&K Connect software.
The voice signal was recorded using a 1/2-inch pre-polarized free-field microphone at a sampling rate of 47 kHz. Prior to the experiment, the microphone was calibrated using a standard sound source at 1 kHz and 94 dB to ensure the accuracy and consistency of sound pressure measurements. The microphone was positioned 1.5 m from the participant at a height of 1.7 m to maintain free-field conditions and to minimize the influence of near-field effects and turbulence noise on the voice signal.
The laryngeal vibration signal was recorded using a laser Doppler vibrometer (LDV). The LDV is an optical sensor that directs a laser beam onto a vibrating surface and measures the frequency shift of the laser light reflected from the surface. Based on the Doppler effect, the LDV determines the vibration velocity and displacement at arbitrary points on the surface without physically contacting or disturbing the vibrating object.
The LDV scanning head was positioned 2 m from the participant, with the velocity measurement range set to 100 mm/s. Preliminary pilot experiments confirmed that, within this range, the measured velocity amplitudes were consistent with the actual vibration velocities of the skin surface. A single-point measurement mode was employed, with the laser beam directed onto the external skin surface over the thyroid cartilage of the larynx. To satisfy the surface reflectivity requirements of the LDV, a soft reflective marker with a diameter of approximately 2 cm was attached to the skin over the thyroid cartilage and designated as the laser measurement area. Moreover, the soft reflective marker deforms with the skin, thereby reducing resonance noise that could arise from normal marker vibration and improving the signal-to-noise ratio of the laryngeal vibration measurements.
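The Doppler relation behind this measurement can be written out as a short worked equation. This is the standard textbook form for a beam reflected back along its own path, not a description of the instrument's internal signal processing. The symbols $f_D$, $v$, and $\lambda$ are introduced here for illustration, and the wavelength of 633 nm is an assumption (a helium-neon source) that the text does not state.

```latex
% f_D     : Doppler frequency shift of the reflected light
% v       : surface velocity component along the laser beam
% \lambda : laser wavelength (assumed to be 633 nm; not given in the text)
f_D = \frac{2v}{\lambda}
\qquad\Rightarrow\qquad
f_D \Big|_{v = 100\ \mathrm{mm/s}}
  = \frac{2 \times 0.1\ \mathrm{m/s}}{633 \times 10^{-9}\ \mathrm{m}}
  \approx 316\ \mathrm{kHz}
```

Under that assumed wavelength, the upper end of the 100 mm/s velocity range would correspond to a frequency shift of roughly 316 kHz.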

2.2. Participants

Table 1 summarizes the characteristics of the 20 participants recruited in this study. Participants were evenly distributed across four voice types, with five participants in each group: baritone, tenor, mezzo-soprano, and soprano. Although the overall age of the cohort was relatively young (mean ± SD: 24.55 ± 4.74 years), each voice-type group included singers in their late twenties to mid-thirties, ensuring that all four voice categories were represented by physiologically mature singers. All participants were professionally trained singers with a minimum of three years of formal vocal training (mean ± SD: 8.05 ± 4.79 years). This training background ensured that all participants could sing at stable pitch targets and accomplish the experimental tasks, thereby enhancing the reliability of the measured laryngeal vibration and acoustic signals.
All participants reported good general health prior to the experiment, with no known laryngeal diseases or voice disorders, and no episodes of acute upper respiratory tract infection within the preceding three months, in order to exclude potential pathological influences on the data. In addition, participants were instructed to avoid strenuous voice use on the day prior to the experiment to ensure that phonation during testing was conducted under natural and physiologically stable conditions.

2.3. Experiment Design

To minimize the influence of vowel-related differences on the acoustic signal, all participants were instructed to phonate using the vowel /a/ throughout the experiment. Pitch targets were provided by an electronic pitch reference, and pitch settings followed twelve-tone equal temperament (a1 = 440 Hz). Singing tasks began near the lowest controllable pitch a (a = 220 Hz) and proceeded upward in semitone steps until each voice type reached its naturally producible upper limit.
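The pitch targets implied by this tuning can be tabulated with a few lines of Python. This is only an illustrative sketch: the function name `semitone_frequency` and the list `targets` are hypothetical, and the number of steps listed is arbitrary rather than the range actually sung by any participant.

```python
A1_HZ = 440.0  # reference pitch a1 = 440 Hz, as stated in the text


def semitone_frequency(steps_from_a1: int) -> float:
    """Twelve-tone equal temperament: each semitone is a factor of 2**(1/12)."""
    return A1_HZ * 2.0 ** (steps_from_a1 / 12.0)


# The task starts at a (220 Hz), i.e. 12 semitones below a1, and rises by semitones.
# Listing 15 steps is an arbitrary choice made for this illustration.
targets = [round(semitone_frequency(-12 + k), 2) for k in range(15)]
print(targets[0], targets[12], targets[14])
```

The three printed values are the nominal frequencies of a, a1, and b1; sung fundamental frequencies fluctuate around such targets.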
To ensure data comparability across participants and voice types, pitch levels at High C and above were excluded from subsequent analyses, as instability was observed in this register for some participants. During actual singing, the fundamental frequency exhibited natural fluctuations around the target pitch and did not strictly match the theoretical frequency value at every moment. Therefore, in the analysis, data were grouped according to pitch-class labels (e.g., a–b²) rather than compared on the basis of instantaneous frequency values. Each phonation lasted approximately 2–3 s to ensure the availability of sufficiently long and stable vibration segments for subsequent analysis. Throughout the experiment, participants maintained a natural standing posture with minimal head movement to ensure stability of the laser measurement point.
Prior to formal data collection, all participants performed brief trial phonations to verify stable laser reflection at the measurement point and proper acoustic signal acquisition, and to familiarize themselves with the experimental pacing and phonation requirements. The entire experiment was conducted using standardized instructions and a fixed experimental procedure to reduce inter-individual variability and enhance data consistency.

2.4. Data Processing

The laryngeal vibration velocity signals obtained from the laser Doppler vibrometer and the sound pressure signals recorded by the microphone were first imported into analysis software (B&K Connect). For each pitch level, stable phonation segments were selected and preprocessed. A Hanning window was applied to each segment prior to performing a fast Fourier transform (FFT) to obtain the vibration velocity spectrum and the sound pressure spectrum.
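The windowing and transform step can be sketched as follows. This is not the processing chain of B&K Connect, whose settings are not given in the text. It is a dependency-free illustration in which a direct discrete Fourier transform stands in for an FFT (the two give the same values), and the function names, sampling rate, segment length, and synthetic test signal are all assumptions.

```python
import cmath
import math


def hann_window(n: int) -> list[float]:
    """Periodic Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def amplitude_spectrum(signal: list[float], fs: float, f_max: float):
    """Hann-windowed DFT amplitudes from 0 Hz up to f_max.

    Returns (freqs, amps). Amplitudes are scaled so that a sine of
    amplitude A centred on a bin shows a peak of about A.
    """
    n = len(signal)
    window = hann_window(n)
    windowed = [s * w for s, w in zip(signal, window)]
    scale = 2.0 / sum(window)  # undo the amplitude loss caused by the window
    k_max = min(n // 2, int(f_max * n / fs))
    freqs, amps = [], []
    for k in range(k_max + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        freqs.append(k * fs / n)
        amps.append(abs(acc) * scale)
    return freqs, amps


# Synthetic stand-in for a stable segment: f0 = 440 Hz plus a weaker 2*f0 component.
fs, n = 8000.0, 800
segment = [math.sin(2 * math.pi * 440 * i / fs)
           + 0.25 * math.sin(2 * math.pi * 880 * i / fs)
           for i in range(n)]
freqs, amps = amplitude_spectrum(segment, fs, f_max=1000.0)
peak_hz = freqs[amps.index(max(amps))]
```

With real recordings an FFT routine would normally replace the explicit loop; the loop is used here only to keep the sketch self-contained.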
To improve comparability across different pitch levels and across participants, 1/24-octave analysis using standard-compliant digital filters (ISO 266; ANSI S1.11) was applied to the amplitude spectra of both signals. This smoothing preserves harmonic structural details while effectively suppressing minor frequency jitter, inter-individual phonatory variability, and narrowband noise. As a result, the smoothed spectra exhibit clearer trends, providing a stable spectral basis for subsequent analyses of fundamental frequency and harmonic components.
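The idea of fractional-octave analysis can be illustrated by grouping spectral lines into 1/24-octave bands. The sketch below is an approximation only: it sums the energy of narrowband lines per band and does not implement the standard-compliant digital filters used in the study. The function name, the 1000 Hz reference, and the simplified band centres at 1000 × 2^(k/24) Hz are assumptions.

```python
import math


def fractional_octave_bands(freqs, amps, fraction=24, f_ref=1000.0):
    """Group spectral lines into 1/fraction-octave bands.

    Each line is assigned to the band whose centre is nearest on a
    logarithmic frequency axis; amplitudes within a band are combined
    as the square root of the summed squares (energy sum).
    Returns a list of (band_centre_hz, band_amplitude), ascending.
    """
    power_by_band = {}
    for f, a in zip(freqs, amps):
        if f <= 0.0:
            continue  # the logarithm is undefined at 0 Hz
        k = round(fraction * math.log2(f / f_ref))  # nearest band index
        power_by_band[k] = power_by_band.get(k, 0.0) + a * a
    return [(f_ref * 2.0 ** (k / fraction), math.sqrt(p))
            for k, p in sorted(power_by_band.items())]


# Two neighbouring lines near 440 Hz fall into one band; 880 Hz gets its own band.
bands = fractional_octave_bands([440.0, 441.0, 880.0], [3.0, 4.0, 1.0])
```

In this toy input the two lines near 440 Hz merge into a single band value, which is how small frequency jitter around a harmonic is absorbed.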

3. Results

3.1. Laryngeal Vibratory Spectrum

Figure 2 shows laryngeal vibration spectra obtained by Fourier transform (examples shown for male and female participants). A series of relatively regular vibration velocity peaks can be observed, forming a comb-like distribution along the frequency axis. This pattern consists of a prominent dominant peak accompanied by several weaker peaks exhibiting harmonic-like relationships. The resulting laryngeal vibratory frequencies display a spectral appearance that resembles the harmonic structure of a periodic signal. However, they differ from typical acoustic harmonics in terms of amplitude distribution, the number of observable peaks, and the rate of high-frequency energy decay, with spectral energy primarily concentrated in the fundamental component and its second-order multiple.
For descriptive purposes, the prominent peaks in the laryngeal surface vibration velocity spectrum are labeled sequentially as L0, L1, and higher orders, where L0 corresponds to the first major spectral peak and L1 to the second. The frequencies of L0 and L1 show an integer-multiple relationship, whereas the energy associated with L2 and higher-frequency components is sharply reduced. This peak pattern is stable across participants, with similar comb-like arrangements observed in all spectra.
Because the amplitudes of L2 and higher-order components were extremely low and close to the noise floor in most samples, particularly during high-pitch phonation, subsequent analyses focused primarily on the two stable and structurally salient components, L0 and L1. In addition, 1/24 octave smoothing was applied to suppress narrowband noise and local spectral spikes arising from spectral discretization, thereby enhancing the clarity of trends in band-energy distribution.
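A minimal sketch of how L0, L1, and their ratio could be read from a spectrum is given below. The peak-picking rule (largest value within ±3% of the expected frequency), the function names, and the toy spectrum are assumptions made for illustration; the text does not specify how peaks were located.

```python
def peak_near(freqs, amps, target_hz, rel_tol=0.03):
    """Largest amplitude within ±rel_tol (relative) of target_hz.

    Returns (frequency, amplitude) of that spectral line.
    """
    candidates = [(a, f) for f, a in zip(freqs, amps)
                  if abs(f - target_hz) <= rel_tol * target_hz]
    if not candidates:
        raise ValueError(f"no spectral line near {target_hz} Hz")
    amp, freq = max(candidates)
    return freq, amp


def l0_l1_ratio(freqs, amps, f0_hz):
    """Velocity-amplitude ratio of the peak near f0 (L0) to the peak near 2*f0 (L1)."""
    _, l0 = peak_near(freqs, amps, f0_hz)
    _, l1 = peak_near(freqs, amps, 2.0 * f0_hz)
    return l0 / l1


# Toy velocity spectrum (arbitrary units) for a tone sung at a1 = 440 Hz.
toy_freqs = [430.0, 440.0, 450.0, 870.0, 880.0, 890.0]
toy_amps = [0.1, 2.0, 0.2, 0.05, 0.5, 0.04]
ratio = l0_l1_ratio(toy_freqs, toy_amps, f0_hz=440.0)
```

Because the result is a ratio of two amplitudes from the same spectrum, any common scale factor in the velocity signal cancels.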

3.2. Relationship between the Laryngeal Vibratory Spectrum and the Sound Pressure Spectrum

Figures 3 and 4 present two examples from two voice types: soprano and tenor (participants S2 and T2). The figures show a spectral comparison between the laryngeal vibratory spectrum and the sound pressure spectrum obtained while the two participants phonated at pitch a1. The two dominant peaks observed in the laryngeal vibratory spectrum (L0, L1) consistently correspond to the principal harmonic components in the acoustic spectrum (f0, 2f0), indicating a stable synchronous relationship between laryngeal surface vibration and acoustic output.
Moreover, the integer-multiple peak structure observed in the laryngeal vibratory spectrum suggests that the vibration signal primarily reflects the quasi-periodic vibratory behavior of the vocal folds. In contrast to the acoustic spectrum, which is shaped by vocal tract resonance, the laryngeal vibratory spectrum does not exhibit resonance-related energy enhancement bands or a resonant spectral envelope. Therefore, the laryngeal surface vibration signal can be regarded as a representation of vocal fold vibration and associated laryngeal phonatory function during singing, obtained under conditions that are less affected by vocal tract resonance filtering.

3.3. Variation of the L0/L1 Ratio with Pitch: Sex and Voice Type Related Patterns

Based on the above analyses, laryngeal vibration signals can be regarded as relevant indicators of laryngeal function during singing. A systematic examination of the spectrum data across voice types and pitch revealed that the relative relationships of different spectral components in the laryngeal vibratory spectrum do not remain constant as pitch increases. In particular, the velocity ratio between the fundamental vibratory component L0 and its adjacent component L1 exhibited consistent pitch-dependent trends within each sex and within each voice type. Motivated by this observation, the present study further investigated differences in the pitch-related variation of the L0/L1 ratio across sexes and voice types.
To characterize the overall trajectory of L0/L1 ratio changes as a function of pitch, individual ratio curves were summarized at the group level using group means. This approach was intended to extract common trends shared across participants rather than to emphasize absolute values from individual singers. Accordingly, group means were used as descriptive references to illustrate the general pattern of variation.
Inter-individual dispersion was quantified using the standard deviation (SD), reflecting the range of variability across participants rather than uncertainty in population parameter estimation. To avoid instability of mean estimates or potentially misleading visual representations due to limited sample size, group means and standard deviations were calculated only for pitch levels that included valid data from at least three participants (n ≥ 3). Pitch levels with fewer than three valid observations were excluded from group-level statistics, and no interpolation or compensation was applied.
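The group-level summary with the n ≥ 3 rule can be sketched as below. The data layout, the function name, and the singer labels are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def group_summary(curves, min_n=3):
    """Summarize L0/L1 ratios per pitch across singers.

    curves: {singer_id: {pitch_label: ratio}}
    Returns {pitch_label: (mean, sd, n)} only for pitches with at least
    min_n valid observations; other pitches are dropped, not interpolated.
    """
    by_pitch = {}
    for curve in curves.values():
        for pitch, ratio in curve.items():
            by_pitch.setdefault(pitch, []).append(ratio)
    return {pitch: (mean(values), stdev(values), len(values))
            for pitch, values in by_pitch.items()
            if len(values) >= min_n}


# Made-up ratios for three singers; only a1 has three valid observations.
example = {
    "singer_1": {"a1": 2.0, "b1": 4.0},
    "singer_2": {"a1": 4.0, "b1": 6.0},
    "singer_3": {"a1": 6.0},
}
summary = group_summary(example)
```

Here b1 is excluded from the summary because only two singers contributed a value at that pitch.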

3.3.1. Sex-Specific Criteria for Passaggio Identification

An initial inspection of L0/L1 ratio variation with pitch revealed two distinct patterns across participants. These differences were primarily reflected in the direction and timing of ratio changes rather than in absolute ratio values. Accordingly, participants were analyzed by sex to assess whether passaggio identification criteria differ systematically between females and males.
Sex-averaged L0/L1 ratio values show clear and consistent differences between female and male singers (Figure 5). In the female group (soprano and mezzo-soprano), the ratio generally increases in the middle pitch range (downward crossing at mean value 8.666 before pitch g1), reaches a peak in the high pitch range, and then decreases as pitch continues to rise, especially around the passaggio pitches. In contrast, the male group (tenor and baritone) shows relatively little change in the middle pitch range, followed by a clear increase around the male passaggio pitches (upward crossing at mean value 3.940 after pitch e2). Thus, the passaggio pattern in males is roughly opposite to that in females.
Based on these contrasting passaggio patterns, the sex-averaged mean L0/L1 ratio was used as a reference value to define an operational passaggio criterion, rather than applying a fixed absolute threshold. This approach focuses on relative changes within each sex and reduces the influence of baseline differences between female and male singers.
For female singers, the passaggio is identified at the pitch where the L0/L1 ratio begins to decrease after reaching its peak and crosses downward relative to the sex mean value. For male singers, the passaggio is identified at the pitch where the ratio shifts from a relatively stable state to a clear increase and crosses upward relative to the mean value. Accordingly, passaggio identification is associated with a downward crossing in females and an upward crossing in males.
Overall, although passaggio location in both sexes can be determined using the crossing behavior of the L0/L1 ratio, the direction and timing of this crossing differ systematically between females and males. This sex-specific difference provides a basis for further refinement of passaggio regions in different voice types.
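The sex-specific crossing rule described in this subsection can be expressed as a short routine. This is one possible reading of the criterion rather than the authors' implementation: the function name, the argument names, and the example curves are all made up for illustration.

```python
from statistics import mean


def sex_specific_crossing(pitches, ratios, reference, sex):
    """Return the adjacent pitch pair between which the ratio curve crosses `reference`.

    female: first downward crossing found after the curve's peak
    male:   first upward crossing
    Returns (lower_pitch, upper_pitch), or None if no such crossing exists.
    """
    if sex == "female":
        start = ratios.index(max(ratios))  # search only after the peak
    elif sex == "male":
        start = 0
    else:
        raise ValueError("sex must be 'female' or 'male'")
    for i in range(start, len(ratios) - 1):
        before, after = ratios[i], ratios[i + 1]
        if sex == "female" and before >= reference > after:
            return pitches[i], pitches[i + 1]
        if sex == "male" and before <= reference < after:
            return pitches[i], pitches[i + 1]
    return None


# Made-up curves, not measured data; the reference is the mean of each curve.
pitches = ["a1", "b1", "c2", "d2", "e2", "f2"]
female_curve = [6.0, 9.0, 12.0, 10.0, 7.0, 5.0]
male_curve = [3.0, 3.2, 3.1, 3.5, 4.5, 6.0]
female_hit = sex_specific_crossing(pitches, female_curve, mean(female_curve), "female")
male_hit = sex_specific_crossing(pitches, male_curve, mean(male_curve), "male")
```

Note that the upward crossing early in the female example is ignored, because the search for a downward crossing begins at the peak.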

3.3.2. Voice-Type Passaggio Identification Based on Ratio Curves

After establishing the sex-based criterion for passaggio identification, its applicability was further examined across different voice types.
In the female group, both soprano and mezzo-soprano singers showed a similar overall pattern in the L0/L1 ratio, characterized by an initial increase followed by a decrease. In both voice types, passaggio-related events occurred in the pitch range where the ratio curve crossed below the mean line during the descending phase (Figure 6). Although the absolute ratio values and peak magnitudes differed between sopranos and mezzo-sopranos, the general curve shape and the timing of the crossing were consistent. Notably, the crossing region for sopranos was shifted to higher pitches compared with mezzo-sopranos.
In the male group, tenor and baritone singers also exhibited similar patterns. The L0/L1 ratio remained relatively stable in the middle pitch range (typically below the mean line) and increased in the higher pitch range. In both voice types, passaggio-related events were associated with pitch regions where the ratio curve crossed above the mean line (Figure 6). Compared with baritones, this crossing occurred at higher pitches in tenors.
Overall, voice type did not introduce a different passaggio identification criterion. Instead, within the same sex-specific criterion, differences between voice types were reflected mainly as shifts in the pitch range at which the passaggio-related crossing occurred. In other words, sex determined the direction of the crossing relative to the mean line, while voice type influenced where this crossing appeared along the pitch axis.

3.3.3. Individual Passaggio Identification

The previous analyses demonstrated stable patterns in the L0/L1 ratio as a function of pitch at the group level and across sexes. However, passaggio is not only a statistical feature observed in group averages, but a functional event occurring during individual phonatory control. Therefore, it is necessary to further examine the L0/L1 ratio at the individual level to clarify how passaggio can be identified.
Figure 9 presents the L0/L1 ratio curves of one representative singer from each voice type to illustrate how the proposed criterion is applied at the individual level. Final passaggio identification was consistently based on the crossing relationship between the ratio curve and the individual mean line, together with the stability of the ratio behavior following the crossing. By introducing an individual-relative reference and clearly defining the analysis range, this study proposes a non-invasive, repeatable, and robust method for passaggio identification that does not rely on vocal tract resonance characteristics and does not require redefining traditional passaggio pitch ranges.
For individuals, the L0/L1 ratio curves differed substantially across singers in terms of values, peak magnitudes, and local fluctuations. These differences likely reflect individual variability in laryngeal anatomy, neuromuscular control strategies, and habitual singing techniques, indicating that individual passaggio pitch range cannot be defined using a fixed absolute threshold or a single pitch location. To reduce the influence of inter-individual amplitude differences, an individual mean L0/L1 value was introduced as a relative reference baseline for each singer (indicated by dashed lines in Figure 7), representing the overall ratio level across the tested pitch range.
Based on this reference, the passaggio was operationally defined as the range spanned by the two adjacent pitches between which the L0/L1 ratio curve crossed the individual mean line. This definition focuses on a change in the regulation state of the ratio rather than on instantaneous extrema or local peaks, and it is suitable for the discrete pitch sampling used in the present experiment.
In some individuals, fluctuations or crossings were also observed in the middle pitch range. However, these events varied considerably across singers in both location and shape, often appearing as multiple brief crossings or irregular oscillations with no consistent direction. From the curve morphology, such mid-range variations are more likely associated with gradual adjustments around the first register transition, rather than a single, well-defined functional shift. Because the present passaggio criterion requires a clear change in trend direction and a stable crossing relative to the individual mean line, these irregular mid-range fluctuations did not meet the stability requirements of the operational definition. Accordingly, although the mid-range may contain information related to vocal adjustment, it was not treated as the primary focus of the present analysis. Instead, the analysis was restricted to the higher pitch range (above b1, 493.88 Hz), where L0/L1 ratio changes were more concentrated, crossings typically occurred only once, and sex-specific patterns were stable and reproducible across individuals.
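The individual-level procedure (individual mean as baseline, analysis restricted to pitches above b1, and a crossing that remains stable afterwards) can be sketched as follows. The stability test used here, that every later point stays on the same side of the mean line, is an assumed formalization of the requirement described above. The function name, the data layout, and the example curve are hypothetical.

```python
from statistics import mean

B1_STEPS = 2  # b1 lies two semitones above a1 in equal temperament


def individual_passaggio(curve):
    """Locate the passaggio range for one singer.

    curve: list of (pitch_label, semitones_from_a1, ratio), ascending in pitch.
    The baseline is the singer's mean ratio over all tested pitches; only
    pitches above b1 are searched. A crossing counts only if every later
    point stays on the same side of the baseline (assumed stability rule).
    Returns (lower_pitch, upper_pitch) or None.
    """
    baseline = mean([ratio for _, _, ratio in curve])
    high = [(label, ratio - baseline)
            for label, steps, ratio in curve if steps > B1_STEPS]
    for i in range(len(high) - 1):
        (low_label, d_low), (up_label, d_up) = high[i], high[i + 1]
        crossed = d_low * d_up < 0
        stable = all(d * d_up > 0 for _, d in high[i + 1:])
        if crossed and stable:
            return low_label, up_label
    return None


# Made-up curve with a brief mid-range crossing (a1 to a#1) and one
# stable upward crossing in the higher range.
singer = [("a1", 0, 3.0), ("a#1", 1, 5.0), ("b1", 2, 3.0), ("c2", 3, 2.0),
          ("c#2", 4, 2.5), ("d2", 5, 3.0), ("d#2", 6, 3.5), ("e2", 7, 6.0),
          ("f2", 8, 7.0)]
passaggio_range = individual_passaggio(singer)
```

In this example the mid-range crossing is never examined because it lies at or below b1, so only the single crossing in the higher range is reported.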

4. Discussion

4.1. Considerations on LDV-Based Observation of Laryngeal Vibration

In past studies of passaggio, acoustic signals have been widely used because they are easy to acquire. However, acoustic output is inherently the result of both vocal fold vibration and vocal tract resonance filtering. In the pitch range where passaggio occurs, the vocal tract often undergoes noticeable adjustments, which partially alter the sound source. In contrast, laryngeal vibration signals recorded by LDV primarily reflect the vibratory response of the larynx driven by quasi-periodic vocal fold vibration and are less directly influenced by vocal tract resonance effects.
In this study, synchronous measurements using a laser Doppler vibrometer and a high-precision microphone confirmed a stable correspondence between laryngeal vibration signals and acoustic signals at the fundamental frequency. This sound source measurement strategy provides complementary evidence for changes in vocal fold vibration patterns during passaggio.

4.2. The Robustness of the L0/L1 Ratio

In the analysis of the laryngeal vibration spectrum, this study did not attempt to assign individual spectral peaks to specific anatomical structures or single physiological mechanisms. Instead, from the perspective of signal stability and repeatability, the analysis focused on the most consistently observed low-order harmonic peaks in the spectrum, which were labeled as L0 and L1. These two peaks could be reliably identified across different individuals and pitch conditions, making them suitable reference points for describing laryngeal vibration characteristics.
Rather than examining the absolute amplitude of a single spectral peak, the use of the L0/L1 amplitude ratio helps reduce the influence of measurement conditions and individual physiological differences. This ratio-based approach emphasizes how vibratory energy is distributed across frequency components, rather than relying on absolute signal magnitude, which can vary substantially between singers.
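As a minimal sketch of why a ratio is insensitive to overall signal level, the example below computes the amplitude ratio of two spectral peaks for a synthetic signal and for the same signal multiplied by an arbitrary gain. The peak frequencies passed in (440 Hz and 880 Hz), the amplitudes, the gain, and the function names are illustrative assumptions; they are not the measured L0 and L1 peaks.

```python
import math

def tone_amplitude(signal, fs, freq):
    """Amplitude of the sinusoidal component at `freq` (Hz)."""
    n = len(signal)
    w = 2.0 * math.pi * freq / fs
    re = sum(x * math.cos(w * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(w * i) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def peak_ratio(signal, fs, f_l0, f_l1):
    """Ratio of the amplitudes at two peak frequencies. `f_l0` and `f_l1`
    are supplied by the caller; their values here are hypothetical."""
    return tone_amplitude(signal, fs, f_l0) / tone_amplitude(signal, fs, f_l1)

# Synthetic vibration signal with two components (not real data).
fs, n = 8000, 800
signal = [0.8 * math.sin(2 * math.pi * 440.0 * i / fs)
          + 0.4 * math.sin(2 * math.pi * 880.0 * i / fs) for i in range(n)]

# Same signal under a different overall gain, standing in for a change in
# measurement conditions such as laser reflectivity or loudness.
gain = 3.5
scaled = [gain * x for x in signal]

ratio = peak_ratio(signal, fs, 440.0, 880.0)
ratio_scaled = peak_ratio(scaled, fs, 440.0, 880.0)
print(round(ratio, 6), round(ratio_scaled, 6))
```

The absolute amplitudes of the two peaks change by the gain factor, while their ratio does not, which is the property the ratio-based description relies on.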
A key advantage of the ratio-based approach lies in its robustness to individual variability. Because singers differ markedly in anatomy, neuromuscular control strategies, and habitual vocal behaviors, a single group-level threshold or absolute numerical criterion is often insufficient for consistent passaggio identification across individuals. By introducing the individual mean L0/L1 value as a relative reference baseline, the passaggio was operationally defined as a structural crossing event in which the L0/L1 ratio curve crosses its own mean value. This definition shifts the focus of identification from absolute values to changes in regulatory trends, thereby reducing the impact of inter-individual amplitude differences and improving the repeatability of passaggio identification across subjects.
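The operational definition can be sketched as a simple crossing detector. This is an illustration under stated assumptions, not the exact procedure used in the study: the baseline is taken as the mean over all measured pitches, "stable" is approximated by a hypothetical `hold` parameter (the curve must stay on the new side of the mean for that many consecutive points), the search is limited to pitches above 493.88 Hz, and the ratio values are invented for the example.

```python
from statistics import mean

def find_stable_crossing(pitches_hz, ratios, min_pitch_hz=493.88, hold=2):
    """Return (pitch, direction) for the first stable crossing of the
    individual mean above `min_pitch_hz`, or None if there is none.
    `hold` is a hypothetical stability requirement: the curve must remain
    on the new side of the mean for `hold` consecutive points."""
    baseline = mean(ratios)                    # individual mean line
    above = [r > baseline for r in ratios]     # side of the mean per pitch
    for i in range(1, len(ratios)):
        if pitches_hz[i] <= min_pitch_hz:
            continue                           # ignore mid-range events
        if above[i] != above[i - 1]:           # curve changed sides here
            window = above[i:i + hold]
            if len(window) == hold and all(s == above[i] for s in window):
                direction = "upward" if above[i] else "downward"
                return pitches_hz[i], direction
    return None

# Hypothetical ratio curve for one singer (values invented for illustration).
pitches = [392.00, 440.00, 493.88, 523.25, 587.33, 659.25, 698.46]
ratios = [1.20, 0.90, 1.30, 1.25, 0.70, 0.60, 0.50]

result = find_stable_crossing(pitches, ratios)
print(result)
```

In this invented curve the brief dip at 440 Hz crosses the mean line but lies in the mid-range, so it is skipped; only the sustained crossing in the higher range is reported, together with its direction.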

4.3. Considerations on Sex Differences in the L0/L1 Ratio

The sex-related differences observed in the L0/L1 ratio indicate that passaggio behavior is not expressed solely as a pitch-related phenomenon. The opposite directions of L0/L1 ratio change in male and female singers suggest that, although passaggio may appear as a similar register transition, the underlying organization of vibratory energy does not follow the same pattern in both sexes. Because the L0/L1 ratio is a relative measure and does not depend on absolute amplitude, these differences cannot be attributed simply to pitch elevation or changes in singing intensity. Instead, they are more likely associated with a reorganization of energy distribution at the sound source.
From another perspective, however, these sex-related differences do not necessarily imply that male and female singers adopt fundamentally different regulatory strategies during passaggio. Under comparable singing tasks, both sexes may follow similar functional control principles. Nevertheless, because female singers typically operate at higher fundamental frequency ranges, the laryngeal tissues, acting as a mechanical transmission medium, may exhibit different transmission and response characteristics for higher-frequency vibratory components. Under such conditions, even if the regulatory strategy is similar at the functional level, the relative distribution of vibratory energy across low-order frequency components at the sound source may be shaped by biomechanical transmission properties, resulting in the observed structural differences in the L0/L1 ratio.

4.4. Considerations on Mid-Range Fluctuations in the L0/L1 Ratio

Inspection of individual L0/L1 ratio curves shows that, in some singers, local fluctuations or brief crossings may also occur in the mid-range. However, these variations typically appear as repeated oscillations rather than as a single, clearly defined regulatory transition. Their structural patterns lack the concentration and directional clarity observed in the higher pitch range.
From the perspective of vocal practice and phonatory control, the mid-range is more likely associated with gradual adjustments occurring near the traditional first register transition, rather than with a single, functionally focused shift in vocal control. Such adjustments tend to unfold progressively and may involve multiple small changes rather than a distinct transition event.
Because the passaggio criterion adopted in the present study emphasizes a clear change in trend direction and a stable crossing relative to the individual mean ratio, these irregular mid-range fluctuations do not meet the requirements of the operational definition of a passaggio event. Accordingly, although the mid-range may contain information relevant to vocal adjustment, it was not treated as the primary focus of the present analysis. Instead, the analysis was restricted to the higher pitch range, where ratio changes were more concentrated, structurally clearer, and more consistent with the proposed passaggio identification criterion.

4.5. Methodological Significance, Scope, and Limitations

It should be emphasized that this study does not aim to redefine the specific pitch ranges of passaggio as described in traditional vocal pedagogy, nor does it attempt to assign direct anatomical interpretations to individual laryngeal vibration peaks. The primary contribution of this work is an operational, ratio-based approach to identifying passaggio for individual singers. This approach provides an objective method for passaggio identification that does not rely on vocal tract resonance characteristics and remains stable under conditions of substantial inter-individual variability.
Within this framework, the study focuses on identifying passaggio as a structured transition in vibratory regulation rather than as a fixed pitch location or an absolute numerical threshold. By emphasizing relative changes within individual vibration patterns, the proposed method offers a repeatable and non-invasive approach to passaggio identification that complements existing acoustic and physiological approaches.
Future studies may build on this work by further investigating the origins of mid-range L0/L1 fluctuations, or by integrating additional physiological measurement techniques to examine the relationship between changes in laryngeal vibration patterns and specific muscular adjustments. Such extensions may help clarify how the observed vibratory transitions relate to underlying phonatory control mechanisms while maintaining the advantages of non-invasive measurement.

5. Conclusions

In this study, laryngeal surface vibration signals and acoustic signals during singing were synchronously measured using a laser Doppler vibrometer (LDV) and a high-precision acoustic system (B&K). After establishing a stable correspondence between the two signals at the fundamental frequency level, the amplitude ratio of low-order harmonic peaks in the laryngeal vibration spectrum (L0/L1) was introduced as a relative descriptor of vibratory behavior.
The results demonstrate that the L0/L1 ratio exhibits a stable and directionally consistent structural transition in the higher pitch range. The crossing of the L0/L1 ratio relative to an individual mean value can be used as an operational criterion for identifying passaggio-related events. Because this ratio-based approach relies on relative changes rather than absolute amplitude values, it shows good robustness to inter-individual variability.
Overall, the proposed method provides a non-invasive and repeatable approach for passaggio identification from a sound source perspective. By focusing on laryngeal vibration patterns rather than vocal tract resonance characteristics, this framework offers a complementary tool for objective analysis of register transitions in singing and may support future applications in vocal research and pedagogy.

Author Contributions

Conceptualization, Haozhen Wen; methodology, Haozhen Wen; validation, Haozhen Wen; formal analysis, Haozhen Wen; investigation, Haozhen Wen; resources, Haozhen Wen; data curation, Haozhen Wen; writing—original draft preparation, Haozhen Wen; writing—review and editing, Haozhen Wen; visualization, Haozhen Wen; project administration, Haozhen Wen; funding acquisition, Haozhen Wen. The author has read and agreed to the published version of the manuscript.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board (Ethics Committee) of Shanxi University (protocol code SXULL2025162).

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Elbarougy, R. Acoustic Analysis for Chest-to-Head Register Transition in Singing Voice. IJCA 2019, 177, 11–16.
  2. Echternach, M.; Sundberg, J.; Arndt, S.; Breyer, T.; Markl, M.; Schumacher, M.; Richter, B. Vocal Tract and Register Changes Analysed by Real-Time MRI in Male Professional Singers-a Pilot Study. Logoped Phoniatr Vocol 2008, 33, 67–73.
  3. Echternach, M.; Sundberg, J.; Markl, M.; Richter, B. Professional Opera Tenors’ Vocal Tract Configurations in Registers. Folia Phoniatr Logop 2010, 62, 278–287.
  4. Neumann, K.; Schunda, P.; Hoth, S.; Euler, H.A. The Interplay between Glottis and Vocal Tract during the Male Passaggio. Folia Phoniatr Logop 2005, 57, 308–327.
  5. Miller, D.G. Registers in Singing. Empirical and Systematic Studies in the Theory of the Singing Voice; Ponsen & Looijen BV: Wageningen, 2000; ISBN 978-90-367-1237-8.
  6. Tao, L. Functional Regulation of the Vocal Folds and the Principles of Passaggio in Singing [in Chinese]. Music Research 2003, (1), 81–84.
  7. Liu, J. Breath Management and Training of Mixed Voice in Singing [in Chinese]. Chinese Music 2003, (4), 88–89.
  8. Qiao, X. Theoretical Boundaries in Crossing Problematic Vocal Registers [in Chinese]. Music Research 2006, (1), 103–108.
  9. Echternach, M.; Burk, F.; Köberlein, M.; Selamtzis, A.; Döllinger, M.; Burdumy, M.; Richter, B.; Herbst, C.T. Laryngeal Evidence for the First and Second Passaggio in Professionally Trained Sopranos. PLOS ONE 2017, 12, e0175865.
  10. Deliyski, D.D.; Hillman, R.E. State of the Art Laryngeal Imaging: Research and Clinical Implications. Curr Opin Otolaryngol Head Neck Surg 2010, 18, 147–152.
  11. Švec, J.G.; Sundberg, J.; Hertegård, S. Three Registers in an Untrained Female Singer Analyzed by Videokymography, Strobolaryngoscopy and Sound Spectrography. J. Acoust. Soc. Am. 2008, 123, 347–353.
  12. Herbst, C.T.; Dunn, J.C. Non-Invasive Documentation of Primate Voice Production Using Electroglottography. Anthropological Science 2018, 126, 19–27.
  13. Yiu, E.M.L.; Lau, V.C.Y.; Ma, E.P.M.; Chan, K.M.K.; Barrett, E. Reliability of Laryngostroboscopic Evaluation on Lesion Size and Glottal Configuration: A Revisit. The Laryngoscope 2014, 124, 1638–1644.
  14. Lingala, S.G.; Sutton, B.P.; Miquel, M.E.; Nayak, K.S. Recommendations for Real-Time Speech MRI. Journal of Magnetic Resonance Imaging 2016, 43, 28–44.
  15. Ikävalko, T.; Laukkanen, A.-M.; McAllister, A.; Eklund, R.; Lammentausta, E.; Leppävuori, M.; Nieminen, M.T. Three Professional Singers’ Vocal Tract Dimensions in Operatic Singing, Kulning, and Edge—A Multiple Case Study Examining Loud Singing. Journal of Voice 2024, 38, 1253.e11–1253.e27.
  16. Bresch, E.; Narayanan, S. Real-Time Magnetic Resonance Imaging Investigation of Resonance Tuning in Soprano Singing. J. Acoust. Soc. Am. 2010, 128, EL335–EL341.
  17. Andrade, P.A. Analysis of Male Singers Laryngeal Vertical Displacement During the First Passaggio and Its Implications on the Vocal Folds Vibratory Pattern. Journal of Voice 2012, 26, 665.e19–665.e24.
  18. Miller, D.G.; Schutte, H.K. ‘Mixing’ the Registers: Glottal Source or Vocal Tract? Folia Phoniatr Logop 2005, 57, 278–291.
  19. Herbst, C.T. Electroglottography – An Update. Journal of Voice 2020, 34, 503–526.
  20. Morris, R.J.; Okerlund, D.A.; Craven, E.A. First Passaggio Transition Gestures in Classically Trained Female Singers. Journal of Voice 2016, 30, 377.e21–377.e29.
  21. Echternach, M.; Burk, F.; Köberlein, M.; Herbst, C.T.; Döllinger, M.; Burdumy, M.; Richter, B. Oscillatory Characteristics of the Vocal Folds Across the Tenor Passaggio. Journal of Voice 2017, 31, 381.e5–381.e14.
  22. Matsumoto, T.; Kanaya, M.; Ishimura, K.; Tokuda, I.T. Experimental Study of Vocal–Ventricular Fold Oscillations in Voice Production. J. Acoust. Soc. Am. 2021, 149, 271–284.
  23. Fant, G. Acoustic Theory of Speech Production. 1970, 17.
  24. Rembe, C.; Mignanelli, L. Introduction to Laser-Doppler Vibrometry. In Laser Doppler Vibrometry for Non-Contact Diagnostics; Kroschel, K., Ed.; Springer International Publishing: Cham, 2020; pp. 9–21; ISBN 978-3-030-46691-6.
  25. Tabatabai, H.; Oliver, D.E.; Rohrbaugh, J.W.; Papadopoulos, C. Novel Applications of Laser Doppler Vibration Measurements to Medical Imaging. Sens Imaging 2013, 14, 13–28.
  26. Avargel, Y.; Cohen, I. Speech Measurements Using a Laser Doppler Vibrometer Sensor: Application to Speech Enhancement; 2011; p. 114.
  27. Kitamura, T.; Ohtani, K. Non-Contact Measurement of Facial Surface Vibration Patterns during Singing by Scanning Laser Doppler Vibrometer. Front. Psychol. 2015, 6.
Figure 1. Schematic of the experimental setup.
Figure 2. Laryngeal vibratory spectrum measured by LDV for male and female singers.
Figure 3. (a) Laryngeal vibratory spectrum and (b) sound pressure spectrum of a soprano singer.
Figure 4. (a) Laryngeal vibratory spectrum and (b) sound pressure spectrum of a tenor singer.
Figure 5. Mean ± SD of the L0/L1 ratio for female and male singers (n ≥ 3).
Figure 6. Mean ± SD of the L0/L1 ratio for soprano (S), mezzo-soprano (M), tenor (T), and baritone (B) voice types (n ≥ 3).
Figure 7. L0/L1 ratio curves and individual mean lines for four singers (soprano, mezzo-soprano, tenor, and baritone).
Table 1. Participant Age and Classification Data.
Voice type      Participant   Age (years)    Years of training
Soprano         S1            20             4
Soprano         S2            36             19
Soprano         S3            24             8
Soprano         S4            24             9
Soprano         S5            19             3
Mezzo-Soprano   M1            32             16
Mezzo-Soprano   M2            23             6
Mezzo-Soprano   M3            20             3
Mezzo-Soprano   M4            25             8
Mezzo-Soprano   M5            25             8
Tenor           T1            24             8
Tenor           T2            19             3
Tenor           T3            24             9
Tenor           T4            30             13
Tenor           T5            24             7
Baritone        B1            30             14
Baritone        B2            19             3
Baritone        B3            29             11
Baritone        B4            25             7
Baritone        B5            20             3
Mean (SD)                     24.55 (4.74)   8.05 (4.79)
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.