School of Medicine

Department of Otolaryngology

Voice Center Research


 

Dr. Takeshi Ikuma and Dr. Melda Kunduk will attend the International Conference on Voice Physiology and Biomechanics in Erlangen, Germany, to give the following presentations, which were accepted for podium and poster sessions. The exposure, collaboration opportunities, and learning are invaluable to the team and our continued progress.

Evaluation of Machine-Learning Pitch Estimation Algorithms

Takeshi Ikuma, PhD, Andrew J. McWhorter, MD, Melda Kunduk, PhD

Summary: Accurate pitch estimation is vital for objective voice analysis, because the analysis parameters can represent the state of the voice accurately only when the pitch is correct. Conventional pitch detectors primarily analyze the periodicity of the input signal. This approach often fails if a voice signal contains subharmonics, a common trait of pathological voices. This study demonstrates that a machine-learning (ML) based pitch detector improves subharmonic detection over the existing Praat pitch detector, reducing misdetections by 84%. Incorporating an ML detector into voice analysis would improve the quality of acoustic analysis and other objective voice assessments.
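The failure mode described above can be illustrated with a toy example. The sketch below is not Praat's algorithm and not the ML detector from the study; it is a minimal autocorrelation pitch estimator, and the function name `autocorr_pitch`, the test signal, and all parameter values are assumptions chosen for illustration. A 200 Hz harmonic signal is amplitude-modulated at half its pitch to mimic period doubling, and the periodicity-based estimate then drops to the subharmonic frequency.

```python
import numpy as np

def autocorr_pitch(x, fs, fmin=60.0, fmax=500.0, tol=1e-3):
    """Estimate pitch (Hz) as the shortest lag whose normalized
    autocorrelation is within `tol` of the maximum (hypothetical helper)."""
    x = x - np.mean(x)
    lags = np.arange(int(fs / fmax), int(fs / fmin) + 1)
    r = np.array([
        np.dot(x[:-k], x[k:])
        / np.sqrt(np.dot(x[:-k], x[:-k]) * np.dot(x[k:], x[k:]))
        for k in lags
    ])
    # Prefer the shortest lag among near-maximal peaks to avoid octave ties.
    first_peak = np.flatnonzero(r >= r.max() - tol)[0]
    return fs / lags[first_peak]

fs = 16000                      # sampling rate in Hz (assumed)
f0 = 200.0                      # intended pitch in Hz (assumed)
t = np.arange(fs // 10) / fs    # 0.1 s of samples

# Five-harmonic "regular" voice-like signal.
carrier = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 6))

# Same signal with every other cycle emphasized (modulation at f0 / 2).
subharmonic_voice = (1.0 + 0.5 * np.cos(2 * np.pi * (f0 / 2) * t)) * carrier

f_clean = autocorr_pitch(carrier, fs)
f_sub = autocorr_pitch(subharmonic_voice, fs)
print(f_clean, f_sub)
```

In this synthetic case the estimator follows the true 200 Hz for the regular signal but locks onto the 100 Hz subharmonic once the alternating-cycle modulation is added, which is the kind of misdetection the summary refers to.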

Voice Onset and Offset Oscillatory Patterns Based on Place, Manner, and Voiced/Voiceless Consonant Environment

Melda Kunduk, Takeshi Ikuma, Andrew J McWhorter, Tobias Schraut, Michael Döllinger, Robin Samlan

Summary: During speech production, the vocal folds (VFs) undergo almost continuous adduction and abduction to generate the voiced and unvoiced segments. Each voiced segment consists of vibratory patterns that include an onset, an offset, and possibly a short duration of sustained vibration. The mechanics and kinematics of the onset and offset of vocal oscillation potentially contain rich information that can be related to vocal health, dysfunction, and patients’ specific voice symptoms such as vocal fatigue and handicap (1,2,3). This presentation demonstrates the effects of English consonants, varied by place, manner, and voicing, on the onset and offset VF oscillation patterns in vowel-consonant-vowel (VCV) non-word syllables. Preliminary results strongly suggest main effects of voicing, place, and manner, as well as interaction effects between voicing and place and between voicing and manner, on the vocal fold vibratory patterns of the VCV productions.

Vocal Transient Analysis Using Piecewise Linear Approximation

Takeshi Ikuma, PhD, Andrew J. McWhorter, MD, Robin Samlan, Tobias Schraut, Michael Döllinger, and Melda Kunduk, PhD

Summary: The mechanics and kinematics of vocal fold vibration during the transients (either the onset or the offset of sustained phonation, or in speech context) potentially contain rich information that can be related to vocal health, dysfunction, and patients’ specific voice symptoms. Measuring objective features of the transients from real speakers can be challenging, especially in speech context, because the non-transient segments are also dynamic, with constantly fluctuating amplitude and frequency. Thus, there is no clear reference point for determining when the signal has left the transient and reached the so-called “steady state.” This presentation proposes a method to establish the transient/steady-state boundary objectively by fusing the vibration amplitude envelope and instantaneous fundamental frequency estimates with piecewise linear approximation. Use of a piecewise linear function is a simple and effective way to unify the handling of vocal transitions influenced by both voiced and voiceless consonants.
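As a rough illustration of how a piecewise linear fit can locate a transient/steady-state boundary, the sketch below fits a "ramp, then flat" function to a synthetic onset envelope and reports the breakpoint. This is not the authors' method: it uses the amplitude envelope only (no fusion with fundamental frequency estimates), and the function name `fit_ramp_then_flat`, the two-segment model, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def fit_ramp_then_flat(t, y):
    """Fit y ~ a + b * min(t, tb) by grid search over the breakpoint tb.

    Returns (tb, a, b). Hypothetical helper for illustration only.
    """
    best = None
    for tb in t[2:-2]:
        # Continuous two-segment model: linear ramp up to tb, constant after.
        X = np.column_stack([np.ones_like(t), np.minimum(t, tb)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((X @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, tb, coef[0], coef[1])
    _, tb, a, b = best
    return tb, a, b

# Synthetic onset envelope: rises for 50 ms, then stays level (assumed values).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.2, 201)          # time in seconds
true_tb = 0.05
envelope = np.minimum(t, true_tb) / true_tb + 0.02 * rng.standard_normal(t.size)

tb, a, b = fit_ramp_then_flat(t, envelope)
print(f"estimated boundary: {tb * 1000:.1f} ms")
```

The breakpoint of the fitted function serves as the boundary estimate here; a real analysis would need to cope with steady-state segments that drift, which is the difficulty the summary points out.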


 


 

Dr. Takeshi Ikuma will attend the Voice Foundation 53rd Annual Symposium in Philadelphia, PA, to give the following presentations, which were accepted for podium and poster sessions. The exposure, collaboration opportunities, and learning are invaluable to the team and our continued progress.

Subharmonics in Normal Voice: An Acoustic Database Study

Takeshi Ikuma, PhD, Andrew J. McWhorter, MD, Melda Kunduk, PhD

Summary: While subharmonic phonation is commonly observed in disordered voice, speakers without any voice concern can also produce a voice with subharmonics, both voluntarily and involuntarily. This makes it important to establish a statistical baseline for subharmonics-specific acoustic parameters in the normal-voice population. This presentation reports the findings from our acoustic analysis of the normal voice samples from four voice databases. Short subharmonic bursts were not uncommon in normal voices, although strong ones were rare.

Automatic Detection of Subharmonics in Acoustic Voice Signal

Takeshi Ikuma, PhD, Andrew J. McWhorter, MD, Melda Kunduk, PhD

Summary: Among the many possible modes of irregular vocal fold oscillation, those with subharmonics warrant special attention because both normal and pathological vocal folds can produce such voice. As such, a capable subharmonic detector can enhance clinical acoustic analysis either by discriminating subharmonics from other irregularities or by identifying the abnormal attributes during subharmonic phonation. To our knowledge, no currently available voice analysis software provides reliable subharmonic detection for acoustic signals. This presentation introduces a novel model-based detector that uses the minimum description length (MDL) information criterion and reports its performance in Monte Carlo experiments with synthetic voice signals.

Synthesis of Subharmonic Voice with Kinematic Vocal Fold Model

Takeshi Ikuma, PhD, Andrew J. McWhorter, MD, Melda Kunduk, PhD

Summary: Frequent subharmonic vocal fold vibration is a common sign of a voice disorder but is also occasionally present among those without voice concerns. This motivates investigation into how such irregular vibration translates to the radiated acoustic signal and to our perception. A useful tool for studying subharmonic vibration is the kinematic vocal fold model, which is coupled aerodynamically and acoustically with the wave-reflection vocal tract model. This model gives the user fine-grained control over the irregular vocal fold motion, while the nonlinear interaction of acoustic waves and aerodynamic quantities induces a realistic combination of subharmonic tones in the output acoustic signal. This presentation will describe the construction of the model and its application to relating the extent of source modulation to the output subharmonics-to-harmonics ratio.