Research Note
February 2011

Automated Measurement of Vocal Fold Vibratory Asymmetry From High-Speed Videoendoscopy Recordings

Publication: Journal of Speech, Language, and Hearing Research
Volume 54, Number 1
Pages 47-54

Abstract

Purpose

In prior work, a manually derived measure of vocal fold vibratory phase asymmetry correlated to varying degrees with visual judgments made from laryngeal high-speed videoendoscopy (HSV) recordings. This investigation extended this work by establishing an automated HSV-based framework to quantify 3 categories of vocal fold vibratory asymmetry.

Method

HSV-based analysis provided cycle-to-cycle estimates of left–right phase asymmetry, left–right amplitude asymmetry, and axis shift during glottal closure for 52 speakers with no vocal pathology producing comfortable and pressed phonation. An initial cross-validation of the automated left–right phase asymmetry measure was performed by correlating the measure with other objective and subjective assessments of phase asymmetry.
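The abstract does not give the formulas behind these measures, so the following is only an illustrative sketch of how two of them could be computed for one glottal cycle. It assumes each vocal fold edge has already been traced from a kymogram as a list of lateral displacements from the glottal midline, one value per frame. The `CycleEdges` type, the function names, and the normalizations (timing difference divided by cycle length, amplitude difference divided by amplitude sum) are assumptions made for illustration, not the authors' definitions. Axis shift is left out because it depends on locating the point of glottal closure, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class CycleEdges:
    """Hypothetical container: edge displacements over one glottal cycle."""
    left: list[float]   # left fold displacement from midline, per frame
    right: list[float]  # right fold displacement from midline, per frame


def index_of_max(values: list[float]) -> int:
    """Frame index of the first maximum lateral displacement."""
    return max(range(len(values)), key=values.__getitem__)


def phase_asymmetry(cycle: CycleEdges) -> float:
    """Assumed definition: left-minus-right timing of maximum displacement,
    normalized by the cycle length in frames."""
    n_frames = len(cycle.left)
    return (index_of_max(cycle.left) - index_of_max(cycle.right)) / n_frames


def amplitude_asymmetry(cycle: CycleEdges) -> float:
    """Assumed definition: left-minus-right peak-to-peak amplitude,
    normalized by the sum of the two amplitudes."""
    amp_left = max(cycle.left) - min(cycle.left)
    amp_right = max(cycle.right) - min(cycle.right)
    total = amp_left + amp_right
    if total == 0:
        return 0.0  # neither fold moved; treat as symmetric
    return (amp_left - amp_right) / total


# One made-up cycle of 8 frames: the left fold peaks one frame earlier
# and moves farther than the right fold.
example = CycleEdges(
    left=[0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    right=[0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0],
)
print(phase_asymmetry(example))      # negative: left fold leads
print(amplitude_asymmetry(example))  # positive: left amplitude is larger
```

Under these assumed definitions both measures are signed and equal zero for perfectly symmetric vibration; applying them to every cycle in a recording would yield the kind of cycle-to-cycle series described above.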

Results

Vocal fold vibratory asymmetry was exhibited to a similar extent in both comfortable and pressed phonations. The automated measure of left–right phase asymmetry strongly correlated with manually derived measures and moderately correlated with visual–perceptual ratings. Correlations with the visual–perceptual ratings remained relatively consistent as the automated measure was derived from kymograms taken at different glottal locations.

Conclusions

An automated HSV-based framework for the quantification of vocal fold vibratory asymmetry was developed and initially validated. This framework serves as a platform for investigating relationships between vocal fold tissue motion and acoustic measures of voice function.

Information & Authors

Information

History

  • Received: Feb 1, 2010
  • Accepted: Jun 23, 2010
  • Published in issue: Feb 1, 2011

Key Words

  1. vocal fold
  2. assessment
  3. endoscopy
  4. voice

Authors

Affiliations

Daryush D. Mehta
Massachusetts General Hospital, Boston; Massachusetts Institute of Technology, Cambridge; and MIT Lincoln Laboratory, Lexington, MA
Dimitar D. Deliyski
University of South Carolina, Columbia
Thomas F. Quatieri
Robert E. Hillman
Massachusetts General Hospital; Massachusetts Institute of Technology; and Harvard Medical School, Boston, MA

Notes

Contact author: Daryush D. Mehta, Massachusetts General Hospital, Center for Laryngeal Surgery and Voice Rehabilitation, 1 Bowdoin Square, 11th Floor, Boston, MA 02114.

Citing Literature

  • GIRAFE: Glottal imaging dataset for advanced segmentation, analysis, and facilitative playbacks evaluation, Data in Brief, 10.1016/j.dib.2025.111376, 59, (111376), (2025).
  • Vocal Fold Kinematics and Convergent–Divergent Oscillatory Glottis: Basic Insights Using Mucosal Wave Modeling and Synthetic Kymograms, Journal of Speech, Language, and Hearing Research, 10.1044/2024_JSLHR-24-00251, 68, 4, (1602-1617), (2025).
  • Uncertainty of Spatial Segmentation of High-Speed Videoendoscopy and Its Temporal and Spatial Dependency, Journal of Voice, 10.1016/j.jvoice.2025.03.007, (2025).
  • Study of Glottal Attack Time and Glottal Offset Time in Neurogenic Voice Disorders During Sustained Phonation, Journal of Voice, 10.1016/j.jvoice.2025.02.030, (2025).
  • Deep-Learning-Based Representation of Vocal Fold Dynamics in Adductor Spasmodic Dysphonia during Connected Speech in High-Speed Videoendoscopy, Journal of Voice, 10.1016/j.jvoice.2022.08.022, 39, 2, (570.e1-570.e15), (2025).
  • Empirical Distribution of Glottal Edges (EDGE): A Statistical Assessment of Vocal Fold Kinematics Using High-Speed Videoendoscopy, IEEE Journal of Biomedical and Health Informatics, 10.1109/JBHI.2024.3462632, 29, 2, (1087-1100), (2025).
  • Dynamic behavior of a two-mass nonlinear fractional-order vibration system, Frontiers in Physics, 10.3389/fphy.2024.1452138, 12, (2024).
  • Supraglottic Laryngeal Maneuvers in Adductor Laryngeal Dystonia During Connected Speech, Journal of Voice, 10.1016/j.jvoice.2024.08.009, (2024).
  • Detection of Vocal Fold Image Obstructions in High-Speed Videoendoscopy During Connected Speech in Adductor Spasmodic Dysphonia: A Convolutional Neural Networks Approach, Journal of Voice, 10.1016/j.jvoice.2022.01.028, 38, 4, (951-962), (2024).
  • Multivariate Analysis of Vocal Fold Vibrations in Normal Speakers Using High-Speed Digital Imaging, Journal of Voice, 10.1016/j.jvoice.2021.08.002, 38, 1, (10-17), (2024).
