Editor's Award | Journal of Speech, Language, and Hearing Research | Research Article | 1 Feb 1993

Perceptual Evaluation of Voice Quality

Review, Tutorial, and a Framework for Future Research

    The reliability of listeners’ ratings of voice quality is a central issue in voice research because of the clinical primacy of such ratings and because they are the standard against which other measures are evaluated. However, an extensive literature review indicates that both intrarater and interrater reliability fluctuate greatly from study to study. Further, our own data indicate that ratings of vocal roughness vary widely across individual clinicians, with a single voice often receiving nearly the full range of possible ratings. No model or theoretical framework currently exists to explain these variations, although such a model might guide development of efficient, valid, and standardized clinical protocols for voice evaluation. We propose a theoretical framework that attributes variability in ratings to several sources (including listeners’ backgrounds and biases, the task used to gather ratings, interactions between listeners and tasks, and random error). This framework may guide development of new clinical voice and speech evaluation protocols, ultimately leading to more reliable perceptual ratings and a better understanding of the perceptual qualities of pathological voices.
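The sources of variability named in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model or data. It assumes a purely additive toy model (rating = voice severity + listener bias + random error), a hypothetical 7-point scale, and invented function names (`simulate_ratings`, `mean_pairwise_correlation`, `rating_range_per_voice`). It only shows how interrater agreement and the per-voice spread of ratings might be summarized.

```python
import random
from itertools import combinations
from statistics import mean


def simulate_ratings(n_voices=10, n_listeners=5, bias_sd=1.0, noise_sd=1.0, seed=0):
    """Toy additive model (an assumption, not the paper's framework):
    rating = voice severity + listener bias + random error,
    rounded and clipped to a hypothetical 1..7 scale.
    Returns ratings[listener][voice]."""
    rng = random.Random(seed)
    severity = [rng.uniform(1, 7) for _ in range(n_voices)]
    biases = [rng.gauss(0, bias_sd) for _ in range(n_listeners)]
    ratings = []
    for bias in biases:
        row = [min(7, max(1, round(s + bias + rng.gauss(0, noise_sd))))
               for s in severity]
        ratings.append(row)
    return ratings


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def mean_pairwise_correlation(ratings):
    """Average Pearson r over all listener pairs: one simple interrater summary."""
    return mean(pearson(a, b) for a, b in combinations(ratings, 2))


def rating_range_per_voice(ratings):
    """For each voice, the spread (max minus min) of ratings across listeners."""
    return [max(col) - min(col) for col in zip(*ratings)]


if __name__ == "__main__":
    ratings = simulate_ratings()
    print("mean pairwise r:", mean_pairwise_correlation(ratings))
    print("range per voice:", rating_range_per_voice(ratings))
```

Setting `bias_sd` and `noise_sd` to zero makes every simulated listener agree exactly, and raising either one widens the per-voice range. An additive model like this one cannot represent listener-by-task interactions, which the proposed framework treats as a separate source of variability.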

