A large percentage of patients who have undergone laryngectomy to treat advanced laryngeal cancer rely on an electrolarynx (EL) to communicate verbally. Although serviceable, EL speech is plagued by shortcomings in both sound quality and intelligibility. This study sought to better quantify the relative contributions of previously identified acoustic abnormalities to the perception of degraded quality in EL speech. Ten normal listeners evaluated the sound quality of EL speech tokens that had been acoustically enhanced by (a) increased low-frequency energy, (b) EL-noise reduction, and (c) fundamental frequency variation to mimic normal pitch intonation, in relation to nonenhanced EL speech, normal speech, and normal monotonous speech (fundamental frequency variation removed). In comparing all possible combinations of token pairs, listeners were asked to identify which token of each pair sounded more like normal natural speech, and then to rate on a visual analog scale how different the chosen token was from normal speech. The results indicate that although EL speech can be most improved by removing the EL noise and providing proper pitch information, the resulting quality is still well below that of normal natural speech or even that of monotonous natural speech. This suggests that, in addition to the widely acknowledged acoustic abnormalities examined in this investigation, there are other attributes that contribute significantly to the unnatural quality of EL speech. Such additional factors need to be clearly identified and remedied before EL speech can be made to more closely approximate the sound quality of normal natural speech.
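The paired-comparison procedure described above can be illustrated with a minimal sketch. It enumerates every pair of token conditions, tallies which member of each pair listeners preferred, and converts the resulting preference proportions into interval-scale values. The abstract does not state how the judgments were analyzed, so the use of Thurstone Case V scaling here is an assumption, as are the condition labels, the function names, and the clipping constant `eps`.

```python
from itertools import combinations
from statistics import NormalDist

# Hypothetical condition labels; the study's actual token set may differ.
CONDITIONS = ["EL_raw", "EL_lowfreq", "EL_denoised", "EL_pitch",
              "normal_mono", "normal"]


def all_pairs(conditions):
    """Return every unordered pair of conditions (each pair presented once)."""
    return list(combinations(conditions, 2))


def win_proportions(judgments, conditions):
    """Build a preference-proportion matrix from paired judgments.

    judgments: iterable of (a, b, winner) tuples, where winner is a or b.
    Returns p with p[i][j] = proportion of trials in which condition i
    was judged more natural than condition j (0.5 where no data exist).
    """
    idx = {c: k for k, c in enumerate(conditions)}
    n = len(conditions)
    wins = [[0] * n for _ in range(n)]
    for a, b, winner in judgments:
        loser = b if winner == a else a
        wins[idx[winner]][idx[loser]] += 1
    p = [[0.5] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total:
                p[i][j] = wins[i][j] / total
    return p


def case_v_scale(p, eps=0.01):
    """Thurstone Case V scale values (assumed analysis, not from the source).

    Each proportion is clipped to [eps, 1 - eps] so that unanimous
    preferences do not map to infinite z-scores; the scale value of a
    condition is the mean z-score across its comparisons.
    """
    inv = NormalDist().inv_cdf
    n = len(p)
    return [
        sum(inv(min(max(p[i][j], eps), 1 - eps)) for j in range(n) if j != i)
        / (n - 1)
        for i in range(n)
    ]
```

With six conditions, `all_pairs(CONDITIONS)` yields 15 pairs per listener. Passing the pooled judgments from all listeners through `win_proportions` and then `case_v_scale` gives one value per condition, with larger values meaning the condition was more often judged closer to normal natural speech.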

