Research Article
August 1994

Speaker Race Identification From Acoustic Cues in the Vocal Signal

Publication: Journal of Speech, Language, and Hearing Research
Volume 37, Number 4
Pages 738-745

Abstract

One-second acoustic samples were extracted from the mid-portion of sustained /a/ vowels produced by 50 black and 50 white adult males. Each vowel sample from a black subject was randomly paired with a sample from a white subject. From the tape-recorded samples alone, both expert and naive listeners could determine the race of the speaker with 60% accuracy. The accuracy of race identification was independent of the listener’s own race, sex, or listening experience. An acoustic analysis of the samples revealed that, although within ranges reported by previous studies of normal voices, the black speakers had greater frequency perturbation, significantly greater amplitude perturbation, and a significantly lower harmonics-to-noise ratio than did the white speakers. The listeners were most successful in distinguishing voice pairs when the differences in vocal perturbation and additive noise were greatest and were least successful when such differences were minimal or absent. Because there were no significant differences in the mean fundamental frequency or formant structure of the voice samples, it is likely that the listeners relied on differences in spectral noise to discriminate the black and white speakers.
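
As a rough illustration of the perturbation measures named in the abstract, the sketch below computes frequency perturbation (jitter) and amplitude perturbation (shimmer) as the mean absolute cycle-to-cycle difference, expressed as a percentage of the mean value. This is one common definition and is an assumption here; the abstract does not state which formulas the authors used. The function name `relative_perturbation` and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean


def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean value.

    Applied to glottal periods this gives a jitter-like measure; applied to
    per-cycle peak amplitudes it gives a shimmer-like measure.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    # Absolute difference between each cycle and the one that follows it.
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical cycle measurements (illustrative only, not data from the study).
periods_ms = [8.0, 8.1, 7.9, 8.0, 8.0]       # successive glottal periods
peak_amps = [1.00, 0.98, 1.02, 1.00, 1.00]   # peak amplitude of each cycle

jitter_percent = relative_perturbation(periods_ms)
shimmer_percent = relative_perturbation(peak_amps)
```

The same function serves both measures because, under this definition, jitter and shimmer differ only in which per-cycle quantity is tracked. The harmonics-to-noise ratio needs the waveform itself rather than per-cycle summaries, so it is left out of this sketch.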

History

  • Received: Apr 1, 1993
  • Accepted: Jan 4, 1994
  • Published in issue: Aug 1, 1994

Keywords

  1. jitter
  2. shimmer
  3. harmonics-to-noise ratio
  4. vocal quality
  5. voice perception

Authors

Affiliations

Julie H. Walton
University of Mississippi, University, MS
Robert F. Orlikoff
Memphis State University, Memphis, TN

Notes

Contact author: Julie H. Walton, PhD, Department of Communicative Disorders, University of Mississippi, University, MS 38677.
