Journal of Speech, Language, and Hearing Research | Research Article | 8 Jun 2022

Validity of Off-the-Shelf Automatic Speech Recognition for Assessing Speech Intelligibility and Speech Severity in Speakers With Amyotrophic Lateral Sclerosis

    Purpose:

    There is increasing interest in using automatic speech recognition (ASR) systems to evaluate impairment severity or speech intelligibility in speakers with dysarthria. We assessed the clinical validity of one currently available off-the-shelf (OTS) ASR system (i.e., a Google Cloud ASR API) for indexing sentence-level speech intelligibility and impairment severity in individuals with amyotrophic lateral sclerosis (ALS), and we provided guidance for potential users of such systems in research and clinical settings.

    Method:

    Using speech samples collected from 52 individuals with ALS and 20 healthy control speakers, we compared word recognition rate (WRR) from the commercially available Google Cloud ASR API (Machine WRR) to clinician-provided judgments of impairment severity, as well as sentence intelligibility (Human WRR). We assessed the internal reliability of Machine and Human WRR by comparing the standard deviation of WRR across sentences to the minimally detectable change (MDC), a clinical benchmark that indicates whether results are within measurement error. We also evaluated Machine and Human WRR diagnostic accuracy for classifying speakers into clinically established categories.
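
The two quantities named above, a per-sentence WRR and the comparison of its spread against the MDC, can be illustrated with a minimal Python sketch. This is not the authors' scoring procedure. It assumes that WRR is the fraction of target words recovered in order in a transcript (human or machine), that text is lowercased and stripped of punctuation before comparison, and that the MDC is supplied by the user as a proportion. The function names `word_recognition_rate` and `within_measurement_error` are hypothetical.

```python
import re
import statistics


def normalize(text):
    """Lowercase the text and keep only word tokens (assumed normalization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_recognition_rate(reference, hypothesis):
    """Fraction of reference words recovered, in order, in the hypothesis.

    Assumption: matches are counted with a longest-common-subsequence
    alignment over word tokens; the abstract does not specify the scoring rule.
    """
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference sentence has no words")
    # Dynamic-programming LCS, keeping only the previous row.
    prev = [0] * (len(hyp) + 1)
    for r in ref:
        cur = [0]
        for j, h in enumerate(hyp, start=1):
            if r == h:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / len(ref)


def within_measurement_error(sentence_wrrs, mdc):
    """True if the across-sentence SD of WRR does not exceed the MDC.

    `mdc` is a user-supplied value on the same 0-1 scale as WRR (hypothetical
    input; no MDC value is given in the abstract).
    """
    return statistics.stdev(sentence_wrrs) <= mdc
```

For example, `word_recognition_rate("The boy ran home.", "the boy run home")` scores three of four target words as recognized. The same function can score a listener's transcript or an ASR transcript, which keeps Human and Machine WRR on one scale.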

    Results:

    Human WRR achieved better accuracy than Machine WRR when indexing speech severity, and, although related, Human and Machine WRR were not strongly correlated. When the speech signal was mixed with noise (noise-augmented ASR) to reduce a ceiling effect, Machine WRR performance improved. Internal reliability metrics were worse for Machine than Human WRR, particularly for typical and mildly impaired severity groups, although sentence length significantly impacted both Machine and Human WRRs.
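
Noise augmentation of the kind described above can be sketched as mixing the speech signal with noise at a chosen signal-to-noise ratio (SNR) before sending it to the recognizer. The abstract does not state the noise type or the SNR used, so the sketch below is only an illustration under assumptions: signals are equal-length lists of float samples, the noise is scaled to hit a target SNR in decibels, and the name `mix_at_snr` is hypothetical.

```python
import math


def mean_power(signal):
    """Average squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the mixture has the requested SNR in dB.

    Assumptions: both signals have the same length and nonzero power.
    The noise gain follows from SNR_dB = 10 * log10(P_speech / P_noise).
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("signals must have nonzero power")
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

Lowering `snr_db` makes the recognition task harder, which is one way a ceiling effect in Machine WRR for typical and mildly impaired speakers could be reduced.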

    Conclusions:

    Results indicated that the OTS ASR system was inadequate for early detection of speech impairment and for grading overall speech severity. While Machine and Human WRR were correlated, ASR should not be used as a one-to-one proxy for transcription-based speech intelligibility or clinician severity ratings. Overall, findings suggested that the tested OTS ASR system, Google Cloud ASR, has limited utility for grading clinical speech impairment in speakers with ALS.
