Abstract
Purpose
The purpose of this study was to compare the utility of two automated indices of lexical diversity, the Moving-Average Type–Token Ratio (MATTR) and the Word Information Measure (WIM), in predicting aphasia diagnosis and in responding to differences in aphasia severity and subtype.
Method
Transcripts of a single discourse task were analyzed for 478 speakers, 225 of whom had aphasia according to an aphasia battery. We calculated the MATTR and the WIM for each participant. We compared group means among speakers with aphasia, neurotypical controls, and left-hemisphere stroke survivors with mild aphasia not detected by an aphasia battery. We examined whether each measure distinguished levels of aphasia severity and subtypes of aphasia. We used each measure to classify speakers as having aphasia or as neurotypical controls and compared the areas under the resulting receiver operating characteristic curves.
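As an illustration of the first of these measures, the following is a minimal Python sketch of a moving-average type–token ratio: the type–token ratio is computed in every window of a fixed number of consecutive tokens, and the window values are averaged. The function name `mattr`, the default window length of 50 tokens, and the fallback to a plain type–token ratio for samples shorter than the window are assumptions made for this sketch and are not stated in the abstract; it is not the software used in the study, and the WIM is not sketched here because the abstract does not define it.

```python
from collections import Counter


def mattr(tokens, window=50):
    """Moving-average type-token ratio of a list of word tokens.

    `window` (assumed default: 50) is the number of consecutive tokens
    in each window; the study's actual setting is not given here.
    """
    n = len(tokens)
    if n == 0:
        raise ValueError("need at least one token")
    if n < window:
        # Assumption: fall back to the plain type-token ratio
        # when the sample is shorter than one window.
        return len(set(tokens)) / n

    # Count the types in the first window, then slide one token at a time,
    # updating the counts instead of recounting each window.
    counts = Counter(tokens[:window])
    total = len(counts) / window
    for i in range(window, n):
        outgoing = tokens[i - window]
        counts[outgoing] -= 1
        if counts[outgoing] == 0:
            del counts[outgoing]
        counts[tokens[i]] += 1
        total += len(counts) / window

    # Average over all n - window + 1 windows.
    return total / (n - window + 1)
```

For example, `mattr("a a b a".split(), window=2)` averages the ratios of the three windows `a a`, `a b`, and `b a` (0.5, 1.0, and 1.0). Because every window has the same length, the resulting value does not depend on overall sample length in the way a plain type–token ratio does.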
Results
The WIM and the MATTR differentiated among people with aphasia, neurotypical controls, and people with mild aphasia. Both measures demonstrated moderately high predictive accuracy in classifying aphasia. The WIM demonstrated greater sensitivity to aphasia severity and subtype than the MATTR did.
Conclusions
The WIM and the MATTR are promising measures that quantify lexical diversity in different and complementary ways. The WIM may be more useful for quantifying the effect of treatment or disease progression, whereas the MATTR may be more useful for discriminating discourse produced by people with very mild aphasia from discourse produced by neurotypical controls. Further validation is required.
