Research Article
8 January 2020

The Classification Accuracy of a Dynamic Assessment of Inferential Word Learning for Bilingual English/Spanish-Speaking School-Age Children

Publication: Language, Speech, and Hearing Services in Schools
Volume 51, Number 1
Pages 144-164

Abstract

Purpose

Educators often use results from static norm-referenced vocabulary assessments to aid in the diagnosis of school-age children with a language disorder. However, research has indicated that many of these vocabulary assessments yield inaccurate, biased results, especially with culturally and linguistically diverse children. This study examined whether the use of a dynamic assessment of inferential word learning was more accurate at identifying bilingual (English/Spanish-speaking) children with a language disorder when compared to static measures of vocabulary.

Method

Thirty-one bilingual Spanish/English school-age children—21 children with typical language and 10 children with a language disorder—ages 5;9–9;7 (years;months) were administered traditional static vocabulary assessments and a dynamic assessment of inferential word learning that used a test–teach–test design.

Results

Discriminant analysis and logistic regression indicated that the combined posttest scores and modifiability ratings from the dynamic assessment generated 90%–100% sensitivity and 90.5%–95.2% specificity, which were superior to the static vocabulary tests.

Conclusion

These preliminary findings suggest that dynamic assessment of inferential word learning may be an effective method for accurately identifying diverse children with a language disorder.
The primary purpose of a norm-referenced vocabulary test is to determine whether an individual's score falls above, within, or below the normal range. A score below the normal range often leads to the interpretation of a disorder. Yet, results of norm-referenced vocabulary tests are often difficult to interpret because the vocabulary that a child has in his or her repertoire is highly dependent on exposure to words used in the child's environment (e.g., Hart & Risley, 1992). The socioeconomic status of families, cultural and dialectal variations in language, familial values, and societal norms are strongly and significantly correlated with the number and types of words children produce (Hart & Risley, 1992; Hoff, 2003; Roseberry-McKibbin, 2002). Thus, a child may do poorly on a vocabulary test because of limited exposure to test items as opposed to any within-child factor such as a language disorder.
Relatedly, the majority of words that children learn are acquired through an inferential word-learning process (Hughes & Chinn, 1986; van Kleeck, 2008). Inferential word learning refers to the acquisition of new vocabulary using cues such as contextual, morphosyntactic, and morphological referents (Cain, Oakhill, & Elbro, 2003; Carnine, Kameenui, & Coyle, 1984; Ford-Connors & Paratore, 2015; Larsen & Nippold, 2007; Nagy & Scott, 2000; Wolter & Pike, 2015). Children with a language disorder have difficulty learning language, and this includes difficulty learning to infer the meaning of words (Kan & Windsor, 2010). Norm-referenced static vocabulary tests usually do not measure inferential word learning and therefore do not assess the very construct that would indicate a disorder.
Being able to infer the meaning of unfamiliar words becomes increasingly important as children progress through school (Adams, 2010; Beck & McKeown, 1991; Paul, 2007; Westby & Torres-Velasquez, 2000). Gathercole, Willis, Emslie, and Baddeley (1992) found that exposure to print was strongly correlated with vocabulary acquisition and growth and that children with larger vocabularies had greater reading comprehension success, with the gap between good and poor vocabulary learners growing wider each year (see also Graves, 1986). At 5 years of age, a typically developing English-speaking child has a vocabulary of 4,000–5,000 words (Nation & Waring, 1997), and once formal schooling begins, children learn approximately 2,000–4,000 words each year (Baumann & Kameenui, 1991; Beck & McKeown, 1991; Gleason & Ratner, 2009; Graves, 1986). Although a child is exposed to thousands of new words each year without receiving direct instruction on the meaning of those words, oral and written language often include contextual referents that provide cues to facilitate word learning (Adams, 2010; Cummins, 1994; Hiebert & Kamil, 2005; Paul, 2007; van Kleeck, 2008; van Kleeck, Gillam, Hamilton, & McGrath, 1997). Children who are better at learning vocabulary from context experience greater success with academic and discipline-specific language than children who struggle with inferential word learning (Nagy & Townsend, 2012).
Although the ability to infer the meaning of words plays a critical role in vocabulary acquisition, norm-referenced and other static vocabulary assessments tend to focus on measuring the words a child already knows. Therefore, these tests do not assess the mechanism whereby children acquire the majority of their vocabulary. Documenting the depth and breadth of a child's current vocabulary is important, yet interpreting the results of such vocabulary tests for the purpose of identifying a disorder can be severely confounded by factors such as prior exposure to the words included in the test, low socioeconomic status, and English language learning status. When interpreting the results from a static vocabulary test, there will always be questions as to whether a child missed a certain vocabulary test item because of a true language disorder or because that child had not been exposed to that word sufficiently. This limited confidence in a test's results leads to poor validity. Even though the limited diagnostic accuracy of norm-referenced vocabulary tests has been well documented (Gray, Plante, Vance, & Henrichsen, 1999; Restrepo, 1998; Stockman, 2000; Washington & Craig, 1999), the majority of speech-language pathologists (SLPs) appear to use these tests to identify children with a language disorder. Caesar and Kohler (2009) found that SLPs working with elementary school–age children most commonly used the Peabody Picture Vocabulary Test (e.g., PPVT-4; Dunn & Dunn, 2007) and the Expressive One-Word Picture Vocabulary Test (e.g., EOWPVT-R; Gardner, 1990) to assess language. In a nationwide survey of U.S.-based SLPs, S. K. Betz, Eickhoff, and Sullivan (2013) found that four of the 10 most frequently used tests to identify a language disorder were static, single-word, norm-referenced vocabulary tests. 
Furthermore, although a number of researchers have identified the number of different words (NDW) in language samples as markers of language disorders in monolingual (Watkins, Kelly, Harbers, & Hollis, 1995) and bilingual (Eisenberg & Guo, 2013; Gutiérrez-Clellen, Restrepo, Bedore, Peña, & Anderson, 2000; Rojas & Iglesias, 2009) children, experiential confounds persist. Even when conceptual scoring is allowed across languages when measuring a bilingual child's vocabulary, the possibility remains that the words assessed in either language are unfamiliar to the child because of limited exposure or other extraneous factors and not because of a language disorder. Such conceptual scoring has not yielded adequate sensitivity and specificity (Anaya, Peña, & Bedore, 2018).
Instead of only identifying a child's current state of vocabulary knowledge, identifying difficulty with inferential word learning may help educators differentiate between a language difference and a language disorder (Gray, 2004). Difficulty with inferential word learning is a marker of language learning difficulty or language disorder. Children with a language disorder have smaller vocabularies (Westby, 1985) and have considerable difficulty learning new words (Adams, 2010; Alt & Spaulding, 2011; Beck & McKeown, 1991; Dollaghan, 1987; Gray, 2004; Rice, Buhr, & Nemeth, 1990; Rice, Oetting, Marquis, Bode, & Pae, 1994). Studies have documented that a child's capacity to process, store, and retrieve linguistic and nonlinguistic information about new words is foundational to the ability to establish meaning (Ellis Weismer & Hesketh, 1993; Leonard, Davis, & Deevy, 2007). These studies indicate that word learning requires a child to process the phonological information contained in new words, to attend to the relevant linguistic and nonlinguistic cues in context, to associate the meaning of the word with the phonological form of the word, to remember the form–meaning association, and to retrieve and use the word appropriately (Alt, Plante, & Creusere, 2004; Ellis Weismer & Evans, 2002). This processing account of vocabulary acquisition suggests that difficulty with inferential word learning is related to poor processing skills (McGregor, Newman, Reilly, & Capone, 2002). Thus, it is expected that children with a language disorder will have relatively greater difficulty inferring the meaning of novel words when those words are embedded in a context that taxes their language processing, such as narrative or expository discourse (Westby, 1985).
The selection of word type in addition to context may be important when attempting to measure the ability of children with and without a language disorder to learn to infer the meaning of words from context. The meaning of novel verbs is generally more difficult to deduce from context than the meaning of novel nouns (Froud & van der Lely, 2008). Thus, teaching and assessing the ability to infer the meaning of novel verbs would increase the probability that children with and without a language disorder would have something to learn, which would establish a context wherein learning could be observed and measured. Although children with a language disorder can infer the meaning of verbs (Oetting, 1999), especially when syntactic cues surrounding verbs are present and there is verb morphology to facilitate inferential word learning (Gleitman & Gleitman, 1992), children with a language disorder do not use morphosyntactic cues as effectively as typically developing children (Rice, Cleave, & Oetting, 2000). There is evidence to suggest, then, that novel verbs with morphosyntactic cues embedded in narrative discourse would establish a context wherein children with and without a language disorder would, and could, have something to learn. This context would also facilitate differentiation between children with and without a language disorder based on the greater difficulty children with a language disorder would have in learning to infer the meaning of the novel verbs.
Novel word-learning tasks have been of clinical interest because they have been able to differentiate children with a language disorder from typically developing peers to some degree of accuracy (Alt et al., 2004; Baker, Simmons, & Kameenui, 1998; Ellis Weismer & Evans, 2002; Ellis Weismer & Hesketh, 1993). Several studies have reported the relatively rapid rate at which children learn vocabulary through limited exposure via adult mediation (Carey, 1978; Dickinson, 1984; Dockrell & Campbell, 1986; Dollaghan, 1985; Oetting, Rice, & Swank, 1995; Wells, 1986). These studies have shown that typically developing children learn words in and out of context after only a few exposures, yet it is considerably more difficult for children with a language disorder to do so, especially when the novel words are abstract verbs (Kan & Windsor, 2010; Rice et al., 1994).

Static Norm-Referenced Vocabulary and Language Assessments

Existing norm-referenced measures of vocabulary often yield inaccurate scores and provide insufficient information about a child's ability to learn vocabulary. For example, Gray et al. (1999) examined the diagnostic accuracy of four norm-referenced static vocabulary tests. These researchers administered the PPVT-III (Dunn & Dunn, 1997), the Receptive One-Word Picture Vocabulary Test (Gardner, 1985), the Expressive Vocabulary Test (Williams, 1997), and the Expressive One-Word Picture Vocabulary Test–Revised (Gardner, 1990) to 62 monolingual, English-speaking preschool students identified with and without a language disorder. Gray et al. found that about 50% of their participants were incorrectly identified as having typical language or a language disorder by at least one of the four norm-referenced vocabulary tests administered. The results of their study indicated that sensitivity ranged from 71% to 77% and specificity ranged from 68% to 71%. Spaulding, Plante, and Farinella (2006) found in their review of 43 norm-referenced static language assessments that tests of vocabulary were among the group of tests with the poorest classification accuracy. Spaulding et al. found that many children with language impairment fell within the normal range on standardized, norm-referenced vocabulary tests such as the Comprehensive Receptive and Expressive Vocabulary Test–Second Edition (Wallace & Hammill, 2002), the Expressive Vocabulary Test–Second Edition (Williams, 2007), and the PPVT-4 (Dunn & Dunn, 2007). Conversely, many children with typical language scored more than 1 SD below the mean on these vocabulary tests, which further illustrates the questionable validity of the information obtained from such static, norm-referenced vocabulary measures.
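For readers unfamiliar with these accuracy metrics, sensitivity and specificity are simple proportions computed from a 2 × 2 classification table. The sketch below uses illustrative counts only; they are not drawn from any of the studies cited here:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity: proportion of children with a disorder who are
    correctly identified (true positives / all with a disorder).
    Specificity: proportion of typically developing children who are
    correctly identified (true negatives / all without a disorder)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Illustrative counts (not study data): 9 of 10 children with a disorder
# and 20 of 21 typically developing children classified correctly.
sens, spec = sensitivity_specificity(tp=9, fn=1, tn=20, fp=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

A test with high sensitivity but low specificity over-identifies typically developing children as disordered, which is the pattern of concern with static vocabulary measures used with diverse populations.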
Children who are English language learners will often have limited English vocabulary, yet such a vocabulary difference does not imply that there is a disorder. Because diversity in the United States is increasing, inaccurate diagnoses of language disorder will continue to be widespread for diverse populations until more valid measures can be identified. Although the 2010 census indicated the majority of the population in the United States identified as non-Hispanic White, individual state and county growth shows that minority groups are becoming more prevalent, and many of these groups speak a language other than English (U.S. Census Bureau, 2011). According to the U.S. Census Bureau, 20.6% of the population speaks a language other than English at home, with 12.8% of that population speaking Spanish (U.S. Census Bureau, 2010). The projected population increase by 2050 shows the non-Hispanic White population decreasing and the Hispanic and Asian populations increasing, with one in five citizens being foreign born (Pew Research Center Publications, 2008). The need to provide valid, linguistically unbiased vocabulary assessment for culturally and linguistically diverse children will become more pressing as diversity increases. An assessment approach that is less biased against culturally and linguistically diverse populations is urgently needed. If vocabulary instruments measured vocabulary learning independent of prior word exposure, it might be possible to reduce or eliminate experiential differences as a source of bias.

Dynamic Assessment

Dynamic assessment is partially derived from Vygotsky's sociocultural theory (Vygotsky, 1978), which postulates that social interaction and the environment serve as the medium through which cognition develops (Lantolf & Poehner, 2004). Dynamic assessment is also founded on the idea that children learn through mediated learning experiences with scaffolding provided to aid in development (Feuerstein, Falik, & Feuerstein, 1998). During mediated learning experiences, children receive support from others who have stronger skills. As the child's cognitive ability to attend, recall, and multitask strengthens, he or she can begin using skills independently (Gutiérrez-Clellen & Peña, 2001). In terms of language, Vygotsky's sociocultural theory invokes recursive, circular reasoning, where the extent to which a child learns language through explicit and implicit social interaction is related to that child's ability to learn language. In other words, a child with a language disorder will have difficulty learning language. Dynamic assessment is designed to measure this very construct—the ability to learn.
Dynamic assessment often uses a test–teach–retest approach to measure a child's ability to learn. The most distinctive feature of dynamic assessment is the teaching process, during which the examiner instructs the child, gaining information about learning ability or modifiability (Gutiérrez-Clellen & Peña, 2001; Peña, 2000; Ukrainetz, 2005). When using a test–teach–retest approach, a pretest is administered to assess a child's existing level of knowledge. Then, a teaching phase is initiated, during which the examiner teaches the child a skill that he or she could not adequately perform at pretest. The child's capacity to learn is then assessed using learning checklists (Lidz, 1991) or modifiability Likert scales (Lidz, 1987). The ease or difficulty of instructing the child during the teaching phase has been highly indicative of learning ability (Peña et al., 2006; Peña, Gillam, & Bedore, 2014; Petersen, Chanthongthip, Ukrainetz, Spencer, & Steeve, 2017). Following the teaching phase, a posttest is administered. Cultural and linguistic biases present in static norm-referenced tests are greatly reduced by using dynamic assessment because the focus is on learning ability, not current knowledge.

Dynamic Assessment of Vocabulary

Studies that have focused on using dynamic assessment of vocabulary to differentiate children with and without a language disorder (or weakness) have had generally positive results. Peña, Quinn, and Iglesias (1992) and, in a follow-up study, Peña, Iglesias, and Lidz (2001) used dynamic assessment procedures to examine vocabulary in diverse preschool children. The teaching phases focused on helping the students learn labeling strategies. Pretest and posttest scores and modifiability rating scales aided in differentiating between typically developing children and children with a language disorder.
Ukrainetz, Harpell, Walsh, and Coyle (2000); Camilleri and Law (2007); and Kapantzoglou, Restrepo, and Thompson (2011) also found that dynamic assessments focused on vocabulary can differentiate between children with stronger or weaker language skills with fair to good accuracy. Ukrainetz et al. assessed 23 Native American kindergarten children using a dynamic assessment that taught vocabulary categorization. Fifteen of the children had stronger language skills, and eight had weaker language skills. Classification accuracy was 87% sensitivity and 100% specificity. Camilleri and Law examined a dynamic assessment of receptive and expressive vocabulary with 14 preschoolers who were typically developing and 40 preschool students referred for language intervention services. Their teaching phase used a graduated prompting procedure to help children match unfamiliar words with pictures and to generate the meaning of the word. This graduated prompting approach systematically increased examiner support until the child was able to respond correctly. Results indicated that the children referred for language services scored significantly lower than the typically developing children. Moderately high classification accuracy was also found by Kapantzoglou et al., who assessed twenty-eight 4- and 5-year-old predominantly Spanish-speaking children using a Spanish dynamic assessment of word learning. The 30- to 40-min teaching phase of the dynamic assessment focused on labeling items using nonsense words. Receptive word-learning measures and modifiability measures yielded the highest classification accuracy, with 76.9% sensitivity and 80% specificity.
Researchers have also reported positive results when investigating other evidences of validity beyond classification accuracy for the dynamic assessment of vocabulary. Camilleri and Botting (2013) found that their dynamic assessment of receptive vocabulary had good evidence of concurrent and predictive validity. It has also been reported that dynamic assessments of morphological awareness can identify a wide range of skill levels and inform instruction (Larsen & Nippold, 2007; Ram, Marinellie, Benigno, & McCarthy, 2013; Wolter & Pike, 2015).
Research on the dynamic assessment of vocabulary thus far has indicated that children with typical language are more responsive to instruction during the teaching phase of a dynamic assessment than children who have a language disorder and that posttest scores and modifiability ratings yield the highest classification accuracy. Sensitivity and specificity of dynamic assessments focused on word learning have ranged from fair to excellent, with superior results over static measures of vocabulary. Although previous research results appear promising, no dynamic assessment studies to our knowledge have attempted to teach children to infer the meaning of words using contextual inference. Vocabulary assessments measuring the construct of inference, the means by which many words are learned (Hughes & Chinn, 1986; van Kleeck, 2008), may be of great value for both classification and clinical purposes because of the construct alignment between what needs to be assessed and what is being assessed. Therefore, the purpose of this early-stage study was to examine the evidence of validity of a dynamic assessment of inferential word learning and to compare those results to traditional, static measures of vocabulary. We hypothesized that a dynamic assessment that measured the construct of word learning would more accurately differentiate children with and without a language disorder. Specifically, our research question was as follows: How well does a dynamic assessment of inferential word learning accurately classify bilingual English/Spanish-speaking children with and without a language disorder when compared to static measures of receptive and expressive vocabulary?

Method

Participants

This research project received approval from the authors' respective institutional review boards. Participants were recruited from a large urban school district in the western part of the United States. Principals and SLPs of three elementary schools were contacted and asked to recruit all children from kindergarten to third grade who were Hispanic with at least minimal proficiency in Spanish and English. The parents of 84 children signed permission forms providing consent for their children to participate in the study. All of the children were identified as being bilingual to some degree by the principals, teachers, and SLPs, with verification using English and Spanish language samples collected by the researchers. All of the children were able to retell narratives in both English and Spanish. Of the 84 children, 67 had no identification of a language disorder and 17 had an Individualized Education Program (IEP) for language services. The 84 school-age children were randomly assigned to participate in the current study or in another study being conducted by the investigators at the time (Petersen et al., 2017). From the pool of 67 children with typical language development and 17 children with a language disorder, 21 children with typical language and 10 children with a language disorder were randomly selected to participate in the current study. Participants were between the ages of 5;9 and 9;7 (years;months), with a mean of 7;9.
The participants' parents completed demographic questionnaires that gathered information on the children's prior preschool attendance, eligibility for free or reduced lunch, language use and exposure in the home, and mother's highest level of education. Complete information was available for all participants except for mother's level of education and preschool attendance. Table 1 summarizes the results of the questionnaire for each of the completed items. All of the parents indicated that Spanish was spoken in the home at least 1 hr a day in the child's presence. Parents also confirmed that their child spoke at least some English and some Spanish. Language sample analyses of English and Spanish narrative retells were used to confirm bilingual status. If a child's English and Spanish performance fell within 1 z score of each other on any one of mean length of utterance, total number of words (TNW), or NDW, the child was classified as a balanced bilingual. Children were from predominantly low socioeconomic households, with 97% qualifying for free or reduced lunch and 88% reporting maternal education below high school.
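The balanced-bilingual criterion can be expressed as a simple rule over paired z scores. The sketch below is our paraphrase of that rule; the function name and z-score values are illustrative, not the study's code or data:

```python
def is_balanced_bilingual(english_z, spanish_z):
    """A child is classified as a balanced bilingual if, for any one of
    the three language-sample measures (MLU, TNW, NDW), the English and
    Spanish z scores fall within 1 z score of each other."""
    return any(abs(english_z[m] - spanish_z[m]) <= 1.0
               for m in ("mlu", "tnw", "ndw"))

# Hypothetical z scores for one child (not study data)
english = {"mlu": -0.4, "tnw": 0.2, "ndw": -0.1}
spanish = {"mlu": -0.9, "tnw": -1.5, "ndw": 0.3}
print(is_balanced_bilingual(english, spanish))  # MLU differs by only 0.5, so True
```

Because the rule requires only one of the three measures to be comparable across languages, it is a lenient inclusion criterion.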
Table 1. Participant demographic information.
Characteristic                        Total (N = 31)   LD (n = 10)   TD (n = 21)
                                      No. (%)          No. (%)       No. (%)
Kindergarten                          5 (16)           1 (10)        4 (19)
First grade                           2 (6)            1 (10)        1 (5)
Second grade                          13 (42)          2 (20)        11 (52)
Third grade                           11 (35)          6 (60)        5 (24)
Female                                13 (42)          1 (10)        12 (57)
Mother's education < high school      21 (88)          5 (71)        13 (93)
Attended preschool                    10 (56)          4 (80)        6 (46)
Free or reduced lunch                 30 (97)          9 (90)        21 (100)
Note. LD = language disorder; TD = typical language development.

Initial Identification of Language Ability

In order to examine the classification accuracy of the dynamic assessment, children needed to be identified a priori as having a language disorder or typical language. To receive a classification of a language disorder, children had to meet all four of the following requirements. First, a child had to be currently receiving language services in school, based on an IEP. Second, a bilingual, native Spanish-speaking SLP had to have played a key role in the eligibility decision. Third, a child had to perform more than 1 SD below the mean on at least one of three indicators in both languages: mean length of utterance in words (MLUw) in English and Spanish, TNW in English and Spanish, and NDW in English and Spanish when retelling a model story based on the wordless storybook Frog, Where Are You? (Frog Retell; Mayer, 1969; Miller & Iglesias, 2008). Finally, written or verbal confirmation of language impairment status was required from at least one parent or teacher with no parent or teacher disagreeing with the judgment, providing evidence of a functional language disorder. A native Spanish-speaking SLP was also involved in the eligibility decision for children classified as having typically developing language. These children could not have an IEP for language services; they had to have scores better than −1 SD of the mean on MLUw, TNW, and NDW on the Frog Retell; and parents and teachers could not have concerns about the child's language.
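The third eligibility criterion combines three language-sample indicators across two languages. A minimal sketch of one reading of that rule, in which the same indicator must fall more than 1 SD below the mean in both English and Spanish, follows; the variable names and z scores are illustrative, not the study's materials:

```python
def meets_language_sample_criterion(english_z, spanish_z):
    """One reading of the third criterion: the child scored more than
    1 SD below the mean (z < -1.0) on at least one of MLUw, TNW, or NDW,
    with that same indicator low in BOTH English and Spanish."""
    return any(english_z[m] < -1.0 and spanish_z[m] < -1.0
               for m in ("mluw", "tnw", "ndw"))

# Hypothetical z scores (not study data): NDW is low in both languages
english = {"mluw": -0.8, "tnw": -1.2, "ndw": -1.4}
spanish = {"mluw": -0.5, "tnw": -0.9, "ndw": -1.6}
print(meets_language_sample_criterion(english, spanish))  # True, via NDW
```

Requiring the deficit in both languages guards against misclassifying a child whose low score in one language reflects limited exposure rather than a disorder.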

Procedure

Assessment Overview

Research activities for each participant were conducted over 2 consecutive days. On the first day, the Frog Retells and the static vocabulary assessments using the Expressive and Receptive One-Word Picture Vocabulary Tests–Spanish Bilingual Edition (E/ROWPVT-SBE; Martin & Brownell, 2012a, 2012b) were administered in random order. The following day, the dynamic assessment was administered. Trained examiners, blind to the language status of the participants, administered the Frog Retell, the E/ROWPVT-SBE, and the dynamic assessment, which was administered in English only. Spanish and bilingual measures were administered by bilingual graduate and undergraduate research assistants who spoke Spanish as a second language. The English-only dynamic assessment and the English Frog Retell were administered by either English-only or bilingual research assistants. The Frog Retell took less than 10 min to administer, and the E/ROWPVT-SBE took less than 30 min. The dynamic assessment took approximately 15 min to complete. The Frog Retell, static vocabulary, and dynamic assessment sessions were audio recorded.

Frog Retell Language Sample

A narrative language sample was collected in English and Spanish using the wordless storybook Frog, Where Are You? (Mayer, 1969). Administration of English and Spanish Frog Retells was randomized across participants. To elicit the Frog Retell, examiners showed the child pictures from the wordless picture book and read a script in English or Spanish from the Systematic Analysis of Language Transcripts (SALT) manual (Miller & Iglesias, 2012). After the examiner modeled the story for the child, the wordless picture book was given to the child and the child was asked to retell the story in that same language. Each retell sample was audio recorded. Children had a brief break from storytelling for 5–10 min and then moved to a different testing location for administration of the Frog Retell in the target language not yet sampled.

Static Vocabulary Measures

Raw scores from the EOWPVT-SBE and the ROWPVT-SBE were used to represent expressive and receptive static vocabulary, respectively. The ROWPVT-SBE is a bilingual receptive vocabulary assessment that uses picture stimuli. Each test item includes four pictures on a single test plate. The examiner says the target word in the child's dominant language as determined by a pretest questionnaire, and the child is then prompted to point to or verbally indicate which picture represents that target word. Code switching between English and Spanish is permitted during the test. The EOWPVT-SBE is an expressive vocabulary test. The examiner presents a single picture to the child and asks, in English or Spanish, “What is this?” or “¿Qué es esto?”
Audio recordings of English and Spanish narrative retells were transcribed, segmented, and analyzed by trained bilingual research assistants using the SALT software (Miller & Iglesias, 2012). SALT-derived MLUw, NDW, and TNW were compared to bilingual, age-matched peer data (±6 months) in the SALT Bilingual Spanish/English Story Retell Reference Database. This database contains Spanish and English retells of Frog, Where Are You? from over 2,000 bilingual kindergarten through third-grade students. These comparisons provided information on language ability. Additionally, each child's highest z score for NDW produced in the English or Spanish Frog, Where Are You? language sample was used to represent static vocabulary ability. NDW analyses in English excluded Spanish words, and NDW analyses in Spanish excluded English words, per SALT conventions.

Dynamic Assessment

The dynamic assessment used a hybrid pretest–teaching–posttest and graduated prompting procedure. It was composed of (a) a pretest that assessed inferential word learning in a narrative context, (b) a teaching phase that explicitly taught how to use context clues to understand the meaning of a word, (c) a modifiability rating scale that was completed by the examiner, and (d) a posttest of inferential word learning.
Dynamic assessment pretests and posttests. The dynamic assessment had a pretest that was composed of three different sections: (a) two narrative-based pretests that measured a child's ability to infer the meaning of nonsense words embedded in narration, (b) two picture-based pretests that measured a child's ability to point to pictures that represented those nonsense words, and, if necessary, (c) two supplemental sentence-based pretests that measured a child's ability to infer the meaning of different nonsense words embedded in a sentence with limited contextual support. Children were administered all of the narrative- and picture-based pretests. If a child scored 100% on those two pretests, then they were administered the two sentence-based pretests that were designed to be more difficult than the narrative- and picture-based pretests. These more difficult, supplemental sentence-based pretests were included to motivate the need for the teaching phase. Out of the 31 participants, six children (two with a language disorder and four with typical language development) were administered the sentence-based pretests, and only one of those children (a child with typical language) was able to understand even one of the two nonsense words.
Children were asked to listen to a short narrative that had a single nonsense word (a verb) embedded two times within the story (see Appendix A). After listening to the narrative read by the examiner, children were asked to define the nonsense word. The nonsense verb was written in regular past tense in one section of the story (e.g., punuped) and written in the infinitive form in another section of the story (e.g., to punup). Nonsense verbs with clearly marked tense and with the infinitive “to” marker were used to help children infer meaning through the use of morphosyntactic markers. Research has indicated that even children with a language disorder can learn unfamiliar verbs using morphosyntactic cues (Oetting, 1999), but this becomes more difficult for children with a language disorder when language demands are increased by embedding nonsense words in narrative discourse (Ko & Hwang, 2008).
For each narrative-based pretest, the examiner said, “I'm going to tell you a short story. Please listen carefully. There is a new word in this story. When I'm done, I'm going to ask you about the new word.” The examiner then read the story to the child word for word at a moderate pace. After reading the narrative-based pretest, the examiner said to the child, “What does [nonsense word] mean?” The examiner was directed to wait 5–10 s before providing the prompt “It's OK. You can guess.” The child's response to each of the two narrative-based pretests was scored using a 0–2 scale, with 2 indicating the child provided a synonym, a clear and complete definition, or an example with a definition and 1 indicating that the child provided an incomplete or unclear definition. A score of 0 indicated that the child provided an incorrect response. A total of 4 points were possible from the two narrative-based pretests (2 points from each narrative). Acceptable responses for each stimulus were provided to the examiner to ensure reliability of scoring (see Appendix B).
A picture-based pretest was always administered immediately after each narrative-based pretest. The picture-based pretest further assessed a child's comprehension of the nonsense words that were in the preceding narrative. The picture-based pretest required a nonverbal response, where a child was asked to point to one of four pictures that best represented the nonsense word's meaning. The examiner presented four pictures to the child and then said, “Point to [nonsense word].” The examiner was directed to wait 5–10 s before saying “It's OK. You can guess.” The child received a score of 1 by pointing to the correct picture and a score of 0 for pointing to an incorrect picture. A total of 2 points were possible from the two picture-based pretests (1 point from each assessment).
Responses from the two narrative-based pretests were reviewed by the examiner after the administration of the last picture-based pretest. If a child received a score lower than 2 on either of the narrative-based pretests (< 100% correct), then the teaching phase of the dynamic assessment was immediately administered. If a child received 2 points on both narrative-based pretests (100% correct), then the more challenging sentence-based pretests were administered to further examine the child's ability to infer the meaning of words in a more difficult context and to establish the need for the teaching phase of the dynamic assessment. There were two sentence-based pretests. Each sentence-based pretest used a different nonsense word that played an adjectival role embedded in a sentence with limited contextual cues (e.g., "Miguel got the kamarin score on the test. He was really sad."). This sentence with only one nonsense adjective was provided to offer fewer contextual cues than were available in the previous pretest procedures, increasing the difficulty for those children who had ceiling scores on the narrative-based pretest. Although young typically developing children learn to use syntactic cues that indicate a word is an adjective (Waxman & Markow, 1998), these cues do not necessarily provide sufficient information about the meaning of the adjective (e.g., He wants the ____ [big… small… red… flat] ball). Thus, adjectives often require contextual cues to facilitate inference, and when those cues are limited, the task can become quite difficult. Accordingly, in creating this more difficult pretest task for our dynamic assessment, we severely limited the cues available around the nonsense adjectives embedded in the sentences, including avoiding the use of affixes (e.g., biggest). During this sentence-based pretest, the examiner said, "I'm going to read you some sentences two times. Please listen carefully. I can only read them two times. There is a new word in them. When I'm done, I'm going to ask you about the new word."
The examiner read the sentences word for word at a moderate pace two times before asking the child to define the target nonsense word. The examiner said, "What does [nonsense word] mean?" The child was awarded 0, 1, or 2 points following the scoring guidelines used in the narrative-based pretest. If a child was administered the supplemental sentence-based pretest, both sentence-based pretests were always administered.
Following the pretests, children were then administered the teaching phase of the dynamic assessment and the posttests. The posttests followed the same administration and scoring procedures as the pretests yet used different narratives, pictures, and sentences with different nonsense words.
The total possible score on the pretest was 10 points, and the total possible score on the posttest was 10 points: (a) 2 points for the correct definition of the nonsense word in the first narrative, (b) 2 points for the correct definition of the nonsense word in the second narrative, (c) 1 point for pointing to the correct picture after the first narrative, (d) 1 point for pointing to the correct picture after the second narrative, (e) 2 points for the correct definition of the nonsense word in the first supplemental sentence, and (f) 2 points for the correct definition of the nonsense word in the second supplemental sentence.
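The 10-point pretest and posttest totals above can be tallied mechanically. The following is a minimal sketch in Python with hypothetical function and parameter names (in the study, scoring was recorded by examiners on protocol forms):

```python
# Hypothetical sketch of the pretest/posttest scoring described above.
# Item scores are assumed to arrive as integers already assigned by the
# examiner: narratives and sentences on a 0-2 scale, pictures as 0 or 1.

def total_score(narrative1, narrative2, picture1, picture2,
                sentence1=0, sentence2=0):
    """Sum the six components (maximum 2 + 2 + 1 + 1 + 2 + 2 = 10 points).

    The sentence-based scores default to 0 because those supplemental
    pretests were administered only to children who scored 100% on the
    narrative-based pretests.
    """
    for score, cap in [(narrative1, 2), (narrative2, 2), (picture1, 1),
                       (picture2, 1), (sentence1, 2), (sentence2, 2)]:
        if not 0 <= score <= cap:
            raise ValueError(f"score {score} out of range (max {cap})")
    return narrative1 + narrative2 + picture1 + picture2 + sentence1 + sentence2

# A child who defined one word completely (2), one partially (1), pointed
# correctly once (1), and was not given the sentence pretests:
# total_score(2, 1, 1, 0) -> 4
```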
Dynamic assessment teaching phases. Two dynamic assessment teaching phases were administered to all of the participants immediately after the pretests. In each teaching phase, the examiner read a narrative with a new nonsense word (a verb) embedded twice in that narrative (e.g., dalated, to dalat). After reading the narrative, the examiner taught the child strategies to infer the meaning of the nonsense word. Each of the two teaching phases was divided into two lessons where the examiner loosely followed a script that highlighted how to infer the meaning of the nonsense word from “clues” around that word (see Appendix C). Teaching principles used to guide the lessons were founded on the Think-Aloud Problem Solving strategy (Lochhead, 2001; Whimbey & Lochhead, 1986). The focus of instruction was to first have the examiner model think-aloud strategies and to then have the child emulate those strategies verbally. The Think-Aloud Problem Solving procedures entailed the following five steps: (a) identify the problem, (b) collect information (i.e., think about it), (c) plan a strategy (i.e., plan to replace words), (d) carry out the plan (i.e., replace the words), and (e) look back (i.e., check for meaning). Duration of both teaching phases ranged from 5 to 20 min, based largely on the modifiability of the child.
Children were tested on their ability to understand the meaning of the nonsense word after the first lesson and then again after the second lesson. Children who could not infer the meaning of the nonsense word after the second lesson were also asked an open-ended question. This question was designed to provide the child with the maximal opportunity to successfully infer the meaning of the word (e.g., “If you play soccer and knock a picture down, what did you do to the ball?”). Finally, the examiner replaced the nonsense word with the real word to check for meaning. Children were always administered both teaching phases and were always taught the first and second lessons within each of those phases, regardless of performance. Responses were scored using the same scoring guidelines used in the narrative-based pretest and posttest.
Dynamic assessment modifiability. Scores (0, 1, or 2) recorded after the second lesson in Teaching Phase 1 and after the second lesson in Teaching Phase 2 were used as indicators of modifiability. These two modifiability scores were referred to as the child's graduated prompting modifiability scores. An additional measure of modifiability was recorded immediately after the completion of both teaching phases of the dynamic assessment, before the administration of the posttest. To obtain this postteaching modifiability data, the examiner completed a seven-item modifiability scale to assess the child's overall responsiveness to instruction (see Appendix D). This modifiability rating scale was similar to the one used in Petersen et al. (2017). The first six questions of the modifiability rating form required the examiner to reflect on how frequently or clearly specific child behaviors occurred during the teaching phase: (a) responsiveness to prompts, (b) displaying transfer of learning from one teaching lesson to the next, (c) attending to the teaching, (d) ease of teaching, (e) level of frustration, and (f) disrupting the testing session. Each item was rated on a 3-point scale (0–2). Question 7 asked about the child's potential to learn to infer the meaning of unfamiliar words from context, or overall modifiability, with 0 points indicating considerable difficulty learning how to infer word meanings, 1 point indicating some difficulty, and 2 points indicating very little difficulty. Two postteaching modifiability scores were calculated: (a) a total modifiability index (TMI) of 14 possible points for the summed responses for Questions 1–7 and (b) an overall modifiability rating from Question 7 alone (Mod-7) with 0, 1, or 2 points possible. The Mod-7 scale was dichotomized post hoc to allow for the binary classification of a language disorder/no language disorder. Examiners were blind to the cutoff scores used to create this dichotomous classification.
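The two postteaching modifiability scores can be illustrated as follows. The function name and the dichotomization cutoff are assumptions for illustration only, since examiners in the study were blind to the actual cutoff used:

```python
# Illustrative sketch of the TMI and Mod-7 scores described above.
# `ratings` is assumed to hold the examiner's seven responses (0-2 each)
# on the modifiability rating form, in question order 1-7.

def modifiability_scores(ratings, cutoff=1):
    if len(ratings) != 7 or any(r not in (0, 1, 2) for r in ratings):
        raise ValueError("expected seven ratings of 0, 1, or 2")
    tmi = sum(ratings)    # total modifiability index, 0-14 points
    mod7 = ratings[6]     # overall modifiability, Question 7 alone
    # Post hoc dichotomization of Mod-7 for binary classification;
    # the cutoff value here is an assumption, not the study's blinded value.
    flagged = mod7 < cutoff
    return tmi, mod7, flagged
```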

Fidelity and Interrater Reliability

Training for administration and scoring of the Frog Retell, E/ROWPVT-SBE, and dynamic assessment was conducted under the supervision of a bilingual certified SLP (first author). Prior to administering any tests to participants, 10 undergraduate and graduate student examiners were required to perform administration and scoring procedures across five trials with 95% accuracy. Using a fidelity checklist, fidelity of administration was observed and measured for a randomly selected subset of 25% of all test administrations. Point-to-point fidelity of administration for all measures was 96% (range: 90%–100%).
Interrater reliability was examined in real time for 20% of the administrations of the E/ROWPVT-SBE by the first author. Interrater reliability for the E/ROWPVT-SBE was 97% (range: 89%–100%). Interrater reliability for the audio-recorded measures, Frog, Where Are You?, and the dynamic assessment was conducted by five independent, trained undergraduate students. Thirty percent of the Frog Retell samples were independently transcribed, segmented into C-units, and scored. Interrater reliability for transcription was 93% (range: 87%–100%) and 91% for C-unit segmentation (range: 85%–100%). Thirty percent of the dynamic assessment recordings were also randomly selected and listened to. Each random selection was scored and then compared to real-time scoring results. Point-to-point examiner agreement was 89% (range: 79%–97%) for pretest and posttest scoring. Examiner agreement of the Mod-7 final modifiability judgment was 89% (range: 77%–100%), and interrater reliability for the TMI was 86% (range: 75%–93%). Examiner agreement on a dichotomized score of the Mod-7 scale, with 0 indicating disorder, was 100%.
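Point-to-point agreement of the kind reported above reduces to the percentage of individual items on which two scorers assign the same value. A hypothetical sketch:

```python
# Sketch of point-to-point interrater agreement: the share of items on
# which two independent scorers agree, expressed as a percentage.

def point_to_point_agreement(scorer_a, scorer_b):
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("score lists must be non-empty and equal length")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100.0 * matches / len(scorer_a)

# Two scorers agreeing on 4 of 5 items yields 80.0% agreement.
```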

Pre-Analysis Data Inspections

Predictive linear discriminant function analysis was conducted in SPSS 25.0 for this early-stage study. Because discriminant analysis is sensitive to the ratio of sample size and the number of predictor variables, many predictor variables were eliminated from the dynamic assessment. Based on prior results of dynamic assessment research, we concluded that pretest scores and pretest-to-posttest gain scores would not be predictive of language ability (Peña et al., 2014, 2006; Petersen et al., 2017; Ukrainetz et al., 2000). Therefore, we only used the dynamic assessment modifiability and posttest scores in the discriminant analysis. Discriminant analysis also requires that certain assumptions be met, although it is fairly robust to violations of these assumptions apart from the presence of outliers (Tabachnick & Fidell, 2013). For example, we examined whether both groups of children (those with and without a language disorder) had scores on the predictor measures (i.e., static vocabulary measures and dynamic assessment modifiability and posttest scores) that were approximately normally distributed. The Shapiro–Wilk test indicated that none of the predictors, except for NDW and the EOWPVT, was normally distributed. This was somewhat expected given the limited range of scores and resulting limited variance in most of the dynamic assessment variables. Skewness and kurtosis values, however, did not exceed ±3.0, indicating that the predictor variables did not depart severely from normality. The relationship between all pairs of predictors in each group was fairly linear, per visual inspection. The assumption of no multicollinearity was examined using a Pearson product–moment correlation matrix. Not surprisingly, the dynamic assessment TMI modifiability score and the subsection of that score, the Mod-7 score, were correlated at .87.
We chose to use the Mod-7 modifiability score in the discriminant analysis due to the stronger point-to-point and dichotomous interrater reliability of this variable. There were no outliers among the predictor variables. Participants performed at the ceiling on the posttest receptive tests (picture pointing); thus, those posttest variables were not included in the discriminant analysis.
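The normality and collinearity screens described above follow conventional sample formulas. The sketch below reimplements two of them (adjusted Fisher-Pearson sample skewness and the Pearson product-moment correlation) purely for illustration; the study's actual values came from SPSS 25.0, whose exact formulas may differ:

```python
import statistics as st

# Rough stand-ins for two of the pre-analysis checks: sample skewness
# (adjusted Fisher-Pearson formula) and Pearson's r for collinearity.

def skewness(xs):
    n, mean, sd = len(xs), st.mean(xs), st.stdev(xs)
    return (n / ((n - 1) * (n - 2))) * sum(((x - mean) / sd) ** 3 for x in xs)

def pearson_r(xs, ys):
    mx, my = st.mean(xs), st.mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den
```

A symmetric sample yields a skewness near zero, and two perfectly proportional variables yield r = 1.0, which is the kind of collinearity (here, r = .87 between TMI and Mod-7) that motivated dropping one of the pair.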

Analytic Approach

Linear discriminant function analysis was conducted to determine how well each individual static measure and how well the combination of dynamic assessment variables predicted language ability. Specifically, we independently analyzed the raw scores from the ROWPVT-SBE and EOWPVT-SBE as well as the NDW from the Frog language samples. From the dynamic assessment, we examined the combination of the graduated prompting modifiability results, the Mod-7 modifiability rating, and the narrative- and sentence-based posttest scores. In addition, we conducted a logistic regression analysis with the same predictors used in the discriminant analysis. This secondary analysis was conducted to provide confirmatory evidence of the findings derived from the discriminant analysis. Discriminant analysis can be a preferred method when predictor variables are normally distributed; however, logistic regression does not make assumptions of normality and can therefore offer an alternative analysis of the data (Press & Wilson, 1979). Finally, we examined the data to provide clinically translatable results. Because the sample size was relatively small, we were able to identify the optimal cut scores through visual analysis and through the development of a flow chart (see Appendix E). This decision-making flow chart was applied separately to the children with a language disorder and to the children without a language disorder to highlight the way in which high percentages of sensitivity and specificity were obtained using the same variables that were included in the discriminant analysis.
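One common way to locate a single-variable cut score like those reported below is to sweep candidate cutoffs and keep the value that maximizes combined sensitivity and specificity (Youden's J). This sketch illustrates the idea only; the study's cutoffs were derived from the discriminant functions and visual analysis, not from this procedure:

```python
# Hedged sketch: exhaustive cutoff sweep maximizing Youden's J
# (sensitivity + specificity - 1). Lower scores are assumed to
# indicate a language disorder, so scores at or below the cutoff
# are flagged as positive.

def best_cutoff(scores, has_disorder):
    best = None
    for c in sorted(set(scores)):
        predicted = [s <= c for s in scores]
        tp = sum(p and d for p, d in zip(predicted, has_disorder))
        tn = sum((not p) and (not d) for p, d in zip(predicted, has_disorder))
        sens = tp / sum(has_disorder)
        spec = tn / (len(has_disorder) - sum(has_disorder))
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]  # cutoff, sensitivity, specificity
```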

Results

Table 2 displays the mean raw scores and standard deviations for the static assessment and dynamic assessment vocabulary predictor variables. Table 3 details the results of the discriminant analyses for the individual variables and for the combination of dynamic assessment variables. In Table 3, the cutoff scores represent the score that best differentiates between the children with a language disorder and the children who have typical language. Overall classification represents how well the measure separates the language disorder and typically developing groups. In this study, sensitivity is the ability of the measure to correctly identify children with a language disorder (i.e., true positive), whereas specificity reflects the measure's ability to correctly identify children without a language disorder (i.e., true negative). The canonical correlation is the magnitude of the association between the discriminant function and the dependent variable, in this case, group (language disorder and typical language development).
Table 2. Mean raw scores (standard deviations) of predictor measures.

Predictor measures                        LD (n = 10)     TD (n = 21)
Static vocabulary predictor measures
  EOWPVT-SBE                              26.30 (8.64)    41.00 (21.09)
  ROWPVT-SBE                              52.10 (30.59)   68.38 (35.51)
  Frog Retell NDW                         81.80 (25.62)   91.10 (22.29)
Dynamic assessment predictor variables
  Mod-7 modifiability score               0.70 (0.95)     1.52 (0.60)
  Graduated Prompting Modifiability 1     0.90 (0.99)     1.71 (0.56)
  Graduated Prompting Modifiability 2     1.10 (0.99)     1.33 (0.86)
  Posttest total score                    3.30 (1.77)     4.05 (2.44)
Note. Mod-7 modifiability score is the examiner's overall judgment of a child's ability to learn words through inference. Graduated Prompting Modifiability 1 and Graduated Prompting Modifiability 2 are modifiability scores derived after the second lesson in Teaching Phase 1 and after the second lesson in Teaching Phase 2. Posttest total score is the sum of the narrative-based Posttests 1 and 2 and sentence-based Posttests 1 and 2. LD = language disorder; TD = typical language development; EOWPVT-SBE = Expressive One-Word Picture Vocabulary Test–Spanish Bilingual Edition; ROWPVT-SBE = Receptive One-Word Picture Vocabulary Test–Spanish Bilingual Edition; Frog Retell NDW = number of different words best score from English/Spanish Frog, Where Are You? language samples.
Table 3. Results from linear discriminant function analyses.

Variable                      Cutoff   Overall          Sensitivity   Specificity   Wilks's   χ²      Canonical     p
                              score    classification                               lambda            correlation
Frog Retell NDW               121      71.0             10.0          100.0         .96       1.03    .19           .31
EOWPVT-SBE                    19.5     64.5             10.0          90.5          .87       4.06    .36           .04
ROWPVT-SBE                    34.5     67.7             0.0           100.0         .95       1.48    .23           .22
DA Mod-7                      1        83.9             60.0          95.2          .77       7.47    .48           < .01
DA Graduated Prompting 1      1        80.6             50.0          95.2          .77       7.39    .48           < .01
DA Graduated Prompting 2      1        67.7             0.0           100.0         .99       0.44    .12           .51
DA posttest total             3        77.4             50.0          95.5          .67       11.01   .58           .03
All DA variables combined^a   n/a      90.3             90.0          90.5          .42       22.09   .76           < .01

Note. Mod-7 is the examiner's overall judgment of a child's ability to learn words through inference. Graduated Prompting 1 and Graduated Prompting 2 are modifiability scores derived after the second lesson in Teaching Phase 1 and after the second lesson in Teaching Phase 2. Box's M assumption of homogeneity of covariance matrices was met with all predictor variables (p > .001). Frog Retell NDW = number of different words best score from English/Spanish Frog, Where Are You? language samples; EOWPVT-SBE = Expressive One-Word Picture Vocabulary Test–Spanish Bilingual Edition; ROWPVT-SBE = Receptive One-Word Picture Vocabulary Test–Spanish Bilingual Edition; DA = dynamic assessment of inferential word learning.
a. No interpretable cutoff score could be derived from the discriminant analysis for the combination of dynamic assessment variables.
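The sensitivity, specificity, and overall classification values of this kind can be computed directly from true group membership and a measure's predictions. A minimal sketch with hypothetical names (True = language disorder flagged/present):

```python
# Sketch of the classification metrics reported in Table 3.

def classification_metrics(predicted, actual):
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    sensitivity = tp / (tp + fn)        # true positives among disorder
    specificity = tn / (tn + fp)        # true negatives among typical
    overall = (tp + tn) / len(actual)   # overall classification accuracy
    return sensitivity, specificity, overall

# With 9 of 10 children with a disorder and 19 of 21 typically
# developing children classified correctly, this returns sensitivity
# 0.90, specificity ~0.905, and overall ~0.903, which matches the
# values reported for the combined dynamic assessment variables.
```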

Static Vocabulary Measures

NDW

The linear discriminant function analysis indicated that the overall Wilks's lambda was not significant for the Frog Retell NDW measure, Λ = .96, χ²(1) = 1.03, p = .31, indicating that the highest NDW produced from the English or Spanish Frog narrative retell was not significantly related to language ability. The discriminant analysis indicated that the overall classification was .71, with sensitivity of the NDW measure at 10% and specificity at 100%. Logistic regression results were also nonsignificant, χ²(1) = 1.10, p = .30, Nagelkerke R² = .05, with classification accuracy and sensitivity and specificity identical to those found in the discriminant analysis.

EOWPVT-SBE

The overall Wilks's lambda was significant for the EOWPVT-SBE, Λ = .87, χ²(1) = 4.06, p = .04, indicating the EOWPVT-SBE was related to language ability. Overall classification from the discriminant analysis was .65. Sensitivity of the EOWPVT-SBE was 10%, and specificity was 90.5%. The logistic regression was also significant, χ²(1) = 4.85, p < .05, Nagelkerke R² = .20, with .68 overall classification accuracy and 40% sensitivity and 81% specificity.

ROWPVT-SBE

The overall Wilks's lambda was not significant for the static ROWPVT-SBE, Λ = .95, χ²(1) = 1.48, p = .22, indicating that the ROWPVT-SBE was not related to language ability. From the discriminant analysis, overall classification was .68, with sensitivity of the ROWPVT-SBE at 0% and specificity at 100%. Logistic regression indicated a nonsignificant prediction, χ²(1) = 1.61, p = .21, Nagelkerke R² = .07, with .71 overall classification accuracy and 10% sensitivity and 100% specificity.

Dynamic Assessment

The overall Wilks's lambda was significant for each of the individual dynamic assessment variables (p ≤ .03) with the exception of the graduated prompting modifiability score from Teaching Phase 2, Λ = .99, χ²(1) = 0.44, p = .51. Overall classification for each individual dynamic assessment variable ranged from .68 to .84, with sensitivity ranging from 0% to 60% and specificity ranging from 95% to 100%. For all of the dynamic assessment variables combined, the overall Wilks's lambda was significant, Λ = .42, χ²(1) = 22.09, p < .01, indicating that the linear combination of all dynamic assessment variables (i.e., modifiability and posttest measures) was significantly related to language ability. Overall classification was .90 with 90% sensitivity and 90.5% specificity. To confirm the findings from the discriminant analysis for all dynamic assessment variables combined, we also conducted a logistic regression analysis. The results of the logistic regression were significant, χ²(7) = 35.17, p < .001, Nagelkerke R² = .95, with .97 overall classification accuracy and 100% sensitivity and 95.2% specificity.
We compared the proportion of children correctly classified by the dynamic assessment to the chance rate using a z test of proportions (N. E. Betz, 1987). We first calculated the chance rate for unequal group sizes as specified in Brown and Wicker (2000). We then used a z test of proportions to compare this chance proportion to the proportion correctly classified by the dynamic assessment (90%). Results indicated that the classification accuracy of the dynamic assessment was significantly different from what would be expected by chance, z = 1.60, p < .05.
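A z test of this kind compares the observed proportion correctly classified against a chance rate adjusted for unequal group sizes. The sketch below assumes the proportional chance criterion (p1² + p2²); the exact criterion specified in Brown and Wicker (2000) is not reproduced here, so the resulting z value will not necessarily match the one reported above:

```python
import math

# Sketch of a z test of proportions against a chance classification
# rate for two groups of unequal size. The proportional chance
# criterion used here is an assumption for illustration.

def z_vs_chance(correct, n_disorder, n_typical):
    n = n_disorder + n_typical
    p1, p2 = n_disorder / n, n_typical / n
    p_chance = p1 ** 2 + p2 ** 2          # proportional chance criterion
    se = math.sqrt(p_chance * (1 - p_chance) / n)
    return (correct / n - p_chance) / se

# For 28 of 31 children correctly classified (10 LD, 21 TD):
# z = z_vs_chance(28, 10, 21)
```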

Discussion

This study examined the classification accuracy of a dynamic assessment of inferential word learning and of static vocabulary measures for bilingual Spanish/English-speaking school-age children. The static vocabulary tests reflected traditional approaches often administered to bilingual school-age children. These static approaches provided information on each child's current English and Spanish receptive and expressive vocabulary. The dynamic assessment used a pretest–teach–posttest design that tested and taught nonsense words embedded within narratives. The focus of the dynamic assessment was to measure a child's ability to learn to infer the meaning of nonsense words from context. It was hypothesized that children with a language disorder would have lower modifiability ratings and lower posttest scores on the dynamic assessment than typically developing children and that the dynamic assessment would have stronger classification accuracy than the traditional static assessments. The results of this study confirmed our hypotheses. The dynamic assessment was more accurate at identifying children with and without a language disorder than the static vocabulary tests, even when those static tests were administered in English and Spanish. Results from both discriminant analysis and logistic regression converged to indicate that excellent classification accuracy can be achieved when inferential word-learning dynamic assessment posttest scores are combined with modifiability ratings. A visual inspection of the data, illustrated using a flow chart, provided specific cut scores for the combination of dynamic assessment variables. These clinically applicable cut scores resulted in sensitivity and specificity similar to what was identified using both statistical analyses.
Because the dynamic assessment used in this study measured the construct of inferential word learning, it was able to measure the very same mechanism children frequently use to learn thousands of new words each year (Hughes & Chinn, 1986; van Kleeck, 2008). This construct alignment, with a focus on learning potential, mitigated several confounds encountered when measuring the words a child has already acquired and better separated children with and without a language disorder.

Modifiability and Posttest Scores as Indicators of a Language Disorder

Prior research on dynamic assessment has indicated that a combination of modifiability ratings and posttest scores tend to yield the strongest sensitivity and specificity values (e.g., Kapantzoglou et al., 2011; Peña et al., 2006, 2001; Petersen et al., 2017; Ukrainetz et al., 2000). In alignment with this corpus of evidence, excellent sensitivity and specificity were obtained in this study when the dynamic assessment modifiability scores were combined with the posttest scores. It seems to be the case that a child with a language disorder will demonstrate limited response to explicit instruction, and this limited response can be noted even when instruction is provided over a very brief period of time. Conversely, children who have intact language learning ability appear to respond to such instruction in a manner that sets them apart from their peers who have a language disorder. This same pattern has been noted with posttest scores, where children with a language disorder tend to have relatively lower scores after the treatment phase of a dynamic assessment. This consistent finding has now been replicated across dynamic assessments that focus on word-level reading (e.g., Petersen, Gragg, & Spencer, 2018), narrative language (e.g., Peña et al., 2014, 2006; Petersen et al., 2017), and vocabulary (e.g., Kapantzoglou et al., 2011; Ukrainetz et al., 2000).
The evidence is mounting that response to instruction, or modifiability, is something that can be reliably detected by diverse examiners who have markedly different experiences and backgrounds in the administration and interpretation of language assessment. For example, in this particular study, the undergraduate and graduate research assistants had very little experience with language assessment, whereas the first and third authors had many years of experience with language assessment, yet all examiners agreed 100% of the time as to whether a child fell within the lower bounds or upper bounds of the modifiability rating scale. Although concerns may arise about the reliability of a modifiability rating scale that at face value appears to be highly subjective, this study, along with other studies, has reported the interrater reliability of such modifiability ratings, and results have been positive (e.g., Peña, Reséndiz, & Gillam, 2007; Petersen et al., 2017). Dynamic assessment modifiability scores when combined with posttest scores appear to be reliable and less culturally and linguistically biased measures of a child's learning potential. In addition, this learning potential appears to align tightly with the construct of language ability.
The preliminary findings from this study are particularly noteworthy when considered in the context of what was being assessed and the population the sample represented. The dynamic assessment administered in this study measured vocabulary, and the children who were assessed were all culturally and linguistically diverse. Based on prior analyses of language assessments administered to school-age children with and without a language disorder (e.g., Restrepo et al., 2006; Spaulding et al., 2006), it would be expected that such a combination would yield relatively low classification accuracy. This was not the case for this study, and such findings, although preliminary, provide evidence that dynamic assessment can mitigate bias even under some of the most challenging circumstances.

Static Vocabulary Measures

The results of this study add to the extant research on vocabulary assessment and reinforce the general concern among researchers, psychometricians, and clinicians about the poor evidence of validity for static vocabulary measures (e.g., Gray et al., 1999; Restrepo et al., 2006; Spaulding et al., 2006). The case is building against using single-time-point static measures to identify a language disorder, especially when the child being assessed is economically, culturally, or linguistically diverse. Although the results of the static E/ROWPVT vocabulary tests and NDW index from the current study indicated that children with a language disorder did worse on average than the children without a language disorder, there was so much overlap in the standard deviations that sensitivity and specificity were unacceptable. Indeed, mean differences in scores between children with and without a language disorder are reported in most static vocabulary assessment technical reports; nevertheless, standard deviations tell much more of the story and provide information on just how variable scores tend to be within those groups (Spaulding et al., 2006). The validity of a dynamic assessment of inferential word learning stems from its evident capacity to uncover a child's latent ability to learn new words from context. Children may perform poorly on a static vocabulary test for various reasons, whereas a dynamic assessment can provide an opportunity for the child to demonstrate his or her ability to learn to infer word meanings, potentially mitigating such extraneous factors.

Limitations and Future Directions

Further development of a dynamic assessment for vocabulary is necessary to determine the nature of deficits in word-learning processes. Replication and cross-validation of this study with a larger sample size are warranted and will provide more information on the extent to which dynamic assessment of word learning maintains classification accuracy among a different sample of children. Furthermore, this study only included kindergarten through third-grade students, with few students represented across those grades, posing questions about age-related differences in word learning and how processes in vocabulary acquisition may differ depending on task, such as exposure to low-frequency words encountered in educational curriculums. Conducting tasks that are leveled by word frequency will lend further information about the systems that impact vocabulary learning throughout the early years, including comprehension and attentional factors that impede the ability to acquire and retain vocabulary. It is unclear whether the dynamic assessment would yield different results by grade level or whether the dynamic assessment would maintain validity with younger or older students. A larger and more diverse sample size, with a greater representation for each grade level, would facilitate disaggregation of data with more detailed analyses.
It is also possible that a dynamic assessment approach in English only may underestimate some bilingual participants' abilities. The ability to infer word meaning from context is contingent upon knowledge of morphosyntactic cues and knowledge of the words surrounding the unknown word. Children may vary in their ability to use dynamic assessment inference techniques if they have less English knowledge to draw upon in order to make meaning inferences. Among the bilingual participants in this sample, some of the variability in response to the dynamic assessment may be related to differences in English, not to differences in overall language abilities. This could be especially problematic if clinicians employed dynamic assessment with bilingual children in their weaker language and consequently concluded that they have a language impairment when they would have performed differently when tested in their dominant language. However, in all but the most extreme cases, dynamic assessment should mitigate these confounds for the most part because the focus is on learning potential. In other words, even if a child has limited English language proficiency, the design of the dynamic assessment places the examiner's focus on the child's ability to learn from wherever they start at pretest. Only when a child has extremely limited English would the use of another language be necessary—or so it is hypothesized. This line of inquiry should be followed in subsequent studies.
The content of a dynamic assessment of word learning could also differ considerably. Children learn different word classes, such as nouns and verbs, with varying degrees of ease (Froud & van der Lely, 2008). Using different word classes in a dynamic assessment of inferential word learning may yield better (or worse) classification accuracy for children with and without a language disorder. Furthermore, varying the types of words used in the assessment could make the dynamic assessment more comprehensive and functional; for example, including multiple word types could help identify specific vocabulary treatment targets.
Test confounds were also present in the current study, and further development of the test items is necessary. The location of target nonsense words within the dynamic assessment stories may have affected children's ability to focus on those words. For example, the Teaching Phase 1 narrative embedded a target word in the first sentence of the story, whereas the Posttest 2 narrative embedded its target word near the end. Extraneous narrative information not directly related to inferring the meaning of the target word may have impaired children's ability to attend to it. Many participants responded poorly to the teaching phase narrative in which the target word appeared near the beginning. For example, many children incorrectly indicated that “dalat” meant “to break.” Their incorrect inference may have occurred because the contextual information supporting “dalat” (which meant to kick) appeared early in the story, and later information about how something was broken may have foiled their initial fast mapping. Although this confound did not affect posttest scores, judgments of a child's modifiability may be more accurate with test items that are leveled across all stimuli. Further inspection of the proximity of target word presentation within the narrative, the narrative context, and narrative detail could aid in creating narratives and sentences that are sensitive to the location of target words (Cain et al., 2003; Carnine et al., 1984).

Conclusion

This study provides preliminary support for the use of a dynamic assessment of inferential word learning to identify bilingual Spanish/English-speaking children with and without a language disorder. The dynamic assessment procedures used in this study yielded 90% sensitivity and 90.5% specificity, which were superior to the classification results obtained from English and Spanish static receptive and expressive vocabulary assessments. Such findings are promising and important given the increasingly diverse student population in the United States. Vocabulary is a key construct that plays an integral role in reading comprehension and writing, and having tools that accurately identify a vocabulary learning disorder is the first step toward remediating such difficulties.
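For readers who want to see where these percentages come from, sensitivity and specificity follow directly from the sample composition (10 children with a language disorder, 21 with typical language). The Python sketch below assumes the raw counts implied by the reported percentages, namely 9 true positives and 19 true negatives; these exact counts are our inference for illustration, not figures stated by the study.

```python
# Sensitivity/specificity arithmetic consistent with the reported results.
# Assumed counts (inferred, not reported directly): 9 of 10 children with a
# language disorder correctly identified; 19 of 21 typically developing
# children correctly classified.

def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of children with a disorder whom the test correctly flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of typically developing children whom the test correctly clears."""
    return true_neg / (true_neg + false_pos)

sens = sensitivity(true_pos=9, false_neg=1)    # 9/10
spec = specificity(true_neg=19, false_pos=2)   # 19/21

print(f"Sensitivity: {sens:.1%}")   # Sensitivity: 90.0%
print(f"Specificity: {spec:.1%}")   # Specificity: 90.5%
```

Note that with a sample this small, a single reclassified child shifts sensitivity by 10 percentage points, which is one reason the authors call for replication with larger samples.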

Acknowledgments

The authors wish to thank Ashlynn Stevens and Stormy Lupher, who provided meaningful edits to earlier article drafts.

References

Adams, M. J. (2010). Advancing our students' language and literacy: The challenge of complex texts. American Educator, 34(4), 3–11.
Alt, M., Plante, E., & Creusere, M. (2004). Semantic features in fast-mapping: Performance of preschoolers with specific language disorder versus preschoolers with normal language. Journal of Speech, Language, and Hearing Research, 47, 407–420.
Alt, M., & Spaulding, T. (2011). The effect of time on word learning: An examination of decay of the memory trace and vocal rehearsal in children with and without specific language impairment. Journal of Communication Disorders, 44(6), 640–654.
Anaya, J. B., Peña, E. D., & Bedore, L. M. (2018). Conceptual scoring and classification accuracy of vocabulary testing in bilingual children. Language, Speech, and Hearing Services in Schools, 49, 85–97.
Baker, S. K., Simmons, D. C., & Kameenui, E. J. (1998). Vocabulary acquisition: Research bases. In D. C. Simmons & E. J. Kameenui (Eds.), What reading research tells us about children with diverse learning needs: Bases and basics. Mahwah, NJ: Erlbaum.
Baumann, J. F., & Kameenui, E. J. (1991). Research on vocabulary instruction: Ode to Voltaire. In J. Flood, J. M. Jensen, D. Lapp, & J. R. Squire (Eds.), Handbook of research on teaching the English language arts (pp. 604–632). New York, NY: MacMillan.
Beck, I., & McKeown, M. (1991). Conditions of vocabulary acquisition. In R. Barr, M. Kamil, P. Mosenthal, & P. D. Pearson (Eds.), Handbook of reading research (Vol. 2, pp. 789–814). New York, NY: Longman.
Betz, N. E. (1987). Use of discriminant analysis in counseling psychology research. Journal of Counseling Psychology, 34(4), 393–403.
Betz, S. K., Eickhoff, J. R., & Sullivan, S. F. (2013). Factors influencing the selection of standardized tests for the diagnosis of specific language impairment. Language, Speech, and Hearing Services in Schools, 44(2), 133–146.
Brown, M. T., & Wicker, L. R. (2000). Discriminant analysis. In H. E. A. Tinsley & S. D. Brown (Eds.), Handbook of applied multivariate statistics and mathematical modeling (pp. 209–235). San Diego, CA: Academic Press.
Caesar, L. G., & Kohler, P. D. (2009). Tools clinicians use: A survey of language assessment procedures used by school-based speech-language pathologists. Communication Disorders Quarterly, 30(4), 226–236.
Cain, K., Oakhill, J. V., & Elbro, C. (2003). The ability to learn new word meanings from context by school-age children with and without language comprehension difficulties. Journal of Child Language, 30, 681–694.
Camilleri, B., & Botting, N. (2013). Beyond static assessment of children's receptive vocabulary: A dynamic assessment of word learning ability. International Journal of Language & Communication Disorders, 48(5), 565–581.
Camilleri, B., & Law, J. (2007). Assessing children referred to speech and language therapy: Static and dynamic assessment of receptive vocabulary. International Journal of Speech-Language Pathology, 9, 312–322.
Carey, S. (1978). The child as word learner. In M. Halle, J. Bresnan, & G. Miller (Eds.), Linguistic theory and psychological reality (pp. 265–293). Cambridge, MA: MIT Press.
Carnine, D., Kameenui, E. J., & Coyle, G. (1984). Utilization of contextual information in determining the meaning of unfamiliar words. Reading Research Quarterly, 19, 188–204.
Cummins, J. (1994). The acquisition of English as a second language. In K. Spangenberg-Urbschat & R. Pritchard (Eds.), Kids come in all languages: Reading instruction for ESL students (pp. 36–62). Newark, DE: International Reading Association.
Dickinson, D. K. (1984). First impressions: Children's knowledge of words gained from a single experience. Applied Psycholinguistics, 5, 359–373.
Dockrell, J. E., & Campbell, R. N. (1986). Lexical acquisition strategies. In S. Kuczaj & M. Barrett (Eds.), Semantic development. New York, NY: Springer Verlag.
Dollaghan, C. (1985). Child meets word: “Fast mapping” in preschool children. Journal of Speech and Hearing Research, 28(3), 449–454.
Dollaghan, C. A. (1987). Fast mapping in normal and language-impaired children. Journal of Speech and Hearing Disorders, 52, 218–222.
Dunn, L. M., & Dunn, L. M. (1997). Peabody Picture Vocabulary Test–Third Edition (PPVT-III). Circle Pines, MN: AGS.
Dunn, L. M., & Dunn, D. M. (2007). Peabody Picture Vocabulary Test–Fourth Edition (PPVT-IV). Minneapolis, MN: NCS Pearson.
Eisenberg, S. L., & Guo, L. Y. (2013). Differentiating children with and without language impairment based on grammaticality. Language, Speech, and Hearing Services in Schools, 44, 20–31.
Ellis Weismer, S. E., & Evans, J. L. (2002). The role of processing limitations in early identification of specific language disorder. Topics in Language Disorders, 22(3), 15–29.
Ellis Weismer, S. E., & Hesketh, L. J. (1993). The influence of prosodic and gestural cues on novel word acquisition by children with specific language disorder. Journal of Speech and Hearing Research, 36, 1013–1025.
Feuerstein, R., Falik, L., & Feuerstein, R. (1998). Feuerstein's LPAD. In R. Samuda (Ed.), Advances in cross-cultural assessment. Thousand Oaks, CA: Sage.
Ford-Connors, E., & Paratore, J. R. (2015). Vocabulary instruction in fifth grade and beyond: Sources of word learning and productive contexts for development. Review of Educational Research, 85(1), 50–91.
Froud, K., & van der Lely, H. K. (2008). The count–mass distinction in typically developing and grammatically specifically language impaired children: New evidence on the role of syntax and semantics. Journal of Communication Disorders, 41, 274–303.
Gardner, M. (1985). Receptive One-Word Picture Vocabulary Test. Novato, CA: Academic Therapy Publications.
Gardner, M. (1990). Expressive One-Word Picture Vocabulary Test–Revised: Manual and forms (EOWPVT-R). Novato, CA: Academic Therapy Publications.
Gathercole, S. E., Willis, C. S., Emslie, H., & Baddeley, A. (1992). Phonological memory and vocabulary development during the early school years: A longitudinal study. Developmental Psychology, 28(5), 887–898.
Gleason, J. B., & Ratner, N. B. (2009). The development of language (7th ed.). Boston, MA: Pearson.
Gleitman, L. R., & Gleitman, H. (1992). A picture is worth a thousand words, but that's the problem: The role of syntax in vocabulary acquisition. Current Directions in Psychological Science, 1(1), 31–35.
Graves, M. F. (1986). Vocabulary learning and instruction. Review of Research in Education, 13, 49–89.
Gray, S. (2004). Word learning by preschoolers with specific language impairment: Predictors and poor learners. Journal of Speech, Language, and Hearing Research, 47, 1117–1132.
Gray, S., Plante, E., Vance, R., & Henrichsen, M. (1999). The diagnostic accuracy of four vocabulary tests administered to preschool-age children. Language, Speech, and Hearing Services in Schools, 30, 196–206.
Gutiérrez-Clellen, V. F., & Peña, E. (2001). Dynamic assessment of diverse children: A tutorial. Language, Speech, and Hearing Services in Schools, 32, 212–224.
Gutiérrez-Clellen, V. F., Restrepo, M. A., Bedore, L. M., Peña, E. D., & Anderson, R. T. (2000). Language sample analysis in Spanish-speaking children: Methodological considerations. Language, Speech, and Hearing Services in Schools, 31(1), 88–98.
Hart, B., & Risley, T. R. (1992). American parenting of language-learning children: Persisting difference in family–child interactions observed in natural home environments. Developmental Psychology, 28(6), 1096–1105.
Hiebert, E. H., & Kamil, M. L. (2005). Teaching and learning vocabulary: Bringing research to practice. Mahwah, NJ: Erlbaum.
Hoff, E. (2003). The specificity of environmental influence: Socioeconomic status affects early vocabulary development via maternal speech. Child Development, 74, 1368–1378.
Hughes, G., & Chinn, C. (1986). Building reading vocabulary through inference: A better classification of context clues. In B. Snyder, W. H. Bartz, & J. B. Goepper (Eds.), Second language acquisition: Preparing for tomorrow (pp. 93–108). Lincolnwood, IL: National Textbook Company.
Kan, P. F., & Windsor, J. (2010). Word learning in children with primary language impairment: A meta-analysis. Journal of Speech, Language, and Hearing Research, 53(3), 739–756.
Kapantzoglou, M., Restrepo, M. A., & Thompson, M. (2011). Dynamic assessment of word learning skills: Identifying language disorder in bilingual children. Language, Speech, and Hearing Services in Schools, 43, 81–96.
Ko, S., & Hwang, M. (2008). The effects of discourse length on inference abilities of children with specific language impairment. Communication Sciences Disorders, 12(1), 86–102.
Lantolf, J. P., & Poehner, M. E. (2004). Dynamic assessment: Bringing the past into the future. Journal of Applied Linguistics, 1, 49–74.
Larsen, J. A., & Nippold, M. A. (2007). Morphological analysis in school-age children: Dynamic assessment of a word learning strategy. Language, Speech, and Hearing Services in Schools, 38, 201–212.
Leonard, L. B., Davis, J., & Deevy, P. (2007). Phonotactic probability and past tense use by children with specific language impairment and their typically developing peers. Clinical Linguistics & Phonetics, 21, 747–758.
Lidz, C. S. (1987). Dynamic assessment: An interactional approach to evaluating learning potential. New York, NY: Guilford.
Lidz, C. S. (1991). Practitioner's guide to dynamic assessment. New York, NY: Guilford.
Lochhead, J. (2001). Think back: A user's guide to minding the mind. Mahwah, NJ: Erlbaum.
Martin, N., & Brownell, R. (2012a). Expressive One-Word Picture Vocabulary Test–Fourth Edition: Spanish (EOWPVT-IV: Spanish) [Measurement instrument]. Austin, TX: Pro-Ed.
Martin, N., & Brownell, R. (2012b). Receptive One-Word Picture Vocabulary Test–Fourth Edition: Spanish (ROWPVT-IV: Spanish) [Measurement instrument]. Austin, TX: Pro-Ed.
Mayer, M. (1969). Frog, where are you? New York, NY: Dial Press.
McGregor, K. K., Newman, R. M., Reilly, R. M., & Capone, N. C. (2002). Semantic representation and naming in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 45, 998–1014.
Miller, J. F., & Iglesias, A. (2008). Systematic analysis of language transcripts (SALT, Bilingual SE Version 2008) [Computer software]. Madison, WI: SALT Software.
Miller, J. F., & Iglesias, A. (2012). Systematic analysis of language transcripts (SALT, Research Version 2012). Middleton, WI: SALT Software.
Nagy, W. E., & Scott, J. A. (2000). Vocabulary processes. In M. Kamil, P. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. 3, pp. 269–284). Mahwah, NJ: Erlbaum.
Nagy, W. E., & Townsend, D. (2012). Words as tools: Learning academic vocabulary as language acquisition. Reading Research Quarterly, 47(1), 91–108.
Nation, P., & Waring, R. (1997). Vocabulary size, text coverage and word lists. In N. Schmitt & M. McCarthy (Eds.), Vocabulary: Description, acquisition and pedagogy (pp. 6–19). Cambridge, United Kingdom: Cambridge University Press.
Oetting, J. B. (1999). Children with SLI use argument structure cues to learn verbs. Journal of Speech, Language, and Hearing Research, 42, 1261–1274.
Oetting, J. B., Rice, M. L., & Swank, L. K. (1995). Quick incidental learning (QUIL) of words by school-age children with and without SLI. Journal of Speech and Hearing Research, 38, 434–445.
Paul, R. (2007). Language disorders from infancy through adolescence: Assessment and intervention (3rd ed.). St. Louis, MO: Mosby.
Peña, E. D. (2000). Measurement of modifiability in children from culturally and linguistically diverse backgrounds. Communication Disorders Quarterly, 21(2), 87–97.
Peña, E. D., Gillam, R. B., & Bedore, L. M. (2014). Dynamic assessment of narrative ability in English accurately identifies language disorder in English language learners. Journal of Speech, Language, and Hearing Research, 57(6), 2208–2220.
Peña, E. D., Gillam, R. B., Malek, M., Ruiz-Felter, R., Resendiz, M., Fiestas, C., & Sabel, T. (2006). Dynamic assessment of school-age children's narrative ability: An experimental investigation of classification accuracy. Journal of Speech, Language, and Hearing Research, 49, 1037–1057.
Peña, E. D., Iglesias, A., & Lidz, C. S. (2001). Reducing test bias through dynamic assessment of children's word learning ability. American Journal of Speech-Language Pathology, 10, 138–154.
Peña, E. D., Quinn, R., & Iglesias, A. (1992). The application of dynamic methods to language assessment: A nonbiased procedure. Journal of Special Education, 26, 269–280.
Peña, E. D., Reséndiz, M., & Gillam, R. B. (2007). The role of clinical judgments of modifiability in the diagnosis of language disorder. Advances in Speech-Language Pathology, 9, 332–345.
Petersen, D. B., Chanthongthip, H., Ukrainetz, T. A., Spencer, T. D., & Steeve, R. W. (2017). Dynamic assessment of narratives: Efficient, accurate identification of language disorder in bilingual students. Journal of Speech, Language, and Hearing Research, 60, 983–998.
Petersen, D. B., Gragg, S. L., & Spencer, T. D. (2018). Predicting reading problems 6 years into the future: Dynamic assessment reduces bias and increases classification accuracy. Language, Speech, and Hearing Services in Schools, 49(4), 875–888.
Pew Research Center Publications. (2008). Latinos account for half of U.S. population growth since 2000. Retrieved from http://pewresearch.org/pubs/1002/latino-population-growth
Press, S. J., & Wilson, S. (1979). Choosing between logistic regression and discriminant analysis. Santa Monica, CA: RAND.
Ram, G., Marinellie, S. A., Benigno, J., & McCarthy, J. (2013). Morphological analysis in context versus isolation: Use of a dynamic task with school-age children. Language, Speech, and Hearing Services in Schools, 44, 32–47.
Restrepo, M. A. (1998). Identifiers of predominantly Spanish-speaking children with language impairment. Journal of Speech, Language, and Hearing Research, 41, 1398–1411.
Restrepo, M. A., Schwanenflugel, P. J., Blake, J., Neuharth-Pritchett, S., Cramer, S. E., & Ruston, H. P. (2006). Performance on the PPVT-III and the EVT: Applicability of the measures with African-American and European American preschool children. Language, Speech, and Hearing Services in Schools, 37(1), 17–27.
Rice, M. L., Buhr, J. C., & Nemeth, M. (1990). Fast mapping word learning abilities of language delayed preschoolers. Journal of Speech and Hearing Disorders, 55, 33–42.
Rice, M. L., Cleave, P. L., & Oetting, J. B. (2000). The use of syntactic cues in lexical acquisition by children with specific language impairment. Journal of Speech, Language, and Hearing Research, 43, 582–594.
Rice, M. L., Oetting, J. B., Marquis, J., Bode, J., & Pae, S. (1994). Frequency of input effects on SLI children's word comprehension. Journal of Speech and Hearing Research, 37, 106–122.
Rojas, R., & Iglesias, A. (2009). Making a case for language sampling: Assessment and intervention with (Spanish–English) second language learners. The ASHA Leader, 14(3), 10–13.
Roseberry-McKibbin, C. (2002). Multicultural students with special language needs: Practical strategies for assessment and intervention (2nd ed.). Oceanside, CA: Academic Communication Associates.
Spaulding, T. J., Plante, E., & Farinella, K. A. (2006). Eligibility criteria for language disorder: Is the low end of normal always appropriate? Language, Speech, and Hearing Services in Schools, 37, 61–72.
Stockman, I. J. (2000). The new Peabody Picture Vocabulary Test–III: An illusion of unbiased assessment. Language, Speech, and Hearing Services in Schools, 31(4), 340–353.
Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Upper Saddle River, NJ: Pearson Education.
Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). Stanford-Binet Intelligence Scale–Fourth Edition. Itasca, IL: Riverside Publishing.
Ukrainetz, T. A. (2005). What to work on how: An examination of the practice of school-age language intervention. Contemporary Issues in Communication Sciences and Disorders, 32, 108–119.
Ukrainetz, T. A., Harpell, S., Walsh, C., & Coyle, C. (2000). A preliminary investigation of dynamic assessment with Native American kindergartners. Language, Speech, and Hearing Services in Schools, 31, 142–154.
U.S. Census Bureau. (2010). 2010 American community survey. Retrieved from http://factfinder.census.gov/
U.S. Census Bureau. (2011). Language use in the United States: 2011 (Report No. ACS-22). Retrieved from https://www.census.gov/library/publications/2013/acs/acs-22.html
van Kleeck, A. (2008). Providing preschool foundations for later reading comprehension: The importance of and ideas for targeting inferencing in book-sharing interventions. Psychology in the Schools, 45(6), 627–643.
van Kleeck, A., Gillam, R. B., Hamilton, L., & McGrath, C. (1997). The relationship between middle-class parents' book-sharing discussion and their preschoolers' abstract language development. Journal of Speech, Language, and Hearing Research, 40, 1261–1271.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
Wallace, G., & Hammill, D. D. (2002). Comprehensive Receptive and Expressive Vocabulary Test–Second Edition (CREVT-2). Austin, TX: Pro-Ed.
Washington, J. A., & Craig, H. K. (1999). Performances of at-risk, African American preschoolers on the Peabody Picture Vocabulary Test–III. Language, Speech, and Hearing Services in Schools, 30(1), 75–82.
Watkins, R. V., Kelly, D. J., Harbers, H. M., & Hollis, W. (1995). Measuring children's lexical diversity: Differentiating typical and impaired language learners. Journal of Speech and Hearing Research, 38, 1349–1355.
Waxman, S. R., & Markow, D. B. (1998). Object properties and object kind: Twenty-one-month-old infants' extension of novel adjectives. Child Development, 69, 1313–1329.
Wells, G. (1986). The meaning makers: Children learning language and using language to learn. Portsmouth, NH: Heinemann Educational Books.
Westby, C. (1985). Learning to talk—Talking to learn: Oral-literate language differences. In C. S. Simon (Ed.), Communication skills and classroom success: Therapy methodologies for language-learning disabled students (pp. 181–213). San Diego, CA: College-Hill Press.
Westby, C., & Torres-Velasquez, D. (2000). Developing scientific literacy: A sociocultural approach. Remedial and Special Education, 21(2), 101–110.
Whimbey, A., & Lochhead, J. (1986). Problem solving and comprehension. Hillsdale, NJ: Erlbaum.
Williams, K. T. (1997). Expressive Vocabulary Test. Circle Pines, MN: AGS.
Williams, K. T. (2007). Expressive Vocabulary Test–Second Edition (EVT-2). San Antonio, TX: Pearson.
Wolter, J. A., & Pike, K. (2015). Dynamic assessment of morphological awareness and third-grade literacy success. Language, Speech, and Hearing Services in Schools, 46, 112–126.

Appendix A

Sample Nonsense Word and Story From Dynamic Assessment of Inferential Word Learning

Narrative-Based Pretest 1
Say, I'm going to tell you a short story. Please listen carefully. There is a new word in this story. When I'm done, I'm going to ask you about the new word.
Read the story to the child word for word using a moderate pace.
Wash [punup] /pənəp/
Last week, Juan was playing soccer in the backyard. But when he ran to kick the ball, Juan suddenly slipped and fell in a mud puddle that was really deep. He was sad because his new clothes got wet and dirty. Then Juan decided to go inside and tell his mom that his clothes needed to be punuped because they were dirty. After Juan went inside his house he nicely said to his mom, “I need your help, I am so dirty!” Juan's mom said, “I need to punup your clothes!” Then Juan's mom took care of his clothes and he stayed inside and was happy because he wasn't dirty.

Appendix B

Sample Scoring Rubric for the Dynamic Assessment of Inferential Word Learning

Pretest 1 Narrative-Based Scoring Procedure
Say, What does punup mean?
Examiner should wait 5–10 seconds before providing encouragement.
If the child is reluctant, or says that they do not know the answer, the examiner encourages by saying “It's okay. You can guess.”
Points and corresponding child responses:
2 points:
  Target word or synonym: wash, clean
  Complete and clear definition: To get the dirt (mud) off.
  Example with definition: To wash it in the washing machine.
1 point:
  Incomplete or unclear definition: To get the stuff off. To take the stuff out.
  Example without definition: To put it in the washing machine.
  Gesture without verbal response
0 points:
  Incorrect response: To pick up, to take off.
  No response
Pretest 1 Picture-Based Scoring Procedure
Pretest 1 Supplemental Sentence-Based Scoring Procedure
(Miguel got the kamarin score on the test. He was really sad.)
Say, What does kamarin mean?
Examiner should wait 5–10 seconds before providing encouragement.
If the child is reluctant, or says that they do not know the answer, the examiner encourages by saying “It's okay. You can guess.”
Points and corresponding child responses:
2 points:
  Target word or synonym: lowest, worst
  Complete and clear definition: To get the lowest score.
  Example with definition: To get the lowest score on a test.
1 point:
  Incomplete or unclear definition: A low score.
  Example without definition: To do bad on a test.
  Gesture without verbal response
0 points:
  Incorrect response: best, highest.
  No response

Appendix C

Teaching Phase 1 Standardized Procedures

Teaching Phase 1
Say, “Let's try it again with a different story. Please listen carefully. Try to figure out what dalat means. This time, listen to the words around dalat for clues about what dalat means.”
Read the story to the child word for word using a moderate pace.
Kick [dalat] /dəlæt/
Yesterday, Gabby was playing soccer in her house with her friend, but when Gabby dalated the ball, it knocked a big beautiful picture off the wall. She was scared because it was a picture that her mother really loved. She was not supposed to dalat the ball in her house because things could get broken. Then Gabby quickly threw the ball outside and decided to nicely ask her tall friend to put the picture back up. She said, “Will you please put the picture back up?” Then Gabby's friend said, “No problem!” After Gabby's friend put the picture up, they went outside. Gabby was happy because the picture wasn't broken.
Lesson 1
Say, Let's try to find clues around dalat to help us figure out what dalat means. Listen to how I figure out what a word means.
“I'm going to listen to the words around dalat and see if they help me understand dalat.”
“Gabby was playing soccer in her house with her friend, but when Gabby dalated the ball, it knocked a big beautiful picture off the wall.”
“What are the words around dalat that can help us know what it means?”
Wait 5–10 seconds for response. Say, “Well, Gabby was playing soccer and did something to knock a picture off the wall.”
“What do you think Gabby did with the ball to knock the picture off the wall?”
Score response and provide reinforcement: 0, 1, 2.
Continue teaching procedure
Lesson 2
“There are more words in the story that can help us figure out what dalat means. I remember that the story said: She was not supposed to dalat the ball in her house because things can get broken.”
“Listen carefully to the words around dalat.”
“So Gabby was playing soccer and dalated the ball and a picture fell. It also said she wasn't supposed to dalat the ball in her house because things might break.”
“What did the words around dalat tell you? What does dalat mean?”
Score response: Graduated Prompting Modifiability Score: 0, 1, 2
Provide reinforcement and check for meaning in context.
If the child responds incorrectly (0, 1) provide an open-ended question in attempt to elicit the target.
“If you play soccer and knock a picture down, what did you do to the ball?” Check for meaning in context.
Check for Meaning in Context
“I think dalat means to kick. Does that make sense? Let's check. Let's use the word kick instead of dalat. When Gabby kicked the ball, it knocked a big, beautiful picture off the wall. Does that make sense? Ok. Now it's your turn to put that word where dalat was. She was not supposed to ___________ the ball in her house because things could get broken. Wow! I think that makes sense. We figured it out.”

Appendix D

Postteaching Modifiability Scale

The examiner is to complete this modifiability rating scale immediately after the administration of the dynamic assessment. A Total Modifiability Index (TMI) is calculated by adding the points awarded for each of the seven areas. The Mod-7 modifiability score is derived from Item 7.
Each of Items 1–6 is rated 2 (most of the time), 1 (some of the time), or 0 (none of the time).

1. The child was responsive to prompts.
2. The child displayed transfer as the testing continued.
3. The child attended to the testing/teaching.
4. The child was easy to teach.
5. The child did not display frustration.
6. The child did not disrupt the testing/teaching.
7. Overall modifiability rating (Mod-7): What is your overall judgment of the child's potential to learn vocabulary through listening to stories? Good = 2, Average = 1, Poor = 0.

Total Modifiability Index (TMI; sum of Items 1–7): __________
Mod-7 score (Item 7): __________
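Because the TMI is a simple sum of the seven item ratings and the Mod-7 is Item 7 alone, the scoring can be sketched in a few lines. The following Python illustration uses hypothetical function and variable names of our own; it is not part of the published protocol.

```python
# Compute the Total Modifiability Index (TMI) and Mod-7 score from the seven
# item ratings on the Postteaching Modifiability Scale (each rated 0, 1, or 2).
# Function and variable names are illustrative, not from the original scale.

def score_modifiability(ratings: list[int]) -> tuple[int, int]:
    """Return (TMI, Mod-7) given ratings for Items 1-7 in order."""
    assert len(ratings) == 7 and all(r in (0, 1, 2) for r in ratings)
    tmi = sum(ratings)   # TMI: sum of all seven items (possible range 0-14)
    mod7 = ratings[6]    # Mod-7: the overall modifiability rating, Item 7 alone
    return tmi, mod7

# Example: a child rated 2 on most items, 1 on frustration, "Good" overall.
tmi, mod7 = score_modifiability([2, 2, 2, 2, 1, 2, 2])
print(tmi, mod7)  # 13 2
```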

Appendix E

Dynamic Assessment of Inferential Word-Learning Procedures

Information & Authors

Published In

Language, Speech, and Hearing Services in Schools, Volume 51, Number 1, January 2020, Pages 144–164. PubMed: 31855610.

History

  • Received: Nov 5, 2018
  • Revised: Mar 18, 2019
  • Accepted: Aug 30, 2019
  • Published online: Dec 17, 2019
  • Published in issue: Jan 8, 2020


Authors

Affiliations

Douglas B. Petersen
Department of Communication Disorders, Brigham Young University, Provo, UT
Penny Tonn
Trina D. Spencer
Department of Child & Family Studies, University of South Florida, Tampa
Matthew E. Foster
Department of Child & Family Studies, University of South Florida, Tampa

Notes

Disclosure: The authors have declared that no competing interests existed at the time of publication.
Correspondence to Douglas B. Petersen: [email protected]
Editor-in-Chief: Shelley Gray
Editor: Mary Alt

