Open Access | Language, Speech, and Hearing Services in Schools | Research Article | 2 Oct 2020

The Access to Literacy Assessment System for Phonological Awareness: An Adaptive Measure of Phonological Awareness Appropriate for Children With Speech and/or Language Impairment



    The Access to Literacy Assessment System–Phonological Awareness (ATLAS-PA) was developed for use with children with speech and/or language impairment. The subtests (Rhyming, Blending, and Segmenting) are appropriate for children who are 3–7 years of age. ATLAS-PA is composed entirely of receptive items, incorporates individualized levels of instruction, and is adaptive in nature.


    To establish the construct validity of ATLAS-PA, we collected data from children with typical development (n = 938) and those who have speech and/or language impairment (n = 227).


    Rasch analyses indicated that items fit well together and formed a unidimensional construct of phonological awareness. Differential item functioning was minimal between the two groups of children, and scores on ATLAS-PA were moderately to strongly related to other measures of phonological awareness. Information about item functioning was used to create an adaptive version of ATLAS-PA.


    Findings suggest that ATLAS-PA is a valid measure of phonological awareness that can be used with children with typical development and with speech and/or language impairment. Its adaptive format minimizes testing time and provides opportunities for monitoring progress in preschool and early elementary classrooms.

    Supplemental Material

    Children with a primary speech and/or language impairment account for 43% of those receiving special education services within schools (U.S. Department of Education, 2017). In addition, many more children require services due to a different primary disability such as cerebral palsy, Down syndrome, or autism, in which there are often associated speech and/or language impairments. Children with speech and/or language impairments regularly struggle to meet educational goals related to literacy achievement beginning in preschool and kindergarten (Anthony et al., 2011; Justice et al., 2009; Pentimonti et al., 2016), often in the area of phonological awareness (Catts et al., 2002; Pentimonti et al., 2016). Phonological awareness (PA), the understanding of the sound structure of language, is an important skill that predicts children's later literacy knowledge (e.g., Lerner & Lonigan, 2016; Lonigan et al., 2008; Wagner et al., 1994). The National Early Literacy Panel (2008) identified PA as one of the most consistent predictors of later literacy achievement for preschoolers, even when considering the contributions of IQ and socioeconomic status. A challenge for understanding individual differences in PA knowledge is the lack of assessments designed for children with speech and/or language impairments. Children with speech and/or language impairments may need assessments that include adaptations for accessibility, including the use of explicit instructions and minimal testing time, using a framework that minimizes unnecessary distractors during the testing process. In this article, we describe the development and validation of the Access to Literacy Assessment System–Phonological Awareness (ATLAS-PA), a new, adaptive measure of PA tailored for children with speech and/or language impairment that is administered using a web-based browser.

    Development of PA

    PA skills emerge early, develop rapidly throughout early childhood, and have strong implications for later literacy achievement, making them an ideal candidate for frequent assessment (Moyle et al., 2013). Individual differences in PA ability can be observed during early childhood and remain fairly consistent over time (e.g., Lonigan et al., 1998; Wagner et al., 1994). For children with typical development, PA seems to benefit from children's tendency to play with language, experiences with print and print-related concepts, and high-quality formal reading instruction (Snow et al., 1998; Torgesen et al., 1994; Troia et al., 1998).

    Children exhibit their knowledge of PA initially by manipulating larger units of sound (e.g., words) and then progressing until they are also able to perform tasks that require parsing words at the level of the phoneme (Anthony & Lonigan, 2004). There is evidence to suggest that further advances in PA are supported by other early reading skills, such that PA provides a pathway through which these abilities build upon themselves (e.g., Cassar & Treiman, 2004; Ehri & Snowling, 2004; Torgesen et al., 1994). Children are commonly asked to perform PA tasks during preschool and the early elementary grades that focus on rhyming (e.g., Anthony & Lonigan, 2004), as well as syllable and sound blending and segmentation (Lonigan et al., 2009). Importantly, these different types of tasks appear to reflect the same latent trait, given prior work in this area suggesting that PA is unidimensional for young children who exhibit typical development (Anthony & Lonigan, 2004; Anthony et al., 2002; Schatschneider et al., 1999).

    As it is for children with typical development, PA is a necessary precursor for functional reading for all students with disabilities, even when the disability is associated with moderate to severe developmental delays that can influence speech output (Browder et al., 2009). There is an abundance of research showing that children with speech and/or language impairment, autism, Down syndrome, and cerebral palsy commonly exhibit lower levels of PA when compared to their peers with typical development (Dessemontet et al., 2017; Dynia et al., 2019; Næss et al., 2012; Peeters et al., 2009; Thatcher, 2010). A range of reasons may contribute to lower PA skills in children with speech and/or language impairment relative to children with typical development. Challenges may reflect difficulties in figuring out the meaningful sound patterns represented in speech (Preston & Edwards, 2010). Lower speech abilities may also contribute to the differences in PA, as verbal speech allows children to play out loud with the sounds of language, facilitating the development of PA; such experiences may be more limited for children with some types of disabilities, thus minimizing their opportunities to develop skills in this area (Peeters et al., 2009). These findings highlight the need for more accessible PA assessments that can evaluate knowledge for children with a range of linguistic needs and capabilities.

    As with children who develop typically, PA is a significant correlate of early literacy skills for preschoolers with speech sound disorders (Rvachew & Grawburg, 2006) and a predictor of later decoding for many children with disabilities (Dynia et al., 2017; Tambyraja et al., 2015). Even students with remediated speech sound disorders continue to have lower literacy scores in late elementary and early middle school when compared to students who were developing typically (Farquharson, 2015). Findings point to the enduring importance of being able to store and manipulate phonological information using working memory, a skill that appears to be challenging for those who have speech sound disorders (Anthony et al., 2011; Farquharson et al., 2018).

    Importantly, students with disabilities appear to profit from reading instruction that includes attention to PA (Lemons & Fuchs, 2010), including young children with speech and/or language impairment (Skibbe et al., 2011). Experts generally agree that preventing reading difficulties is easier and more cost-effective than working to remediate reading challenges later in a student's career (Francis et al., 1996); ATLAS-PA is thus designed for children attending preschool and early elementary grades to reflect the need to assess children's PA earlier in their school careers.

    Issues With Current Assessments of PA

    Despite the significance of this core early literacy skill, there is a dearth of standardized and validated tools of PA for children with speech and/or language impairment (e.g., Barker et al., 2014; Iacono & Cupples, 2004). The lack of PA assessments that have considered the needs of children with speech and/or language impairment can create barriers to testing within schools (Thurlow, 2010). Some researchers have simply utilized existing assessments of PA even though they may not be valid for children with speech and/or language impairment (e.g., Hesketh, 2004). Others have created their own assessment items with unknown psychometric properties and unclear score interpretations (Dahlgren Sandberg, 2006; Iacono & Cupples, 2004; Vandervelden & Siegel, 2001), or adapted existing assessments without addressing the psychometric characteristics of the adapted version (e.g., Card & Dodd, 2006). These ad hoc adaptations do not necessarily function identically to the original versions, and the adapted formats can have important ramifications for research findings (Dahlgren Sandberg, 2001; Peeters et al., 2009, 2008). For example, Card and Dodd (2006) compared the phonological abilities of children with cerebral palsy who are unable to speak with those who do speak and found between-group differences using some, but not all, types of testing formats. Inconsistent findings on the role of speech in PA ability have led some researchers to consider other aspects of development, such as IQ, to explain findings (Peeters et al., 2008). As a result, without a measure of PA that is validated for children with speech and/or language impairments, we do not have a clear understanding about PA for this group of children and may in fact be misestimating their skills and underestimating their overall cognitive abilities. 
An accurate assessment of PA skills for students with speech and/or language impairment is critical to hold schools accountable for providing effective literacy instruction to all children (Lemons et al., 2012), especially since the early years of special education often put little emphasis on literacy instruction (Browder et al., 2006). Accurate early language measures help answer the call for educators to monitor academic progress for those receiving specialized services within schools (Lemons et al., 2018), but this can be challenging without the appropriate tools.

    The purpose of ATLAS-PA is to provide a general measure of children's PA skill while also allowing professionals to monitor children's progress in this area over time. In addition to measuring PA skills for children with typical development, ATLAS-PA is designed to be used with children who have an educational identification requiring speech-language services within their schools; however, it is not intended to be used to make a clinical determination of a speech and/or language impairment (see Ireland & Conrad, 2016, for more information about the distinction between these two purposes). This is accomplished by measuring skills commonly incorporated into curricula and targeted as part of interventions in this area (e.g., blending activities included in the Promoting Awareness of Speech Sounds curricula; Roth et al., 2006). Curriculum-based measures, such as ATLAS-PA, are useful progress-monitoring tools that are easy to administer, cost-effective, and able to show development over time. There are several other curriculum-based measures targeting PA currently on the market (e.g., Individual Growth and Development Indicators [IGDIs], Dynamic Indicators of Basic Early Literacy Skills [DIBELS], Phonological Awareness Literacy Screening-PreK), but these were designed for children exhibiting typical patterns of development. This can be problematic, particularly as many curriculum-based measures do not accurately capture PA ability levels for students with speech and/or language impairment; in addition, many are not appropriate for children younger than 4 years of age (Invernizzi et al., 2010; Missall et al., 2006). ATLAS-PA is an adaptive measure of PA, which is made possible by administration using a web-based browser (see Chapelle & Douglas, 2006). 
Children with disabilities are often taught early reading skills, including areas related to PA, using technology-supported methods of instruction (Grindle et al., 2013; Koppenhaver et al., 2007), so this approach aligns well with current instructional practices in the field.

    Research Questions

    The goal of this study is to describe the development process of ATLAS-PA and to provide validity evidence for ATLAS-PA with a large-scale validation study involving children with speech and/or language impairment as well as children exhibiting typical development. In particular, using a Rasch measurement approach, we considered four research questions to examine construct validity through internal structure and relations to other variables as sources of validity evidence (AERA/APA/NCME, 2014):

    1. Is the ATLAS-PA assessing a unidimensional construct? We hypothesize that, similar to other work in this area (Anthony & Lonigan, 2004; Anthony et al., 2002; Schatschneider et al., 1999), ATLAS-PA would represent a unidimensional construct of PA for children within this age range.

    2. Do the ATLAS-PA items function as expected within the Rasch measurement modeling framework? We anticipate that, based on typical Rasch fit statistics, most items will fit the model well.

    3. Are there any differences in ATLAS-PA item performance attributable to ability group? It is hypothesized that children with speech and/or language impairment would display lower levels of PA when compared to children with typical development, but that the items would function similarly across the groups.

    4. How do scores on the ATLAS-PA relate to scores on other measures of PA? We expect the correlations between the ATLAS-PA and other PA assessments to be consistent with reported correlations among current PA assessments as reported in other validation studies (e.g., .5–.6 as reported by Lonigan et al., 2007).


    Instrument Development

    ATLAS-PA was developed by a research team consisting of experts in early childhood language and literacy development, speech-language pathology, and psychometrics, using a rigorous iterative process involving a panel of early educators, extensive pilot testing, and a large-scale validation study. The first step was to determine which aspects of PA the assessment would target and which critical features to include in the measure. We convened a panel of 10 early childhood educators to help identify features that would make the measure useful and valid for children with speech and/or language impairment. Everyone on the panel was associated with one of two university preschools, which serve approximately 200 children and train preservice teachers; participants included the director of the preschools, the associate director of one of the preschools, and eight early childhood educators. All educators had at least a bachelor's degree. Classrooms served children eligible for Head Start or the Great Start Readiness Program, those whose families paid tuition, and those who were receiving early childhood special education services.

    Consistent with the panel's reported classroom practice, we chose to focus ATLAS-PA on three PA skills: rhyming, blending, and segmenting. In consultation with the panel and best practices in special education, we identified four critical features to include in ATLAS-PA: items with concrete content that do not require speech output, design features that increase accessibility and reduce self-regulatory demands, explicit individualized instructions, and the use of adaptive algorithms to minimize testing time. Below, we provide more information about each of the unique design features of ATLAS-PA. We then present an item calibration and construct validation study to ensure that the items work together to validly measure PA (Embretson, 1983); as part of this process, we provide preliminary evidence that scores from ATLAS-PA relate to other measures of PA. Finally, we describe the adaptive algorithms implemented to minimize testing time.

    Item Development

    We developed 120 items to represent the three target areas of PA: rhyming, blending, and segmenting. In order to be appropriate for children with speech and/or language impairment, ATLAS-PA relies entirely on nonverbal response options and only includes items requiring selection among alternative responses (i.e., multiple choice). Although some existing measures employ multiple choice (e.g., Test of Preschool Early Literacy [TOPEL]; Lonigan et al., 2007) or forced choice (e.g., beginning sounds subtest of the Phonological Awareness Literacy Screening; Invernizzi et al., 2003) on some items or subtests relevant for young children, no well-validated and normed measure is based on selection in its entirety for our intended age range. Besides allowing for nonverbal responses, the multiple-choice format offers many advantages such as efficiency and standardization of administration and scoring; however, there are well-known issues with guessing or chance responding on such a test format. To address this problem, ATLAS-PA accounts for guessing using a response cutoff technique available in a Rasch measurement approach (Andrich et al., 2012; Gershon, 1992; Linacre, 2018c).
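The response cutoff idea can be illustrated with a minimal sketch. It assumes the dichotomous Rasch model and assumes, purely for illustration, that a response is set aside when the model-expected probability of success falls below chance on a three-option item (1/3). The threshold, the function names, and the example values are hypothetical; the article does not state the cutoff used for ATLAS-PA.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def apply_low_cutoff(responses, theta, difficulties, cutoff=1 / 3):
    """Return responses with likely guesses treated as missing (None).

    A response is kept only when the expected probability of success is at
    least `cutoff`. The default of 1/3 (chance on three options) is an
    assumption made for this sketch, not a value taken from the article.
    """
    trimmed = []
    for response, difficulty in zip(responses, difficulties):
        p = rasch_probability(theta, difficulty)
        trimmed.append(response if p >= cutoff else None)
    return trimmed


# Hypothetical child with ability 0.0 logits answering three items.
trimmed = apply_low_cutoff([1, 1, 0], theta=0.0, difficulties=[-1.0, 0.0, 2.0])
```

In this sketch the third item is far too hard for the hypothetical child, so its response is ignored rather than scored, which keeps lucky guesses on very difficult items from inflating the ability estimate.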

    For each test item, there are three illustrations along the bottom representing three response options. For rhyming items, a pictorial representation of the target word is also displayed in the center top of the screen. Items are presented using a plain, blank background; prior work has shown that some children with speech and/or language impairment may have challenges attending to literacy materials without getting distracted by irrelevant stimuli (Thompson et al., 2019), so we made the item response options as salient on the display as possible. All test items were implemented electronically using a tablet, similar to what has been done with other adaptive measures of language (Chapelle & Douglas, 2006).

    Test items were created by a panel of four experts, including two speech-language pathologists, one expert on early childhood language and literacy development, and one psychometrician with an applied focus on early childhood language and literacy. Possible response options were balanced for consonant and vowel diversity. The target words were selected to be concrete and familiar to preschool children. Existing assessments and curricula of PA were reviewed to create a pool of possible words to use for target words and response options. All possible words were evaluated using the MRC Psycholinguistic Database (Coltheart, 1981), which rates words on a scale from 100 to 700, with larger values indicating a higher level of concreteness. We intentionally chose more concrete words that also were highly imageable based on the judgment of seven scholars, including those with expertise in disabilities related to speech and/or language development, and removed any words that were considered to be less salient from the overall pool of words to be used when creating items. ATLAS-PA includes 279 words (target and response options) ranging from 365 to 670 in levels of concreteness (M = 589.65, SD = 39.56). Words were also analyzed according to the age at which they are typically acquired, using ratings from Kuperman et al. (2012). The words used were acquired, on average, during preschool (M = 4.83 years old, SD = 1.18 years).

    The audio files for the items were recorded in a sound studio by a voice actor with a Midwestern accent. The stem for each item type (e.g., What rhymes with cat?) is presented, and then each of the three response options is named as its associated picture is highlighted with a thick black outline. Response time is not considered when calculating a child's score, as some persons with disabilities need additional time to process and respond to test items (J. N. Kaufman et al., 2014). If the child does not respond to a particular item after 5 s, the entire item, including stem and response options, is repeated. A slower item presentation process is available at the discretion of the test administrator, with slower speech and greater intervals between response options.

    ATLAS-PA allows for children to respond to items immediately after they hear the test question to minimize testing time and maximize engagement; pilot data with a small sample of children (n = 20) indicated that performance did not change when children were allowed to respond before response options were labeled. However, this approach required us to ensure that the pictorial representations of response options were as clear as possible. To test the degree to which our illustrations elicited the intended vocabulary, we piloted the images in four locations across the United States (California, Michigan, Pennsylvania, Texas). Twenty children with typical development at each site (n = 80 overall) were asked to label all of the illustrations verbally. If more than four children across sites provided the same unintended response (e.g., “leg” for knee), the illustration was edited for clarity and retested.
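The illustration review rule described above can be sketched as follows. This is a minimal illustration that assumes labels are pooled across sites and compared after lowercasing; the function name, data layout, and example labels are hypothetical and are not taken from the study materials.

```python
from collections import Counter


def flag_illustrations(labels_by_image, intended, max_same_unintended=4):
    """Return images for which more than `max_same_unintended` children
    gave the same unintended label (pooled across sites)."""
    flagged = []
    for image, labels in labels_by_image.items():
        target = intended[image].lower()
        unintended = Counter(
            label.strip().lower()
            for label in labels
            if label.strip().lower() != target
        )
        # Flag only when a single wrong label recurs too often.
        if unintended and max(unintended.values()) > max_same_unintended:
            flagged.append(image)
    return flagged


# Hypothetical pooled responses from 80 children per image.
labels = {
    "knee.png": ["leg"] * 5 + ["knee"] * 75,
    "cat.png": ["kitty"] * 4 + ["cat"] * 76,
}
to_revise = flag_illustrations(labels, {"knee.png": "knee", "cat.png": "cat"})
```

Here only the knee illustration would be sent back for editing and retesting, because five children gave the same unintended label, whereas four children saying "kitty" stays within the tolerance.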

    Instructions and Practice

    In accordance with recommendations for best practice and feedback from our panel of educators, we created instructions that are explicit, individualized, and include many opportunities for practice (Coyne et al., 2006). Instructions were created using an iterative process that involved piloting directions and practice items with two children exhibiting typical development and four children with speech and/or language impairment related to the verbal production of speech, including two children with autism, one with Down syndrome, and one identified with a speech-language impairment. In addition, field notes were taken during the data collection process to identify whether additional revisions needed to be made to the instructions and practice items.

    To address the differential needs of test takers, the instructions and practice trials on ATLAS-PA were tailored to individual children using a systematic, three-tiered system. Test administrators are able to select one of three levels of instructions and practice: Basic, Basic+, or Enhanced. Most children with typical development will receive instructions at a Basic level, in which children respond to two practice items and are given corrective feedback if needed. This method for introducing the test is typical of commercialized assessments (e.g., TOPEL) but may not be sufficient for some students with speech and/or language impairment. Our goal was to remove the construct-irrelevant variance resulting from students not understanding the demands of the test while maintaining the integrity of our scores. Thus, we provided varied prompts and opportunities for practice to those students to ensure that item responses were based on a child's level of PA rather than a lack of understanding of the task, but kept the testing module the same across students.

    The next most supportive instructions for ATLAS-PA are the Basic+ level. Basic+ is designed for children who need more thorough instructions and more opportunities for practice, but who can take a test relatively independently (i.e., sit at a computer for 5–10 min with minimal behavioral prompts). Using support strategies found to be successful in prior research (J. Kaufman et al., 2009; Shank et al., 2010; Warschausky, 2009), the Basic+ level provides corrective feedback for up to three trial items if children get the initial practice item incorrect. Teachers in the focus group also cautioned that children may be unfamiliar with certain word choices or technical terms (e.g., rhyme), so corrective feedback varies the language used to support children's skills when possible (e.g., “ends with the same sounds” to supplement rhyme). If necessary, children can also practice with their chosen method for response (e.g., eye gaze). Finally, the Enhanced level of support is intended for children who need assistance from a test administrator, as prior work indicates that children with moderate to severe disabilities may not be able to use electronic literacy materials independently (Thompson et al., 2019). It is recommended that children who are not able to focus on a classroom task for at least 5 min take advantage of the Enhanced level of support. When opting for this type of support, test administrators will encounter a welcome page that will provide examples of behavioral supports that the test administrator may provide during the assessment (e.g., physical guidance, verbal prompts to refocus attention, regular positive reinforcements; Watling & Schwartz, 2004), although the specific supports utilized are at the discretion of the test administrator. Children who do not answer any practice items correctly are automatically moved to a higher level of instructional support (i.e., from Basic to Basic+ to Enhanced). 
Children must answer at least one practice item correctly by the Enhanced level in order to move onto the testing phase.
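The escalation across the three levels of instructional support can be sketched as a small decision routine. This is only an illustration of the rule stated above (no correct practice responses moves the child up one level; at least one correct response by the Enhanced level is required to begin testing). The function name and the data layout are hypothetical, and the sketch does not model the number of practice items or the feedback given at each level.

```python
LEVELS = ["Basic", "Basic+", "Enhanced"]


def run_practice(start_level, practice_results):
    """Return (final_level, may_start_testing).

    `practice_results` maps a level name to a list of booleans, one per
    practice item attempted at that level (True = correct).
    """
    index = LEVELS.index(start_level)
    while True:
        level = LEVELS[index]
        if any(practice_results.get(level, [])):
            return level, True  # at least one practice item correct
        if index == len(LEVELS) - 1:
            return level, False  # no success even with Enhanced support
        index += 1  # move automatically to the next level of support


outcome = run_practice(
    "Basic", {"Basic": [False, False], "Basic+": [False, True]}
)
```

In this hypothetical case the child misses both Basic practice items, is moved to Basic+, answers one item correctly there, and proceeds to the testing phase.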

    Item Calibration and Construct Validation Study

    To examine the effectiveness of our approach to designing items and individualized instructions, we undertook an evaluation process where we administered all items from ATLAS-PA to a group of children with typical development. We removed all misfitting items and subsequently administered ATLAS-PA to a group of children with speech and/or language impairment.


    Two groups of children (N = 1,165 overall) ages 3;0 (years;months) to 7;11 took ATLAS-PA. The study design and materials were reviewed by the institutional review board at Michigan State University. The study (IRB X15-599e) was determined to be exempt under Category 1, as it only involved normal educational practices. Parents of participants provided written consent before participating, and children provided verbal and/or nonverbal assent before working with research assistants. Participants received a $10 gift card and a children's book for participating in the study.

    The first group of children exhibited typical development as reported by parents and had no current Individualized Education Program (IEP) for speech and/or language impairments (n = 938 [445 girls], Mage = 62.55 months, SD = 14.62 months). Recruitment occurred within 57 schools located in the Midwest. Children with typical development in this study were predominantly White/Caucasian (490 children [52.24%]), followed by Black/African American (223 [23.77%]), multiracial (119 [12.69%]), Asian/Pacific Islander (46 [4.90%]), Hispanic or Latino (40 [4.26%]), Native American (1 [0.11%]), or other (14 [1.49%]). Most parents reported that English was the primary language spoken within their homes (783 [83.48%]); inclusion criteria required parents to affirm that the participating child spoke English fluently. Among families in which another language was spoken at home, 46 languages were represented. Maternal education varied: some high school (68 mothers [7.25%]), high school diploma or equivalent (146 [15.57%]), some college (234 [24.95%]), undergraduate degree (220 [23.45%]), and graduate/professional school (230 [24.52%]). Annual household income also varied: less than $25,000 (330 households [35.18%]); $25,000–$49,999 (174 [18.55%]); $50,000–$74,999 (108 [11.51%]); $75,000–$99,999 (90 [9.59%]); and more than $100,000 (177 [18.87%]).

    The second group of children had a reported speech and/or language impairment (n = 227 [77 girls], Mage = 66.46 months, SD = 16.31 months). To be eligible for inclusion in the group of children with speech and/or language impairment, children needed to meet the following eligibility criteria: be between 3 and 7 years of age, have goals related to speech and/or language in an IEP, and have parents report that they could understand English in a way that is similar to a native speaker. Recruitment was done through flyers distributed and collected at 90 schools by special education coordinators, teachers, and speech-language pathologists who were familiar with our eligibility criteria. Parents were asked to confirm that children had IEP goals related to speech and/or language and were receiving services related to these areas. Similar to participants with typical development, children with speech and/or language impairment were predominantly White/Caucasian (130 [57.27%]), followed by Black/African American (48 [21.15%]), multiracial (27 [11.89%]), Hispanic or Latino (11 [4.85%]), or other (4 [1.76%]); however, there were fewer Asian/Pacific Islander children (1 [0.44%]; χ²(1) = 8.29, p < .01) than in the first group, with no Native American children represented. Additionally, most parents reported that English was the primary language spoken within their homes (205 [90.31%]); four other languages were reported to be spoken within participants' homes (i.e., Spanish, Burmese, Albanian, and American Sign Language). Maternal education and annual household income brackets were generally similar across the two groups, though, in the second group, more mothers reported having attended some college (χ²(1) = 7.79, p < .01) and more households had incomes under $25,000 (χ²(1) = 4.68, p = .03), whereas fewer mothers held graduate degrees (χ²(1) = 5.81, p = .02) and fewer households had incomes of $100,000 or greater (χ²(1) = 17.63, p < .01). Relatively more boys were in the group with IEPs compared to the group of children with typical development (χ²(1) = 9.17, p < .01). For age distribution, we attempted to recruit similar numbers of children within each age bracket of 3- to 7-year-olds. The group with typical development had proportionally more 4-year-olds than the group with IEPs (χ²(1) = 19.16, p < .01), and there were proportionally more 6-year-olds in the second group than in the first (χ²(1) = 6.43, p < .05), but the proportions of 3-, 5-, and 7-year-olds were not significantly different between the two groups.
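As a worked illustration of the group comparisons, the sketch below computes a chi-square statistic with one degree of freedom for a 2 × 2 table, using the counts of mothers with some college from Table 1 (234 of 938 vs. 78 of 227). It assumes a continuity (Yates) correction, which yields a value close to the reported 7.79 for this comparison; whether the authors used this exact procedure is an assumption, and the function name is hypothetical.

```python
def chi_square_2x2_yates(a: int, b: int, c: int, d: int) -> float:
    """Continuity-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shrink the cross-product difference by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: typical development, speech and/or language impairment.
# Columns: some college, any other level of maternal education.
some_college = chi_square_2x2_yates(234, 938 - 234, 78, 227 - 78)
print(round(some_college, 2))  # approximately 7.79
```

Without the continuity correction the same table gives a somewhat larger statistic, so the choice of correction matters when checking reported values against the table counts.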

    The children with speech and/or language impairments represented a broad range of children, who had varying levels of speech production capabilities. In addition to having an IEP for speech/language, children were reported to have the following disabilities and/or impairments: autism spectrum disorder (20 [8.81%]), attention deficit (11 [4.85%]), intellectual disability (11 [4.85%]; includes cognitive impairment, global developmental delay, Down syndrome, and fetal alcohol syndrome), vision impairment (eight children [3.52%]), high social/emotional needs (seven [3.08%]), learning disability (six [2.64%]), hearing impairment (five [2.20%]), movement/coordination problem (three [1.32%]), physical disability (two [0.88%]), and/or cerebral palsy (two [0.88%]). Fifty-seven of the 227 children (25.11%) were reported to have at least one other developmental disability in addition to a speech and/or language impairment. See Table 1 for demographic characteristics broken down by ability group.

    Table 1. Demographic information.

    Demographic variables                 Children with typical development (n = 938)   Children with speech and/or language impairment (n = 227)
    Gender                                445 girls, 493 boys                           77 girls, 150 boys
    Race/ethnicity
     White/Caucasian                      490 (52.24%)                                  130 (57.27%)
     Black/African American               223 (23.77%)                                  48 (21.15%)
     Hispanic or Latino                   40 (4.26%)                                    11 (4.85%)
     Asian/Pacific Islander               46 (4.90%)                                    1 (0.44%)
     Native American                      1 (0.11%)                                     0 (0.00%)
     Multiracial                          119 (12.69%)                                  27 (11.89%)
     Other                                14 (1.49%)                                    4 (1.76%)
    Maternal education
     Some high school                     68 (7.25%)                                    20 (8.81%)
     High school diploma or equivalent    146 (15.57%)                                  33 (14.54%)
     Some college                         234 (24.95%)                                  78 (34.36%)
     Undergraduate degree                 220 (23.45%)                                  46 (20.26%)
     Graduate/professional school         230 (24.52%)                                  38 (16.74%)
    Annual household income
     Less than $25,000                    330 (35.18%)                                  98 (43.17%)
     $25,000–$49,999                      174 (18.55%)                                  44 (19.38%)
     $50,000–$74,999                      108 (11.51%)                                  37 (16.30%)
     $75,000–$99,999                      90 (9.59%)                                    21 (9.25%)
     More than $100,000                   177 (18.87%)                                  16 (7.05%)

    ATLAS-PA Item Pool

    The initial ATLAS-PA item pool consisted of 120 items: 40 rhyming, 40 blending, and 40 segmenting. In order to keep total testing time under 1 hr while ensuring that all items were administered to a large number of children, we employed a planned missingness design. Children with typical development were randomly assigned two out of the three subtests, and within each subtest, children were randomly assigned 30 of the 40 items, for a total of 60 items administered to each child. After an initial period of data collection, we identified two items as showing substantial statistical misfit, indicating examinee responses were overly unpredictable: rhyming lace (with “face”) and segmenting plate (to “play”). The issue with lace may have arisen from challenges illustrating the word in a way familiar to young children. We did not identify a clear source of the issue with plate. These two items were removed from the item pool early in the data collection process, leaving 118 items in the item pool. Children with speech and/or language impairment were administered all three subtests, and within each subtest were randomly assigned 30 of the 39 or 40 remaining items. An example of each type of item (rhyming, blending, segmentation) is provided in Supplemental Materials S1, S2, and S3, respectively.

    Other Measures of PA

    In addition to taking ATLAS-PA, children exhibiting typical development were administered two other measures of PA chosen based on the child's age relative to the valid age range of the measure. The other PA measures are not appropriate for all of the children with speech and/or language impairment and thus were not administered to this group.

    TOPEL. For the children with typical development, all 3-year-olds and half of the 4- to 6-year-olds (n = 442) were administered the PA subtest of the TOPEL (Lonigan et al., 2007). This subtest requires children to put sounds together to form a new word (blending) and to remove sounds from a word to form a new word (elision). Children were given prompts such as, “Point to the word you get when you say ‘tooth’–‘brush’ together.” As reported in the test manual, the internal consistency of TOPEL PA is .87 and test–retest stability over a 2-week period was .83 (Lonigan et al., 2007).

    Comprehensive Test of Phonological Processing–Second Edition. For the children with typical development, half of the 4- to 6-year-olds were administered the ages 4–6 years version of the Blending and Elision subtests of the Comprehensive Test of Phonological Processing–Second Edition (Wagner et al., 2013). All 7-year-olds with typical development were administered the ages 7–24 years version of the Blending and Elision subtests. The Elision subtest requires children to identify or name the word that remains when a part of the word has been removed. Items ranged between elision of compound words (e.g., toothpaste without tooth), of syllables (e.g., catcher without -er), and of phonemes (e.g., grain without -n). Blending involves combining words into compound words, syllables into words, and phonemes into words. The Blending and Elision subtest raw scores are converted to percentile ranks and scaled scores via score lookup tables to yield a composite PA score. Internal consistency for the Comprehensive Test of Phonological Processing–Second Edition PA composite score is .92 for the ages 4–6 years version and .93 for the ages 7–24 years version (Wagner et al., 2013).

    Preschool Early Literacy Indicators. For the children with typical development, all 3- and 4-year-olds (n = 584) were administered the PA subtest of the Preschool Early Literacy Indicators (Kaminski et al., 2018). This subtest assesses preschool-age children's ability to identify or say the first part or the first sound of a word (e.g., the first part of rainbow is rain). Interrater reliability ranges from .90 to .98.

    DIBELS Next. Kindergarteners with typical development (n = 136; the DIBELS Next is grade based rather than age based) received the First Sound Fluency (FSF) subtest of the DIBELS Next. FSF provides 1 min for children to say the first sounds of orally introduced words. In addition, for the children with typical development, all kindergarten, Grade 1, and Grade 2 children (n = 340) received the Phoneme Segmentation Fluency subtest, which asks children to listen to orally introduced words and say all of the sounds in the word. The alternate-form reliability for the FSF and Phoneme Segmentation Fluency subtests is .72 and .88, respectively (Dewey et al., 2012).


    We used a Rasch measurement approach to examine the construct validity of the item pool of the ATLAS-PA. Rasch measurement is a form of item response analysis that offers a strong approach to validation at the item level (Bond & Fox, 2015; Rost, 2001), yielding scores that have good evidence of interval scaling (Perline et al., 1979). The specific model we used was the Rasch (1960) model for dichotomous outcomes:

    $$\ln\left(\frac{P_{ni}}{1 - P_{ni}}\right) = \theta_n - \beta_i \tag{1}$$

    where P_ni is the probability of examinee n with trait level θ_n (i.e., PA level) succeeding on item i, which has difficulty level β_i. Data were analyzed using the Rasch measurement software Winsteps (Linacre, 2018c). Because of the potential for guessing with multiple-choice items, we used the CUTLO = −1 option, which treats individual item responses with expected probability of correct response below .27 as missing. Alternative values for CUTLO yielded identical results.
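    As an illustration of Equation 1 and the CUTLO rule described above, the following minimal Python sketch computes the Rasch probability of a correct response and shows that a person-minus-item difference of −1 logit corresponds to a probability of about .27. The function names are hypothetical and are not part of Winsteps; the sketch only restates the arithmetic.

```python
import math

def rasch_probability(theta, beta):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def treated_as_missing(theta, beta, cutlo=-1.0):
    """Hypothetical helper: True when theta - beta falls below the cutoff (in logits)."""
    return (theta - beta) < cutlo

# An item 1 logit harder than the examinee's ability sits exactly at the cutoff.
p_at_cutoff = rasch_probability(theta=0.0, beta=1.0)
print(round(p_at_cutoff, 2))  # 0.27
```

    Responses to items that are more than 1 logit harder than the examinee's estimated ability are the ones most likely to reflect guessing, which is why they fall below the cutoff.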

    To address the first research question regarding dimensionality, we used a Rasch principal components analysis of residuals (Linacre, 1998). To address the second research question regarding item functioning, we examined the fit of the items to the Rasch model using standard Rasch fit statistics, Infit and Outfit. To address the third research question about differences in item functioning across gender and ability group (typically developing and speech and/or language impairment), we performed a differential item functioning (DIF) analysis. Finally, to address the fourth research question about nomothetic span, we examined correlations between scores from the ATLAS-PA and other measures of PA.



    We first tested the dimensionality of ATLAS-PA using a principal components analysis of the residuals, which considers the pattern of discrepancies between observed and predicted scores (Linacre, 1998). Conceptually, this analysis examines whether, after accounting for the primary measurement dimension, some items remain more related than expected, suggesting those items share a second dimension. Using the recommendations of Linacre (2018a), we identified potential multidimensionality if (a) at least one component beyond the primary measurement dimension had an eigenvalue above 2, (b) a scree plot of the eigenvalues had a clear elbow, (c) the disattenuated correlation between θ as estimated separately on potential components was substantially less than 1, and/or (d) the potential components included interpretable differences in item content.

    For ATLAS-PA, the first eigenvalue was 3.1, suggesting potential multidimensionality. However, there was no clear elbow, as the next four eigenvalues were 2.8, 2.4, 2.0, and 1.8. The first principal component separated rhyming items from blending items, although the disattenuated correlation between θ as estimated separately from the contrasted items (i.e., from blending items alone vs. from rhyming items alone) was 1.00. No other principal component had interpretable content, and no disattenuated correlation was below .89. Thus, there was, at most, weak evidence of multidimensionality, and we concluded that ATLAS-PA was essentially unidimensional.

    Item Functioning

    We next examined item fit to establish that the items measure PA validly within a Rasch measurement framework. Following standard practice, we consulted both Infit and Outfit mean-square values; each has an expected value of 1.0, with higher values indicating the item fits poorly or has excess noise and lower values indicating the item offers less information than expected. We considered values between 0.6 and 1.4 to indicate good fit (Wright & Linacre, 1994). Of the initial item pool of 118 items (120 original items less two that were removed before the validation process), all 118 items' infit values fell within this range, showing excellent fit to the model. However, four items displayed outfit mean-square values above 1.4, suggesting some unpredictable responses: one blending item (/k/ + /aʊ/ = cow, outfit = 1.58) and three segmenting items (“cable” to bull, outfit = 1.45; “walrus” to wall, outfit = 1.43; and “cape” to ape, outfit = 1.41). We removed these items from the item pool, which then contained 114 items. There was no clear pattern in content for the misfitting items, suggesting that the item pool as a whole is validly measuring PA.
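    The fit statistics described above can be sketched as follows. This is a minimal illustration, assuming the standard definitions of the mean-square statistics (Outfit as the unweighted mean of squared standardized residuals, Infit as the information-weighted version) and assuming that person abilities and the item difficulty are already known. The function names and the simulated data are hypothetical; the article's values came from Winsteps.

```python
import math
import random

def rasch_probability(theta, beta):
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def item_fit(responses, thetas, beta):
    """Infit and Outfit mean-squares for one item.

    responses: list of 1, 0, or None (not administered), aligned with thetas.
    """
    sq_resid_sum = 0.0   # sum of squared raw residuals
    variance_sum = 0.0   # sum of model variances p * (1 - p)
    z2_sum = 0.0         # sum of squared standardized residuals
    count = 0
    for x, theta in zip(responses, thetas):
        if x is None:
            continue  # planned missingness: skip items not administered
        p = rasch_probability(theta, beta)
        variance = p * (1.0 - p)
        sq_resid = (x - p) ** 2
        sq_resid_sum += sq_resid
        variance_sum += variance
        z2_sum += sq_resid / variance
        count += 1
    infit = sq_resid_sum / variance_sum
    outfit = z2_sum / count
    return infit, outfit

def within_range(value, low=0.6, high=1.4):
    """Good-fit range used in the text (Wright & Linacre, 1994)."""
    return low <= value <= high

# Simulated responses that follow the Rasch model (hypothetical data).
rng = random.Random(0)
thetas = [rng.gauss(0.0, 1.0) for _ in range(2000)]
beta = 0.2
responses = [1 if rng.random() < rasch_probability(t, beta) else 0 for t in thetas]
infit, outfit = item_fit(responses, thetas, beta)
print(round(infit, 2), round(outfit, 2), within_range(infit), within_range(outfit))
```

    Because Outfit is unweighted, it is more sensitive than Infit to unexpected responses from examinees whose ability is far from the item's difficulty, which is consistent with items exceeding the Outfit threshold while remaining within range on Infit.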


    As additional evidence of validity, we conducted DIF analyses to examine whether an item behaves differently for different groups, controlling for overall level of PA. We considered DIF across gender and, separately, DIF across disability status (typical development and disability related to speech and/or language production). We followed conventions used by the Educational Testing Service (Zwick et al., 1999) for identifying moderate to strong DIF converted into a Rasch difficulty metric (Linacre, 2018b): statistical significance (p < .05) using a Rasch–Welch t test with a difficulty difference (DIF contrast) of at least .64 logits, which implies that when person ability and item difficulty are well matched, the probability of a correct response differs by about .12 across the two groups. As preliminary analyses, we compared overall levels of PA for these groups. Girls (n = 522; M = 1.28 logits, SD = 1.95) and boys (n = 637; M = 1.10 logits, SD = 1.96) did not perform significantly differently on the ATLAS-PA (d = 0.09, t(1157) = 1.4845, p = .1379). As expected, children with speech and/or language impairment (M = 0.89 logits, SD = 1.77) had lower overall PA ability than children with typical development (n = 938; M = 1.25 logits, SD = 1.96), d = 0.19, t(371.48) = 2.72, p = .007. Although differences between the two groups were relatively small in magnitude, results should be interpreted in light of the fact that the children with speech and/or language impairment were, on average, about 4 months older than the children with typical development in the present work.
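    The preliminary comparison of the two ability groups can be approximated from the reported summary statistics alone. The sketch below assumes a Welch (unequal variance) t test and a pooled-SD Cohen's d, and it assumes n = 227 for the group with speech and/or language impairment, as reported for the overall sample. Because the inputs are rounded means and standard deviations, the output only approximates the reported t(371.48) = 2.72 and d = 0.19. The function name is hypothetical.

```python
import math

def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch t, Welch-Satterthwaite df, and pooled-SD Cohen's d from summary statistics."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    return t, df, d

# Typical development vs. speech and/or language impairment (values in logits).
t, df, d = welch_from_summary(1.25, 1.96, 938, 0.89, 1.77, 227)
print(round(t, 2), round(df, 1), round(d, 2))
```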

    Four of the 114 remaining items met the DIF criteria for either gender or disability status. One item had DIF associated with gender: Blending snake (/sne/ + /k/) was significantly easier for girls (DIF contrast = .72). Two items displayed DIF associated with disability status: Blending dog [/dɔ/ + /g/] was easier for children with typical development (DIF contrast = .71), while blending toothbrush [tooth + brush] was easier for children with speech and/or language impairment (DIF contrast = .67). One item, segmenting “cartoon” (to yield car), had DIF on both gender and disability: easier for girls (DIF contrast = .68) and for children with speech and/or language impairment (DIF contrast = .91). These four items were removed from the item pool, leaving a final item pool consisting of 110 items, with 39 rhyming, 35 blending, and 36 segmenting items.
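    The flagging rule used above (a DIF contrast of at least .64 logits together with p < .05) can be sketched as follows. This is an illustration under stated assumptions: the t statistic is taken as the contrast divided by the joint standard error of the two group-specific difficulty estimates, and the p value uses a normal approximation rather than the t distribution used by a Rasch–Welch test. The function name and the input standard errors are hypothetical and do not come from the article.

```python
import math

def dif_flag(difficulty_a, se_a, difficulty_b, se_b, min_contrast=0.64, alpha=0.05):
    """Flag an item when the DIF contrast is both large and statistically significant."""
    contrast = difficulty_a - difficulty_b
    t = contrast / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p value under a normal approximation (assumption; adequate for large samples).
    p = math.erfc(abs(t) / math.sqrt(2.0))
    flagged = abs(contrast) >= min_contrast and p < alpha
    return contrast, t, p, flagged

# Hypothetical item: difficulty 0.50 (SE 0.15) in one group, -0.21 (SE 0.20) in the other.
contrast, t, p, flagged = dif_flag(0.50, 0.15, -0.21, 0.20)
print(round(contrast, 2), round(t, 2), flagged)
```

    Requiring both conditions keeps small but statistically significant differences, which are common in large samples, from leading to item removal.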

    Final Item Pool

    Figure 1 shows the Wright map reflecting examinee ability and item difficulty on a shared scale. Child PA ability ranged widely from −4.60 to +5.96 logits (M = 1.16), with most of the sample falling near the range of item difficulty, which ran from −0.99 (easiest) to +1.77 logits (most difficult). However, there was a substantial proportion of individuals whose level of PA fell above the range of the items. PA ability, as measured by the ATLAS-PA, was associated with age (r = .53, p < .001; mean ability in logits by age: 3-year-olds = −0.09, 4-year-olds = 0.39, 5-year-olds = 1.15, 6-year-olds = 2.52, 7-year-olds = 2.96). The PA levels of 3-, 4-, and 5-year-old children were best matched to the difficulty of the ATLAS-PA items. By subtest, blending items were the easiest on average (−0.27 logits), with rhyming (+0.20) and segmenting (+0.17) being slightly more difficult. However, rhyming, blending, and segmenting item difficulty ranges overlapped substantially, and no one item type clustered into one area of the difficulty range; hence, all subtests provided information across the range of examinee ability levels. Estimated Rasch-based reliability was .91 for the children exhibiting typical development and .94 for children with speech and/or language impairment; however, we used a measurement precision stopping rule for the adaptive version, which allows us to select the intended reliability level of .90 for the final version of the ATLAS-PA (see below).

    Figure 1. Wright map of Access to Literacy Assessment System for Phonological Awareness examinee ability and item difficulty, in logits. PA = phonological awareness; M = mean; S = 1 standard deviation from mean; T = 2 standard deviations from mean. Each pound sign (#) indicates four children; each dot indicates one to three children. Each X indicates one item.

    Nomothetic Span

    Finally, we looked at the nomothetic span (Embretson, 1983) of the ATLAS-PA to examine the relations between the ATLAS-PA and other measures of PA. For this analysis, we restricted our sample to children evidencing typical development, as the other measures of PA were not designed for children with speech and/or language impairment and therefore scores on these assessments could not be considered valid for these children. Furthermore, children only completed assessments intended for their current age or grade (e.g., only 3- and 4-year-olds took the Preschool Early Literacy Indicators), so there was some restriction of range in PA skills. Table 2 presents the correlation matrix of ATLAS-PA with other PA batteries. Consistent with our expectations, the ATLAS-PA was moderately to strongly correlated with all other PA measures (r = .49–.65), at roughly the same magnitude as the other measures were correlated with each other. This suggests that ATLAS-PA is measuring the construct of PA in a similar way as validated, published batteries.

    Table 2. Correlations among measures of phonological awareness for children with typical development.

    Measure | ATLAS-PA | TOPEL | CTOPP 4–6 | CTOPP 7+ | DIBELS-FSF | DIBELS-PSF | PELI
    ATLAS-PA | 1.00 (n = 938) | | | | | |
    TOPEL | .56 (n = 441) | 1.00 (n = 442) | | | | |
    CTOPP 4–6 | .65 (n = 371) | NA | 1.00 (n = 371) | | | |
    CTOPP 7+ | .58 (n = 97) | NA | NA | 1.00 (n = 97) | | |
    DIBELS-FSF | .49 (n = 136) | .63 (n = 40) | .54 (n = 96) | NA | 1.00 (n = 136) | |
    DIBELS-PSF | .56 (n = 340) | .60 (n = 42) | .57 (n = 199) | .25 (n = 97) | .69 (n = 133) | 1.00 (n = 340) |
    PELI | .52 (n = 583) | .60 (n = 397) | .57 (n = 168) | NA | NA | NA | 1.00 (n = 584)

    Note. For all correlations, p < .001. ATLAS-PA = Access to Literacy Assessment System–Phonological Awareness; TOPEL = Test of Preschool Early Literacy; CTOPP 4–6 = Comprehensive Test of Phonological Processing–Second Edition, ages 4–6 years version; CTOPP 7+ = Comprehensive Test of Phonological Processing–Second Edition, ages 7+ years version; DIBELS-FSF = Dynamic Indicators of Basic Early Literacy Skills–First Sound Fluency; DIBELS-PSF = Dynamic Indicators of Basic Early Literacy Skills– Phoneme Segmentation Fluency; PELI = Preschool Early Literacy Indicators; NA = not applicable.


    Results demonstrate that ATLAS-PA is a reliable and valid measure of PA for children with and without speech and/or language impairment. The items used in ATLAS-PA represented a unidimensional construct of PA, consistent with several other studies in this area (Anthony & Lonigan, 2004; Anthony et al., 2002; Schatschneider et al., 1999). It is likely that PA skills are separate from other areas of language functioning (Anthony et al., 2014), thus warranting measures that focus explicitly on this skill set. Furthermore, evidence from our data suggests that ATLAS-PA is moderately to strongly related to other commonly used measures of PA, indicating that the PA knowledge it captures is consistent with what other measures in the field capture.

    Item Functioning and Validity for Children With Speech and/or Language Impairment

    ATLAS-PA includes items that work well for a broad range of children. For those with typical development, ATLAS-PA appears to be best suited for those who are between 3 and 6 years of age, as the items were not well targeted to the PA levels of children with typical development who were 7 years of age. However, children with speech and/or language impairment often display lower levels of PA (Dessemontet et al., 2017; Dynia et al., 2019; Peeters et al., 2009), so educators should consider children's instructional level, in addition to their age, when determining whether ATLAS-PA could prove informative for their students with speech and/or language impairment. In the present work, although there were ceiling effects for the 7-year-olds with typical development, ATLAS-PA measured PA skills well for the 6- and 7-year-olds with speech and/or language impairment included in our sample, justifying a larger age range for this group of children.

    For PA, testing format can affect performance and score interpretation (e.g., Card & Dodd, 2006), making it important to examine item functioning directly for children with and without speech and/or language impairment. Analyses here demonstrated that the vast majority of items worked in a similar fashion for all children, suggesting that we are able to assess PA skills accurately for those children not able to answer the expressive items often used in other tests of PA. The accurate assessment of PA for children with speech and/or language impairment should increase educators' capacity to provide early reading instruction individualized to each student's needs.

    Unique Features of ATLAS-PA

    There are several innovative features that make ATLAS-PA unique among tests of PA. First, measures of PA often require children to respond to items verbally, which can be challenging for many children with speech and/or language impairment (e.g., TOPEL; Lonigan et al., 2007). ATLAS-PA uses only receptive items to capture children's knowledge of PA, allowing it to be used by a greater number of children than many other measures. Second, ATLAS-PA includes three tiers of individualized instructions and practice items, increasing the probability that children with speech and/or language impairment can engage with the measure in ways that are accessible to them. Third, the measure was designed using a plain background and simple format that drew children's attention to relevant parts of the item, without including commonly used interactive features (e.g., hot spots or games) that require children to multitask or that could distract from children's performance on the assessment (Bus et al., 2015). Fourth, it is well recognized that many disabilities are associated with processing speed (Calhoun & Mayes, 2005), yet a number of early literacy assessments include speed of response as part of test administration (e.g., DIBELS 8th Edition; University of Oregon Center for Teaching and Learning, 2018). ATLAS-PA allows for variation in the speed of administration to increase confidence that scores reflect knowledge of PA specifically, rather than construct-irrelevant variance associated with the way in which the task is administered.

    The data gathered to validate the item pool for ATLAS-PA were used to make it an adaptive test. Adaptive tests tend to yield the same measurement precision as nonadaptive tests, but with approximately half as many items (Madsen, 1991; Weiss, 1982), making adaptive testing an effective way to reduce both physical and attentional fatigue associated with many disabilities. Each subtest in the adaptive version is administered separately, with items administered until a minimum level of measurement precision is achieved. Most commonly, a child will be administered eight items on each subtest to yield measurement precision equivalent to a subtest reliability between .75 and .80. If a child completes all three subtests, the reliability of the overall PA score is above .90. Testing time for all three subtests on the adaptive version is usually under 10 min.
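    A precision-based stopping rule of the kind described above can be sketched in a few lines. This is not the ATLAS-PA algorithm, whose item selection and estimation details are not given here. It is a minimal illustration that assumes (a) the next item is the unused one whose difficulty is closest to the current ability estimate, (b) ability is estimated by expected a posteriori (EAP) scoring with a normal prior, and (c) the target standard error follows from reliability = 1 − SE²/SD². The item bank, prior, ability SD, and function names are all hypothetical.

```python
import math

GRID = [-6.0 + 0.1 * k for k in range(121)]  # ability grid in logits

def rasch_probability(theta, beta):
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def se_for_reliability(reliability, ability_sd):
    """Standard error implied by reliability = 1 - SE^2 / SD^2 (assumed relationship)."""
    return ability_sd * math.sqrt(1.0 - reliability)

def eap_estimate(administered, prior_sd=2.0):
    """administered: list of (item_difficulty, score) pairs; returns (theta, se)."""
    weights = []
    for theta in GRID:
        weight = math.exp(-0.5 * (theta / prior_sd) ** 2)  # unnormalized normal prior
        for beta, score in administered:
            p = rasch_probability(theta, beta)
            weight *= p if score == 1 else 1.0 - p
        weights.append(weight)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    variance = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(variance)

def run_adaptive_subtest(item_difficulties, get_response, se_target):
    """Administer items until the posterior SD reaches se_target or the bank is exhausted."""
    remaining = list(item_difficulties)
    administered = []
    theta, se = eap_estimate(administered)
    while remaining and se > se_target:
        beta = min(remaining, key=lambda b: abs(b - theta))  # best-targeted unused item
        remaining.remove(beta)
        administered.append((beta, get_response(beta)))
        theta, se = eap_estimate(administered)
    return theta, se, len(administered)

# Hypothetical 40-item bank and a simulated child who answers correctly
# whenever the item difficulty is below 0.5 logits.
bank = [-1.0 + 0.07 * k for k in range(40)]
target = se_for_reliability(reliability=0.75, ability_sd=1.96)
theta, se, n_items = run_adaptive_subtest(bank, lambda b: 1 if b < 0.5 else 0, target)
print(n_items, round(theta, 2), round(se, 2))
```

    Raising the target reliability lowers the required standard error, so more items are administered before the rule stops. The number of items in this sketch depends on the assumed prior and ability SD and should not be read as a reproduction of the eight-item figure reported above.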

    Clinical Implications

    ATLAS-PA provides general information about children's PA using a web-based platform that is easy to access and inexpensive. As with other curriculum-based measures (e.g., IGDIs), ATLAS-PA can be used to monitor children's progress over time. However, unlike other similarly structured measures, ATLAS-PA relies entirely on receptive responses and includes individualized levels of instruction. PA is one of the key targets for early literacy assessment (Lonigan et al., 2009), instruction (Lemons & Fuchs, 2010), and intervention (Hund-Reid & Schneider, 2013; Skibbe et al., 2011). Given the strong predictive value of PA for later reading development (National Early Literacy Panel, 2008), ATLAS-PA represents an important tool that educators and practitioners can use to evaluate and monitor children's PA skill development during early childhood.

    By utilizing an adaptive format, ATLAS-PA will minimize the amount of time students need to spend being assessed. Attention is linked to language throughout early childhood (Gooch et al., 2016) and is a concern for many children with speech and/or language impairment (Maher et al., 2015), making a shorter testing time an especially important consideration. In addition, the adaptive nature of this measure also facilitates its use as a progress monitoring tool, as children will be exposed to different items on each testing occasion. General outcomes progress monitoring tools, such as the IGDIs (McConnell et al., 2002), have long been touted as the best way to capture young children's development during preschool and kindergarten, but are often not inclusive of children with speech and/or language impairment.

    Limitations and Future Directions

    Our goal was to create a measure of PA that could be used by children who had an educational determination of speech and/or language impairment within their schools. By design, this included children whose challenges resulted from a variety of disabilities, including speech and/or language impairment, autism, physical disabilities, and intellectual disabilities. Also, since we did not measure speech or language abilities directly, it is possible that a small percentage of the sample identified as having typical development actually had an undiagnosed speech and/or language impairment. Across etiologies, items functioned well; however, we did not have a large enough sample to detect whether particular items are problematic for specific disability groups. Some research, for example, has suggested that color memory and perception are more challenging for children with autism (Franklin et al., 2008), yet some of our items included pictures representing color words (e.g., red). In addition, the variability in motor, cognitive, and vocal capabilities within our sample precluded the use of a gold standard measure of PA for our children with speech and/or language impairment. Understanding whether construct validity varies for a particular subsample will be an ongoing area of future research.

    Item responses from the ATLAS-PA are captured in a central data repository, which will allow us to consider, as more data become available, whether individual differences in performance can be attributed to particular types of disabilities. ATLAS-PA was designed to be appropriate for a broad range of children with speech and/or language impairment, although we did not confirm any reported diagnoses with direct assessment. We also recognize that, although ATLAS-PA is accessible to a broader range of children than other assessments, additional work may be needed to increase its accessibility. In the present work, 20 children with speech and/or language impairments were excluded because they could not complete ATLAS-PA; of these, nine were minimally verbal and all had behaviors that interfered with the testing process, a common barrier for assessment of children with higher needs (Tager-Flusberg et al., 2017). Note that three children who were minimally verbal were able to complete ATLAS-PA without difficulty, so the absence of spoken language, when not accompanied by challenging behaviors, did not appear to be a barrier to completion. Additional work is needed to consider how to expand the population of students for whom ATLAS-PA is relevant and valid.


    ATLAS-PA is a new adaptive measure of PA with strong psychometric properties, available to interested users at (website launch anticipated spring 2020). By designing items and instructions to be appropriate for children who have speech and/or language impairment, this measure allows researchers and educational professionals to capture a more accurate assessment of PA skills for children with speech and/or language impairment than other PA measures currently available in the field. Specifically, ATLAS-PA provides educational professionals and researchers with the following: (a) a more effective tool to assess this critical literacy skill among children across the ability spectrum, (b) opportunities to include a broader population of children in regular early elementary screening systems rather than relying on modified assessments or tests with often nonvalidated and thus questionable accommodations, and (c) a PA assessment that relies on more rigorous approaches to validation for children with speech and/or language impairment than typically employed with such assessments.

    Author Contributions

    Lori E. Skibbe: Conceptualization (Equal), Data curation (Equal), Funding acquisition (Lead), Writing - Original Draft (Lead), Writing - Review & Editing (Lead). Ryan P. Bowles: Conceptualization (Equal), Data curation (Lead), Formal analysis (Lead), Funding acquisition (Supporting), Methodology (Lead), Writing - Original Draft (Supporting), Writing - Review & Editing (Supporting). Sarah Goodwin: Data curation (Supporting), Formal analysis (Supporting), Validation (Supporting), Visualization (Lead), Writing - Original Draft (Supporting), Writing - Review & Editing (Supporting). Gary A. Troia: Conceptualization (Supporting), Funding acquisition (Supporting), Writing - Original Draft (Supporting), Writing - Review & Editing (Supporting). Haruka Konishi: Methodology (Supporting), Project administration (Supporting), Writing - Original Draft (Supporting), Writing - Review & Editing (Supporting).


    The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R324A150063 (PI: Skibbe). The opinions expressed are those of the authors and do not represent views of the Institute of Education Sciences or the U.S. Department of Education.


    Author Notes

    Disclosure: The authors have declared that no competing interests existed at the time of publication.

    Correspondence to Lori E. Skibbe:

    ATLAS-PA is an author-designed tool, although the authors currently have no financial interests related to this assessment.

    Editor-in-Chief: Holly L. Storkel

    Editor: Douglas Bryan Petersen
