Which Questions Do Children With Cochlear Implants Understand? An Eye-Tracking Study.

Purpose The purpose of this study was to investigate the processing of morphosyntactic cues (case and verb agreement) by children with cochlear implants (CIs) in German which-questions, where interpretation depends on these morphosyntactic cues. The aim was to examine whether children with CIs who perceive the different cues also make use of them in speech comprehension and processing in the same way as children with normal hearing (NH). Method Thirty-three children with CIs (age 7;01-12;04 years;months, M = 9;07, bilaterally implanted before age 3;3) and 36 children with NH (age 7;05-10;09 years, M = 9;01) received a picture selection task with eye tracking to test their comprehension of subject, object, and passive which-questions. Two screening tasks tested their auditory discrimination of case morphology and their perception and comprehension of subject-verb agreement. Results Children with CIs who performed well on the screening tests still showed more difficulty on the comprehension of object questions than children with NH, whereas they comprehended subject questions and passive questions equally well as children with NH. There was large interindividual variability within the CI group. The gaze patterns of children with NH showed reanalysis effects for object questions disambiguated later in the sentence by verb agreement, but not for object questions disambiguated by case at the first noun phrase. The gaze patterns of children with CIs showed reanalysis effects even for object questions disambiguated at the first noun phrase. Conclusions Even when children with CIs perceive case and subject-verb agreement, their ability to use these cues for offline comprehension and online processing still lags behind normal development, which is reflected in lower performance rates and longer processing times. Individual variability within the CI group can partly be explained by working memory and hearing age. Supplemental Material https://doi.org/10.23641/asha.7728731.

Nowadays, many children with severe-to-profound hearing loss (i.e., pure-tone average ≥ 70 dB HL) receive cochlear implants (CIs). CIs are able to restore hearing by bypassing the malfunctioning inner ear and electrically stimulating the auditory nerve directly. The sound provided by the CI is degraded compared to acoustic sound (Drennan & Rubinstein, 2008). Nevertheless, many CI users gain substantially from their implants (Krueger et al., 2008). Children with profound hearing loss using CIs develop language faster than children with profound hearing loss using conventional hearing aids (Geers & Moog, 1994; Geers, Nicholas, & Sedey, 2003; Svirsky, Robbins, Iler-Kirk, Pisoni, & Miyamoto, 2000; Tomblin, Spencer, Flock, Tyler, & Gantz, 1999).
At the same time, large individual differences with respect to language development and speech intelligibility have been found in children with CIs (e.g., for German, Szagun, 2001; for Dutch, Giezen, 2011; Gillis, Schauwers, & Govaerts, 2002; for English, Niparko et al., 2010; Peterson, Pisoni, & Miyamoto, 2010; Stacey, Fortnum, Barton, & Summerfield, 2006; Svirsky, Teoh, & Neuburger, 2004; for French, Duchesne, Sutton, & Bergeron, 2009; Le Normand, Ouellet, & Cohen, 2003). Some of these individual differences with respect to language development can be explained in terms of age at implantation. It is argued that the earlier children receive their implant, the better their hearing and language outcomes (Boons et al., 2012; Harrison, Gordon, & Mount, 2005; Sharma, Dorman, & Spahr, 2002; Szagun & Schramm, 2016), especially with respect to grammar and pragmatics (Tobey et al., 2013). This holds for comparisons of implantation at very young ages. For example, the language skills of children implanted at 12 months were better than those of children implanted at 24 months (Lesinski-Schiedat, Illg, Heermann, Bertram, & Lenarz, 2004; Tomblin, Barker, Spencer, Zhang, & Gantz, 2005). It also holds for children with somewhat later implantation: Children implanted before the age of 3 years developed language faster than children implanted between the ages of 3 and 5 years (Kirk et al., 2002). Therefore, investigating this group of children is interesting in terms of a critical/sensitive period for language acquisition (Ruben, 1997).
Another explanation for the individual differences in language development relates to cognitive capacity. Working memory capacity in children with CIs has been found to correlate with their speech and language outcomes (Kronenberger, Pisoni, Harris, et al., 2013; Pisoni, Kronenberger, Roman, & Geers, 2011). Children with CIs score lower on cognitive tasks than children with normal hearing (NH) and are at a higher risk of cognitive deficits (such as executive functioning deficits) than children with NH (Kronenberger, Beer, Castellanos, Pisoni, & Miyamoto, 2014). The cause of these cognitive differences is unknown. Also, the development of cognitive domains related to executive function and cognitive control processes and of brain regions associated with working memory is influenced by auditory and linguistic experience (Giraud & Lee, 2007; Pisoni, Conway, Kronenberger, Henning, & Anaya, 2012). Thus, differences in cognitive development have been argued to be related to differences in language development.
The majority of language-related studies on children with CIs evaluate speech perception or speech production (for an overview, see Nikolopoulos, Dyar, Archbold, & O'Donoghue, 2004). Good speech perception is certainly essential for language development. However, good speech perception does not necessarily imply good understanding of all relevant aspects of language (morphology, semantics, syntax). Concentrating solely on how well children with CIs perceive certain words or phrases may therefore unintentionally leave problems in language comprehension unexposed.
In the current study, we want to investigate the perception, understanding, and processing of morphosyntactic cues (case and verb agreement) by children with CIs in German which-questions. Questions are particularly interesting to study in children with CIs, because these are very common constructions in spoken language and hence very relevant for communication. In addition, the constructions studied here are acquired relatively late (see below; Biran & Ruigendijk, 2015; Lindner, 2003; Roesch & Chondrogianni, 2015; Schouwenaars, Hendriks, & Ruigendijk, 2018) and thus could be affected more in children with CIs. Furthermore, German which-questions allow us to examine the effects of several aspects of sentence processing, namely word order, agreement, and case, as well as the difference between actives and passives (see below). In German, subject and object which-questions have the same structure, namely, NP-V-NP. Therefore, these questions can be ambiguous between a subject question interpretation and an object question interpretation (see (1)). In sentence processing, thematic roles are often assigned linearly: The first noun phrase (NP) is the agent, and the second NP is the patient. However, in German which-questions, the thematic roles cannot be assigned linearly, as the object may precede the subject. Previous studies have found that which-questions are approximately equally distributed between subject- and object-initial sentence structures (Schlesewsky, Fanselow, Kliegl, & Krems, 2000). Morphosyntactic cues such as case (see (2)) and/or verb agreement (see (3)) need to be used to correctly interpret subject and object questions.
(1) Welche Maus fängt die Ente?
which.NOM/ACC mouse catch the.NOM/ACC duck
"Which mouse is catching the duck?" (subject question)
"Which mouse is the duck catching?" (object question)
(2a) Welcher Esel fängt den Tiger?
which.NOM donkey catch the.ACC tiger
"Which donkey is catching the tiger?" (subject question)
(2b) Welchen Esel fängt der Tiger?
which.ACC donkey catch the.NOM tiger
"Which donkey is the tiger catching?" (object question)
(3a) Welche Maus fängt die Enten?
which.NOM/ACC mouse.SG catch.SG the.NOM/ACC ducks.PL
"Which mouse is catching the ducks?" (subject question)
(3b) Welche Maus fangen die Enten?
which.NOM/ACC mouse.SG catch.PL the.NOM/ACC ducks.PL
"Which mouse are the ducks catching?" (object question)
Whereas (1) is ambiguous between a subject question interpretation and an object question interpretation, (2a) can only be interpreted as a subject question, and (2b) can only be interpreted as an object question. In German, unlike singular feminine nouns (used in (1)), singular masculine nouns (used in (2)) have different case marking for nominative and accusative case. Nominative case in (2a) on the wh-phrase (welcher "which") marks the first NP as the subject of the sentence. In addition, the accusative case on the article of the second NP (den "the") marks the second NP as the object. Likewise, the accusative case in (2b) on the wh-phrase (welchen "which") marks the first NP as the object of the sentence, and the nominative case on the article of the second NP (der "the") marks this NP as the subject. Besides case marking, verb agreement can also indicate the subject of the sentence. In (3a), only the first NP (welche Maus "which mouse") corresponds in number with the singular inflection on the verb (fängt "catches"), and therefore, it must be the subject, which makes (3a) a subject question. In (3b), only the second NP (die Enten "the ducks") corresponds in number with the plural inflection on the verb (fangen "catch"), and therefore, it is the subject, which makes (3b) an object question. The type of cues and their position in the sentence make German which-questions particularly suitable to investigate how children with CIs use morphosyntactic cues for interpretation.
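The cue logic illustrated by examples (1)-(3) can be summarized in a minimal Python sketch. It is only an illustration of the linguistic reasoning described above, not a model used in the study; the class name `WhichQuestion`, its fields, and the function `interpret` are hypothetical, and the order in which the cues are checked is an assumption made for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WhichQuestion:
    """Hypothetical feature bundle for an NP-V-NP which-question."""
    wh_case: Optional[str]   # "NOM", "ACC", or None if the form is case-ambiguous
    np1_number: str          # "SG" or "PL"
    verb_number: str         # "SG" or "PL"
    np2_case: Optional[str]  # "NOM", "ACC", or None if the form is case-ambiguous
    np2_number: str          # "SG" or "PL"


def interpret(q: WhichQuestion) -> str:
    """Return "subject", "object", or "ambiguous" for the question."""
    # Case on the wh-phrase (welcher vs. welchen).
    if q.wh_case == "NOM":
        return "subject"
    if q.wh_case == "ACC":
        return "object"
    # Case on the article of the second NP (den vs. der).
    if q.np2_case == "ACC":
        return "subject"
    if q.np2_case == "NOM":
        return "object"
    # Verb agreement: the NP that uniquely matches the verb in number is the subject.
    np1_agrees = q.np1_number == q.verb_number
    np2_agrees = q.np2_number == q.verb_number
    if np1_agrees and not np2_agrees:
        return "subject"
    if np2_agrees and not np1_agrees:
        return "object"
    return "ambiguous"


ex1 = WhichQuestion(None, "SG", "SG", None, "SG")    # (1)  Welche Maus fängt die Ente?
ex2a = WhichQuestion("NOM", "SG", "SG", "ACC", "SG") # (2a) Welcher Esel fängt den Tiger?
ex2b = WhichQuestion("ACC", "SG", "SG", "NOM", "SG") # (2b) Welchen Esel fängt der Tiger?
ex3a = WhichQuestion(None, "SG", "SG", None, "PL")   # (3a) Welche Maus fängt die Enten?
ex3b = WhichQuestion(None, "SG", "PL", None, "PL")   # (3b) Welche Maus fangen die Enten?
```

Under these assumptions, `interpret` classifies (1) as ambiguous, (2a) and (3a) as subject questions, and (2b) and (3b) as object questions.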
In German, as shown in the previous examples, these morphosyntactic cues are essential for correct comprehension of which-questions. Besides object questions, an investigation of children with CIs' processing of passive questions is particularly relevant because, like in object questions, in passive questions the patient precedes the agent (see (4)). Compared to object questions, passive questions are disambiguated by different cues (verb werden "to be," by-agent, and past participle) that are perceptually more prominent than the morphosyntactic cues case and verb agreement. Comparing object questions and passive questions may shed light on whether deviations of the standard agent-first word order cause difficulties in language acquisition or whether the problem is more specific and related to disambiguation cues. Note that the correct use of morphosyntactic information is not only crucial for a good understanding of which-questions but is also needed for the interpretation of wh-questions in general, as well as relative clauses, cleft sentences, and topicalized sentences.
(4) Welche Maus wird von der Ente gefangen?
which.NOM/ACC mouse is-being by the.DAT duck caught
"Which mouse is being caught by the duck?" (passive question)
In this article, we will first review previous studies on the development of morphosyntax in children with NH and children with CIs. Based on the previous findings, we will formulate predictions. Next, we will describe the experiment with which we tested these predictions. Then, we will give an overview of the results on children's offline and online comprehension. Finally, we will discuss the results and draw conclusions.

Comprehension of Which-Questions in Children With Typical Development
The comprehension of object which-questions develops slowly in German children with NH and typical development. Their ability to use case marking for thematic role assignment starts to develop around the age of 5 years, but even older children still make many mistakes (Biran & Ruigendijk, 2015; Lindner, 2003; Roesch & Chondrogianni, 2015; Schouwenaars et al., 2018). Their ability to use verb agreement for thematic role assignment is less studied in German and develops later than case (Arosio, Yatsushiro, Forgiarini, & Guasti, 2012), around the age of 7 years (Schouwenaars et al., 2018). The late development of verb agreement as a cue for thematic role assignment has also been found in other languages, such as Italian and Dutch, where children still misinterpret object questions disambiguated solely by verb agreement until the age of 8 or 9 years (e.g., De Vincenzi, Arduino, Ciccarelli, & Job, 1999; Metz, van Hout, & van der Lely, 2010; Schouwenaars, van Hout, & Hendriks, 2014).
Children's use of morphosyntactic cues for thematic role assignment is related to their working memory capacity. Children with a low digit span have interpretation problems with object relative clauses regardless of the type of disambiguation cue (case or verb agreement). Children with a medium digit span have interpretation problems only with object relative clauses that are disambiguated by verb agreement, and children with a higher digit span have no difficulties in the comprehension of object relative clauses at all (Arosio et al., 2012). According to processing theories such as the active filler hypothesis (Frazier & Flores d'Arcais, 1989), when processing relative clauses or wh-questions, the wh-phrase is initially interpreted as the subject of the sentence, because an identified filler (i.e., here the wh-phrase) is assigned to a position as soon as possible. When the wh-phrase turns out not to be the subject but the object (as in (3b)), the interpretation needs to be revised, and this is argued to cause processing difficulties. In order to revise, enough working memory resources need to be available (Deevy & Leonard, 2004). More generally, the processing of which-questions may involve maintaining the object in working memory for a longer period or keeping more representations active at the same time, which also requires sufficient working memory capacity (e.g., Fiebach, Schlesewsky, & Friederici, 2002; Gibson, 1998). Instead of using morphosyntactic cues, young children assign thematic roles linearly. They rely on word order, assuming that the first NP is the agent and subject (e.g., Slobin & Bever, 1982). Therefore, they interpret object questions as subject questions (e.g., Biran & Ruigendijk, 2015; Roesch & Chondrogianni, 2015; Schouwenaars et al., 2014). Children with CIs may rely on word order even more heavily because the morphosyntactic cues are subtle and perceptually difficult to recognize (Szagun, 2000). 
Furthermore, if the comprehension of object-first structures is related to working memory, children with CIs may have more problems with object questions as they score lower on working memory tasks than children with NH. In the following, previous research on morphosyntactic development in children with CIs is discussed.

Morphosyntactic Development in Children With CIs
Whereas children with CIs score like children with NH on vocabulary tests, their acquisition of grammatical aspects of spoken language seems to be delayed (Boons et al., 2013; Caselli et al., 2012; Geers et al., 2003; Guasti et al., 2012; Nikolopoulos et al., 2004; Schorr, Roth, & Fox, 2008; Young & Killen, 2002). The morphological development of children with CIs deviates from that of children with NH, as it is strongly influenced by the perceptual prominence of the morphological forms (e.g., acquisition of the perceptually more salient copula, "is" and "are," before acquisition of noun plurals in English [Svirsky, Stallings, Lento, Ying, & Leonard, 2002] and the acquisition of inflectional morphology on nouns and verbs before acquisition of unstressed articles in German [Szagun, 2000]). Children with CIs make more inflectional errors on finite verbs than children with NH in their spontaneous speech (Hammer & Coene, 2016). Their syntactic development also deviates from that of children with NH. A general finding from standardized tests on grammar is that children with CIs, on average, perform roughly 1 SD below the means of children with NH (Nittrouer & Caldwell-Tarr, 2016). These standardized tests are useful to get an overall picture of children's performance on grammar, testing the comprehension of a small number of items with many different grammatical aspects, for example, verb morphology, passive structures, and prepositional phrases. In contrast, experimental studies examine children's performance on a specific grammatical aspect in more detail.
A specific problem is children's comprehension of syntactically complex structures, such as wh-questions, relative clauses, and topicalized sentences. These structures are complex, as the word order can deviate from the standard, canonical word order, which makes thematic role assignment (who is doing what to whom) difficult. The acquisition of these structures is late in typically developing children with NH, as discussed earlier, and even more so in children with hearing impairment.
DeLuca (2015) examined the comprehension of subject and object who- and which-questions with different levels of working memory demands in English-speaking children with CIs and children with NH. These questions were disambiguated by word order, as in English, unlike in German, the position of the verb differs between subject and object questions (see the English translations of (1)-(3)). Children with CIs (~69%) performed lower than children with NH (~83%) for all question types and conditions. They performed better on subject questions compared to object questions, on which-questions compared to who-questions, and on questions resulting in a low working memory load compared to those resulting in a high working memory load. However, a child's selection of the target picture corresponding to the wh-phrase did not reveal whether the child interpreted this wh-phrase as the subject or the object of the question.
Friedmann and Szterman (2011) investigated the comprehension and production of who- and which-questions of 9- to 12-year-old Hebrew-speaking children with hearing impairment (fitted with either hearing aids or CIs). These questions were disambiguated by accusative case marking provided by a free morpheme (et) before the object. Children with hearing impairment scored lower on subject and object which-questions, but not on subject and object who-questions, than children with NH who were 2 years younger. Furthermore, performance of children with CIs was lower on object which-questions (~70%) than on subject which-questions (~90%). Similarly, in another study, Friedmann and Szterman (2006) found that Hebrew-speaking children with hearing impairment made more errors in both comprehension and production of object relatives and topicalized object-verb-subject sentences than children with NH. Likewise, Volpato (2012) found that Italian children with CIs interpret object relatives disambiguated by verb agreement incorrectly as subject relatives more often than children with NH.
For German, Wimmer, Rothweiler, Hennies, Hess, and Penke (2015) investigated the comprehension of who-questions in 3- to 4-year-old children with hearing impairment. These questions were disambiguated by case. Whereas the 3-year-olds pointed to the correct referent in the picture in 50% of the object question items, the 4-year-olds pointed to the correct referent in around 85% of the items. Nevertheless, it is difficult to interpret children's correct pointing as a correct comprehension of thematic role assignment, because only one interpretation was given in the task. Ruigendijk and Friedmann (2017) tested thematic role assignment explicitly. In this study, German children with hearing impairment (with hearing aids or CIs) aged 9 to 13 years were screened for auditory perception abilities. They showed that most, but not all, of those children had considerable difficulties in comprehension and repetition of sentences in which the object comes before the subject, including who- and which-questions.

Aims and Predictions
The first two research questions addressed in this study are (1) whether children with CIs perceive the morphosyntactic cues of case and verb agreement and (2) whether they use these cues for the interpretation of which-questions. Eye tracking is used to provide insight into children's online processing of which-questions and to be able to answer the third research question: (3) To what extent do morphosyntactic cues affect the processing of subject, object, and passive questions by children with CIs in comparison with children with NH? The processes leading to a certain interpretation can be revealed by gaze data. More specifically, the subtle morphosyntactic cues in object questions may lead to longer processing times for children with CIs compared to children with NH, especially for those sentences in which these cues distinguish between an object-first and subject-first interpretation. The processing times per sentence type and per disambiguation cue combination can be measured through eye tracking.
As pointed out above, a correlation has been found between the speech and language outcomes of children with CIs and their working memory capacity (Kronenberger, Pisoni, Harris, et al., 2013; Pisoni et al., 2011). Sufficient working memory resources may be needed to revise an incorrect initial interpretation of a sentence, to keep a phrase in memory for a longer period, or to keep several interpretations active at the same time, all of which have been argued to be relevant for the interpretation of object questions. To examine whether possible group differences in the interpretation and processing of which-questions can be explained by working memory differences, we also measure the children's working memory capacity.
In the current study, first of all, it is important to establish whether children with CIs perceive morphosyntactic cues such as case and verb agreement. Children who cannot perceive these cues will obviously not use them. Instead, they will assign thematic roles linearly (see discussion above) and interpret object questions incorrectly as subject questions. Second, it is unclear whether children who do perceive these cues use them for thematic role assignment in processing which-questions. The syntactic development of children with a CI may be delayed, as their linguistic input is less in terms of years (auditory deprivation before implantation) and quality (degraded speech input from CI) compared to children with NH of the same age. If children do not use case and verb agreement and rely on word order instead, they will also assign thematic roles linearly and interpret object questions incorrectly as subject questions. Children who do use case and verb agreement will interpret object questions correctly. In this study, the use or nonuse of case and verb agreement does not influence the interpretation of passive questions such as (4), since in German passive questions, thematic role assignment is guided by different cues, namely passive verb morphology and the by-agent, which indicates who is doing what to whom. Children who do not rely on passive morphology and assign thematic roles linearly will interpret passive questions incorrectly as subject questions, whereas children who do rely on passive morphology will correctly interpret passive questions.
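The predictions in the preceding paragraph can be restated as a small Python sketch. It only encodes the reasoning given above (linear role assignment versus use of the relevant cues); the function name `predicted_interpretation` and its parameters are hypothetical labels introduced for illustration.

```python
def predicted_interpretation(question_type: str,
                             uses_case_and_agreement: bool,
                             uses_passive_morphology: bool) -> str:
    """Return the interpretation a child is predicted to assign.

    Without the relevant cue, thematic roles are assigned linearly,
    so the first NP is taken to be the agent and subject.
    """
    if question_type == "subject":
        # Linear assignment and cue-based assignment coincide here.
        return "subject"
    if question_type == "object":
        return "object" if uses_case_and_agreement else "subject"
    if question_type == "passive":
        # Case and verb agreement play no role; passive morphology decides.
        return "passive" if uses_passive_morphology else "subject"
    raise ValueError(f"unknown question type: {question_type}")


def predicted_correct(question_type: str,
                      uses_case_and_agreement: bool,
                      uses_passive_morphology: bool) -> bool:
    """True if the predicted interpretation matches the question type."""
    return predicted_interpretation(
        question_type, uses_case_and_agreement, uses_passive_morphology
    ) == question_type
```

For instance, a child who relies purely on word order is predicted to be correct on subject questions but to misinterpret both object and passive questions as subject questions.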

Method
Two screening tests and an eye-tracking experiment were conducted to examine whether children with CIs perceive and make use of morphosyntactic cues such as case and verb agreement when processing which-questions. In addition, to examine the influence of working memory on language comprehension, a digit span test was administered. Together, the tasks took about 1 hr. First, we will present the screening tests and the digit span test and next the main experiment examining the comprehension and processing of which-questions with eye tracking.

Participants
Thirty-three children with CIs between the ages of 7 and 12 years were tested (15 boys, 18 girls; age range: 7;01-12;04 years;months, M = 9;07, SD = 18.1 months). These children were prelingually deaf, bilaterally implanted with their first CI before the age of 3;3, monolingual German, and otherwise typically developing. The mean age of first cochlear implantation was 1;4 (SD = 8.7 months), and the mean age of second implantation was 2;4 (SD = 15.2 months); the majority of the children (63%) were implanted simultaneously. Demographic data, including age, gender, age at implantation (first CI), age at implantation (second CI), duration of use of the first CI (i.e., hearing age), etiology, hearing aid experience (i.e., use of conventional hearing aids before implantation), and device, can be found in Table A1 in the Appendix. As a control group, 36 typically developing monolingual children with NH and no diagnosed language or speech pathologies between the ages of 7 and 10 years were tested (22 boys, 14 girls; age range: 7;05-10;09, M = 9;01, SD = 12.7 months). For a comparison of the comprehension of which-questions by these children with NH with that of adults with NH, see Schouwenaars et al. (2018). The children with CIs were matched with controls by (hearing) age. They were recruited at the Deutsches HörZentrum Hannover (German Hearing Center Hannover) and through the Landesbildungszentrum für Hörgeschädigte in Oldenburg (the state educational center for the hearing impaired) and were tested at the Cochlear Implant Center Wilhelm Hirte in Hannover and at the University of Oldenburg. All children with NH were recruited around the University of Oldenburg as well as through a regional newspaper advertisement and were tested at the University of Oldenburg. The children's legal representatives gave written informed consent prior to the experiment. 
The study was approved by the Ethical Committee of the University of Oldenburg and the Hannover Medical School and was conducted in accordance with the Declaration of Helsinki.

Screening Tests
Two screening tests were administered to select those children who perceive the difference between nominative and accusative case markings and perceive and understand verb agreement (i.e., the number information provided by the verbal inflection) in canonical sentences.

Auditory Discrimination of Case
The first screening test assessed children's discrimination of nominative and accusative case marking on the determiner. Single words or NPs were presented in an auditory discrimination test. The stimuli consisted of pairs of determiners (5), pairs of question words, or pairs of NPs (6), which were either the same (5) or different with respect to case (6). See Table A2 in the Appendix.
(5) der-der
(6) welcher Hund-welchen Hund
The participants were asked to press a button marked with the text gleich (the same) when the two words or NPs were the same and the button marked with the text nicht gleich (not the same) when they were different. In total, 16 pairs were presented, eight per condition (same vs. different).

Perception and Comprehension of Verb Agreement
The second screening test examined children's perception and understanding of verb agreement in declarative sentences. Here, a picture selection task was used in which a pair of pictures was presented on the screen, and at the same time, a prerecorded sentence was presented acoustically. The children's task was to select the picture that best matched the sentence. Sentences such as the following were used:
(7) Sie malt/malen die Prinzessin.
pronoun.SG/PL paint.SG/paint.PL the princess
"She/They paint(s) the princess."
The ambiguous German pronoun sie can refer either to a singular feminine referent ("she") or a plural referent ("they"). Therefore, the number of the subject referent is solely determined by the number marking on the finite verb in these sentences. In each picture pair, one picture corresponded to a singular interpretation of the subject, and the other one corresponded to a plural interpretation of the subject (see Figure 1). The position of the subject referent on the pictures and the position of the target picture (left or right) were balanced over four lists. The screening test consisted of a total of 16 items (eight in the singular condition, eight in the plural condition), with four reversible transitive verbs (filmen "to film," fangen "to catch," malen "to paint," waschen "to wash"). The third-person singular form is formed by stem + t for the verbs filmen and malen and by stem + t and additionally a vowel change for the verbs fangen and waschen (third-person singular forms are fängt and wäscht). The latter may be more salient and therefore easier to distinguish from the plural form (fangen and waschen).

Digit Span Test
To examine the influence of working memory on children's comprehension of which-questions, a digit span test (Hamburg-Wechsler-Intelligenztest für Kinder-Vierte Auflage [Hamburg-Wechsler Intelligence Test for Children-Fourth Edition]; Petermann & Petermann, 2007) was administered, which consisted of two parts: forward and backward. The children were asked to repeat a sequence of digits ranging from 1 to 9, which was read out loud by the experimenter, in the presented order (forward) or in the reversed order (backward). The forward condition started with a sequence of three digits, and the backward condition started with a sequence of two digits. Each sequence length contained two trials, after which the sequence length increased by one digit. When both trials of the same length were recalled incorrectly, the test ended. The span (the number of digits in the longest sequence recalled correctly) was taken as the measure of digit span. The forward span test was used as an introduction to the task. Only the backward digit span was used for the analyses, because besides temporary storage (remembering the digits) it also requires manipulation of information (reordering the digits) and hence is considered a measure of working memory (Baddeley, 2003).
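The scoring procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the official scoring routine of the test: the function names and the data layout (a mapping from sequence length to the outcomes of its two trials) are hypothetical, and returning a span of 0 when both trials at the starting length fail is an assumption.

```python
from typing import Dict, List, Tuple


def backward_response_correct(presented: List[int], response: List[int]) -> bool:
    """A backward trial is correct if the digits are repeated in reversed order."""
    return response == presented[::-1]


def digit_span(trials: Dict[int, Tuple[bool, bool]], start_length: int) -> int:
    """Compute the span from trial outcomes.

    `trials` maps a sequence length to the correctness of its two trials.
    Testing starts at `start_length` (3 for forward, 2 for backward) and
    stops as soon as both trials of one length are incorrect.
    """
    span = 0  # assumed value if no sequence is recalled correctly
    length = start_length
    while length in trials:
        first, second = trials[length]
        if not (first or second):
            break  # both trials failed: the test ends
        span = length  # longest sequence recalled correctly so far
        length += 1
    return span


# Hypothetical backward run: lengths 2 and 3 passed, both trials failed at length 4.
example_run = {2: (True, True), 3: (True, False), 4: (False, False)}
example_span = digit_span(example_run, start_length=2)
```

In this hypothetical run, the child recalled at least one sequence of length 3 correctly and failed both trials of length 4, so the backward span is 3.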

Stimuli and Procedure
A picture selection task and eye tracking were used to test the comprehension and processing of three different types of which-questions: subject which-questions, object which-questions, and passive which-questions (see (8)-(16) in Table 1).
The passive questions were always disambiguated by passive voice (i.e., the verb werden "to be," the past participle, and the by-phrase). The subject and object questions were disambiguated by different cues: only case (Case), only verb agreement (Agr), or both (AgrCa), resulting in six conditions in total. The differences between these conditions were realized by the gender and number of the nouns that were used. In German, the determiner of singular masculine nouns differs between nominative (der) and accusative (den) case marking, whereas the determiners of feminine and plural nouns are the same (in both cases die). In the first condition, Case, singular masculine NPs were used to allow case disambiguation on the initial wh-phrase and the second NP. In this condition, verb agreement was not available as a cue, because both nouns were singular (see (8) and (11)). In the second condition, Agr, feminine noun pairs were used, so case was not available as a cue. Instead, verb agreement was available, because the first noun was singular and the second noun was plural (see (9) and (12)). In the third condition, AgrCa, the first noun was masculine plural and the second was masculine singular to allow the verb agreement cue on the verb and the case cue on the second noun (see (10) and (13)). The different cue conditions also led to differences with respect to timing: The Case condition was disambiguated early in the sentence, namely on the first NP, whereas the Agr and AgrCa conditions were disambiguated later in the sentence, namely at the verb (see (8) vs. (9) and (10)).
For passive questions, the same noun pairs were used as for active questions. Therefore, also for the passive questions, we had three different types with different noun types (Pas(a): two masculine singular nouns, see (14); or Pas(b): a feminine singular and a feminine plural noun, see (15); or Pas(c): a masculine plural and a masculine singular noun, see (16)). Nevertheless, for passive sentences, these different nouns did not lead to a distinction with respect to type of disambiguation cue, as in active sentences. The passive questions were consistently disambiguated by passive morphology instead.
Figure 1. Example of a picture pair, the left picture matching the single-subject interpretation and the right picture matching the plural-subject interpretation of sentence (7): Sie malt/malen die Prinzessin "she/they paint(s) the princess."
All sentences were recorded by a female native speaker of German. The recordings took place in a sound attenuated booth with the use of a Neumann KM 184 cardioid microphone, an RME Fireface UC audio interface, and Adobe Audition recording software. We made four lists that differed in item order and in position of the target picture (left or right). Fifty-four test items were presented in total: six for every row in Table 1, leading to 18 items per question type. We used six different transitive verbs of which the third-person singular was formed by a vowel change plus -t, so the singular and plural forms were as distinctive as possible (e.g., fängt-fangen "catches-catch"). We used 15 different noun pairs across all types of questions. As thematic role assignment may be influenced by semantic properties of the nouns, the plausibility of the noun pairs was controlled for in a pilot questionnaire study (i.e., a donkey washing a panda was judged as plausible as vice versa).
For each trial, two pictures were presented next to each other. The pictures represented different interpretations: a correct interpretation and an incorrect interpretation resulting from role reversal. For example, the left picture in Figure 2 represents the correct object question interpretation of sentence (13). In the right picture, the roles are reversed, representing the incorrect subject question interpretation of sentence (13).
In a familiarization phase, the participants were presented with a picture pair for 2,500 ms, which was followed by a fixation cross. After fixating the cross for 500 ms, the picture pair reappeared on the screen, followed by the prerecorded sentence presented 50 ms later. The participants had to press the button corresponding to the picture they thought best matched the sentence. The test items were divided into two blocks of 27 test items each, both preceded by two practice items and containing seven filler items with one animate noun and one inanimate noun (e.g., "Which kangaroo is shooting the ball?"). The experiment started with the digit span task. In between the two blocks of the comprehension task, the verb agreement screening test described above was carried out. The auditory discrimination task was carried out after the second block.
A Tobii TX300 eye tracker was used with a two-computer setup. One computer ran the experiment with the E-Prime 2.0 software (Psychological Software Tools, Inc.) and collected the behavioral data. By means of E-Prime Extensions for Tobii (TET calls), the participants' eye movements were collected from the second computer at a sample rate of 300 Hz.
Note. SVO = subject-verb-object; Case = case disambiguation; Agr = agreement disambiguation; AgrCa = agreement and case disambiguation; OVS = object-verb-subject; Pas = passive question.

Generalized Linear Mixed-Effects Models
The accuracy data of the screening tests and the which-questions comprehension test were analyzed by generalized linear mixed-effects regression modeling (GLMM) with the software R (Version 3.1.2). The accuracy models contain a binomial dependent variable with a logit link function of item accuracy. Participant and item were included as random intercepts. One by one, fixed factors were included to see whether they improved the goodness of fit of the model. An improvement was assessed by comparing the Akaike information criterion (AIC) score (Akaike, 1974) of the new model, which included the fixed factor under examination, with that of the previous model, which did not include the fixed factor but was otherwise identical. A decrease of at least 2 in the AIC scores indicates that the inclusion of a factor significantly improves the goodness of fit of the model. Similarly, the necessity of including random slopes (e.g., a by-subject random slope for type of question) was assessed, but none were included as they did not improve the model.
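The stepwise inclusion procedure can be illustrated with a short sketch. It assumes a caller-supplied function `fit_aic` (hypothetical) that fits the mixed model with a given list of fixed factors and returns its AIC; the AIC values below are invented toy numbers, not results from the study, and the actual analysis was run in R.

```python
def forward_select(candidates, fit_aic, min_decrease=2.0):
    """Add fixed factors one by one, keeping a factor only if it lowers
    the AIC by at least min_decrease relative to the previous model."""
    included = []
    current_aic = fit_aic(included)
    for factor in candidates:
        trial_aic = fit_aic(included + [factor])
        if current_aic - trial_aic >= min_decrease:
            included.append(factor)
            current_aic = trial_aic
    return included, current_aic


# Toy AIC table standing in for real model fits (made-up values).
toy_aic = {
    (): 500.0,
    ("group",): 490.0,
    ("group", "question"): 450.0,
    ("group", "question", "cue"): 449.5,
}
selected, final_aic = forward_select(
    ["group", "question", "cue"],
    lambda factors: toy_aic[tuple(factors)],
)
```

With these toy values, the third factor lowers the AIC by only 0.5 and is therefore left out of the final model.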

Preprocessing of the Gaze Data
Gaze data validity was checked, and only data points that were rated by the eye tracker with a value of "0" (all relevant data for both eyes were recorded) and "1" (highly probable estimations for one eye were recorded) were included. No participants or trials had to be excluded due to insufficient (< 75%) valid data points. Both correct and incorrect trials were included in the analyses, as the goal was not to investigate how children process which-questions when they interpret them correctly. Instead, we wanted to investigate how these groups of children process which-questions in general. The time window for the gaze data started at the onset of the stimulus and lasted for 3,000 ms to cover the complete range of time from the beginning of the sentence until the average response time. Areas of interest (AOIs) were determined for target interpretation (target picture), competitor interpretation (competitor picture), and not on AOI. The sum of looks to a particular AOI was calculated per participant per trial and per time bin of 200 ms for the statistical analysis. For the gaze plots, time bins of 50 ms were calculated for a more detailed picture.
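As a rough illustration of these preprocessing steps, the sketch below filters samples by validity code, restricts them to the 3,000-ms window, and counts looks per AOI in fixed time bins for one trial. The sample layout (time in ms, validity code, AOI label), the AOI labels, and the function name are hypothetical; the authors' actual pipeline is not described at this level of detail.

```python
from collections import Counter

VALID_CODES = {0, 1}   # validity values retained according to the text
WINDOW_MS = 3000       # analysis window, starting at stimulus onset


def bin_looks(samples, bin_ms=200):
    """Count looks per AOI and time bin for a single trial.

    samples: iterable of (time_ms, validity, aoi) tuples, with aoi one of
    "target", "competitor", or "none" (hypothetical labels).
    Returns a dict mapping bin start (ms) to a Counter of AOI labels.
    """
    counts = {}
    for time_ms, validity, aoi in samples:
        if validity not in VALID_CODES:
            continue  # drop samples the eye tracker rated as unreliable
        if not 0 <= time_ms < WINDOW_MS:
            continue  # outside the analysis window
        bin_start = int(time_ms // bin_ms) * bin_ms
        counts.setdefault(bin_start, Counter())[aoi] += 1
    return counts


trial = [
    (0, 0, "target"),
    (10, 1, "target"),
    (150, 4, "competitor"),   # invalid sample, ignored
    (210, 0, "competitor"),
    (3100, 0, "target"),      # after the window, ignored
]
binned = bin_looks(trial)             # 200-ms bins for the statistics
binned_fine = bin_looks(trial, 50)    # 50-ms bins for the gaze plots
```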

Generalized Additive Mixed Modeling
For the analyses of the gaze data, we used generalized additive mixed modeling (GAMM; Wood, 2006, 2011) in R with the packages mgcv 1.8.4 (Wood, 2006) and itsadug (van Rij, Wieling, Baayen, & van Rijn, 2017). GAMMs are particularly useful for eye tracking and other time course data, because they can fit nonlinear trends over time (cf. Nixon, van Rij, Mok, Baayen, & Chen, 2016; Porretta, Tucker, & Järvikivi, 2016; van Rij, Hollebrandse, & Hendriks, 2016; for an introduction to GAMMs and how to deal with autocorrelation in linguistic time series data, see Porretta, Kyröläinen, van Rij, & Järvikivi, 2017; Winter & Wieling, 2016). Like GLMMs, GAMMs also allow for inclusion of random factors reducing autocorrelation. A crucial difference with GLMMs is that GAMMs manage data sets that are nonlinear, such as our gaze data, which change over time. Smooth functions model the relations between the fixed and random factors on one side and the dependent variable on the other. Estimation procedures determine the smooth functions and parameters to rule out overfitting and overgeneralization of the data (Wood, 2006).
A dependent variable was made by calculating the difference between the sum of looks toward the target and the sum of looks toward the competitor picture for time bins of 200 ms. In GAMMs, interactions are modeled by using a combined factor (e.g., Porretta et al., 2017; van Rij et al., 2016). For example, group and type of question were combined into one predictor to see whether there were group differences (a 99% confidence interval was used) for each type of question. Item was not included as a random effect factor as it would have increased the run time of the model enormously (which was already 12 hr). Instead, we combined participant and type of question into one random effect factor (ParticipantQuestion) and added this to the model. Two analysis methods were used to test for effects: a model comparison procedure and analyzing plots of the model predictions.
Figure 2. Picture pair for sentence (13): Welche Füchse wäscht der Schwan "Which foxes is the swan washing?"
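The construction of the dependent variable and the combined factors can be sketched as follows. The column names (`LookDiff`, `GroupQuestion`) and the dot-separated labels are illustrative assumptions; only `ParticipantQuestion` is named in the text, and the models themselves were fitted in R with mgcv.

```python
def combine(*levels):
    """Join factor levels into one combined level, e.g. 'CI.object'."""
    return ".".join(levels)


def make_row(participant, group, question, bin_start_ms,
             target_looks, competitor_looks):
    """Build one analysis row for a participant, trial, and 200-ms bin."""
    return {
        "bin_start_ms": bin_start_ms,
        # dependent variable: looks to target minus looks to competitor
        "LookDiff": target_looks - competitor_looks,
        # combined fixed-effect predictor (group x type of question)
        "GroupQuestion": combine(group, question),
        # combined random-effect factor (participant x type of question)
        "ParticipantQuestion": combine(participant, question),
    }


row = make_row("p01", "CI", "object", 400,
               target_looks=12, competitor_looks=30)
```

A negative `LookDiff`, as in this toy row, means more looks toward the competitor than toward the target picture in that bin.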

Results
First, the analyses of the screening tests will be presented to see whether children were able to perceive case and verb agreement and which factors influenced children's perception of case and verb agreement. Second, offline accuracy scores on the which-questions test will be presented and will show us the final interpretation given to which-questions by the children who passed the screening tests. Third, the online gaze data will be presented, which inform us about the processing during sentence presentation, that is, which interpretations are considered during processing.

Case
Of the 33 children with CIs, 25 passed the case screening test on a criterion of 14 or more out of 16 correct (n = 33, M = 90.5% correct, SD = 2.75, 12 children made no mistakes, 10 children made one mistake, three children made two mistakes, and eight children made three to six mistakes). Of the 36 children with NH, one child did not pass this test on a criterion of 14 or more out of 16 correct (n = 36, M = 97.4% correct, SD = 4.51, 25 children made no mistakes, eight children made one mistake, two children made two mistakes, and one child made three mistakes). It can be assumed that all children who passed this test could perceive the differences between the different case morphologies on articles and wh-words.

Verb Agreement
Of the 33 children with CIs, nine children failed the verb agreement screening test on a criterion of 14 out of 16 correct (n = 33, M = 86% correct, SD = 3.56, 13 children made no mistakes, six children made one mistake, two children made two mistakes, and nine children made three to 10 mistakes). Of the 36 children with NH, one child (not the same child who failed the case screening test) failed this screening test (n = 36, M = 96% correct, SD = 4.92, 19 children made no mistakes, 12 children made one mistake, four children made two mistakes, and one child made three mistakes). For the children who passed the test, we can be sure that they are sensitive to the number information provided by the verbal inflection, because that was the only cue to select the correct picture.
In order to examine which factors influence the performance on the screening tests of children with CIs, a GLMM model was made with accuracy data of both screening tests as a binomial dependent variable and participant and item as random intercepts. The following factors improved the goodness of fit of the model: test (case screening or verb agreement screening), chronological age, age at implantation, and hearing aid experience. The factor of simultaneous versus sequential binaural implantation did not improve the model, and no interactions were found. Table 2 shows the final model resulting from the analysis. This summary shows that there is a marginally significant effect of test (slightly higher accuracy scores on the case screening test than on the verb agreement screening test) and an effect of chronological age, age at implantation, and hearing aid experience. Older age at testing, a younger age at implantation, and the presence of hearing aid experience improved the children's performance on both screening tests (see Figure A1 in the Appendix for performance per participant).
In total, 21 children with CIs (12 boys, nine girls; age range: 7;5-12;4, M = 9;11, SD = 17.8 months) and 34 children with NH (21 boys, 13 girls; age range: 7;05-10;09, M = 9;01, SD = 12.7 months) passed both screening tests. Only their data on the which-question task were further analyzed to ensure that possible errors in children's comprehension of which-questions are not due to lack of perception of case marking and/or verb agreement.

Offline Accuracy Data
Figure 3 shows the percentage of correct interpretations of which-questions for children with CIs (left) and children with NH (right) who passed the screening tests. Both groups of children score at ceiling on subject and passive questions. Scores on object questions are lower, especially for children with CIs. There is no clear effect of cue type.
To compare the groups, a GLMM model was made with item accuracy as the dependent variable and participant and item as random intercepts. In the presented order, the following factors were included to see whether they improved the goodness of fit of the model: group (children with CIs vs. children with NH), type of question (subject vs. object vs. passive), hearing age (for children with CIs: chronological age minus age at first implantation, for children with NH: chronological age), and type of cue (Case vs. Agr vs. AgrCa). Note that chronological age was also a significant predictor, but because hearing age and chronological age correlate, they cannot both be included in the same model. Hearing age is included in this model as it has a lower p value and leads to a lower AIC score and therefore is a better predictor than chronological age. The inclusion of type of cue (valid factor for subject and object questions only) did not improve the model. Also, no interactions for this variable with group or type of question were found. In additional models not reported here, we examined the possible effects of the material-related variables, such as verb, pair of nouns, session, direction of action, and position of target. These effects were not significant, and none of these variables improved the model. Table 3 shows the final model for the overall analysis. There is an effect of type of question and of hearing age and an interaction between group and type of question.
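The definition of hearing age can be made concrete with a small sketch. It assumes ages are written in the years;months notation used in this article and converted to months; the function names and group labels are hypothetical.

```python
def parse_age(years_months):
    """Convert an age in 'years;months' notation (e.g. '9;07') to months."""
    years, months = years_months.split(";")
    return int(years) * 12 + int(months)


def hearing_age(group, chronological_months, implantation_months=None):
    """Hearing age in months.

    Children with CIs: chronological age minus age at first implantation.
    Children with NH: chronological age.
    """
    if group == "CI":
        return chronological_months - implantation_months
    return chronological_months


# Illustrative values only, not data from an actual participant.
age = parse_age("9;07")
ci_hearing_age = hearing_age("CI", age, implantation_months=parse_age("1;00"))
nh_hearing_age = hearing_age("NH", parse_age("9;01"))
```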
The factor type of question consists of three levels. In order to see whether there is a significant difference between all three types of questions, a multiple comparison is made with the use of the "multcomp" package (Hothorn, Bretz, & Westfall, 2008). The results show that there is a significant difference in accuracy between subject questions and object questions (β = −3.43, z = −7.411, p < .001) and between passive questions and object questions (β = 2.88, z = 7.540, p < .001), but not between passive questions and subject questions (β = −0.54, z = −0.997, p = .57). Thus, overall object questions were comprehended less well than subject questions and passive questions.
We also found an interaction between group and type of question. To test whether children with CIs perform less accurately than children with NH for all question types or whether this difference in performance is limited to a certain type of question, a multiple comparison was carried out. We found that the difference between the groups only holds for object questions and not for subject questions (β = 0.16, z = 0.239, p = .99) or passive questions (β = 0.70, z = 1.166, p = .83): Children with CIs score significantly lower than children with NH on object questions (β = 1.28, z = 2.988, p < .05). So children with CIs show more difficulty on the comprehension of object questions than children with NH.

Digit Span
The two groups were compared regarding their backward digit span scores and their performance on object questions to see whether children's working memory differs between the groups of children and whether it is related to their comprehension of object questions. Figure 4 shows children's accuracy scores on object questions per backward digit span and per group.
A new GLMM model was made with item accuracy (of object questions only) as a dependent variable and group and backward digit span as independent variables. Participant and item were included as random effect factors. There was a significant effect of backward digit span and no interaction: Backward digit span was a significant predictor of children's mean accuracy scores on object questions (β = 0.9528, z = 2.822, p < .01). Thus, overall digit span scores were related to children's mean accuracy scores on object questions: The higher their backward digit span score, the better their accuracy on object questions.
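Because the model uses a logit link, the coefficient is on the log-odds scale and can be converted to an odds ratio by exponentiation. The sketch below does only that arithmetic; it assumes backward digit span entered the model as a numeric predictor, so that the coefficient describes the change per additional digit.

```python
import math


def odds_ratio(beta):
    """Convert a logit-scale coefficient to an odds ratio."""
    return math.exp(beta)


# Coefficient for backward digit span reported in the text.
span_odds_ratio = odds_ratio(0.9528)
```

Under that assumption, each additional digit of backward span multiplies the odds of a correct response on an object question by roughly 2.6.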
Note that once backward digit span is included in the model, group is no longer a significant factor. This is because group and backward digit span correlate. A linear model with backward digit span as a dependent variable and group as an independent variable shows that children with NH tend to have a significantly higher backward digit span than children with CIs (β = 0.7895, t = 3.526, p < .001).

Other Factors
There was much variation within the group of children with CIs. Two children with CIs interpreted all object questions correctly, 11 children scored between 70% and 95% correct, and eight children scored lower than 67% correct. Part of the variation could be explained by working memory capacities and hearing age. The role of children's performance on the case and agreement screening tasks was examined, but these factors were not significant. The role of other child-related demographic factors was examined as well, but also these factors were not significant. For example, age at implantation, simultaneous versus sequential bilateral implantation, hearing aid experience, hearing levels before implantation, and implant device were all nonsignificant predictors of the performance on the comprehension test of children with CIs. Possibly, the range of variability for some factors was too low to find significant effects. For instance, half of the children were implanted between 6 and 12 months, and only three children did not have hearing aid experience.
Summarizing, the offline data show that children with CIs made significantly more errors than children with NH in their comprehension of object questions, but not of subject or passive questions. Hearing age and digit span were related to the comprehension of object questions. The higher the hearing age and the higher the digit span score, the better their accuracy scores on object questions. No differences were found with regard to the disambiguation cues. The eye gaze data, being a more precise measure over time, may tell us more about the online processing of which-questions and the influence of the different disambiguation cues.

Gaze Data
Children with CIs may process words and linguistic cues that they encounter differently from children with NH. Sentence processing starts at the beginning of a sentence. The interpretation of the sentence may change during processing as words are encountered one by one. Therefore, an initial interpretation may differ from a final interpretation. In the absence of morphosyntactic cues, initial interpretations are driven by word order. Therefore, more looks at the picture corresponding to the subject question interpretation are predicted. For object questions, this initial interpretation needs to be revised based on morphosyntactic cues. Whether and when children with CIs revise the initial incorrect interpretation may differ from children with NH. Changes in interpretation during processing and differences in processing between groups can be examined by gaze data. The plots in Figure 5 show the gaze data of children with CIs and children with NH during their processing of which-questions.
Note. NH = normal hearing; obj = object; pas = passive. *p < .05. **p < .001.
Figure 4. The bar plot shows that children with a higher digit span tend to have higher accuracy scores on object questions and that children with cochlear implants (CIs) tend to have lower digit span scores than children with normal hearing (NH). The number of participants for the corresponding digit span is indicated in brackets.
The gaze plots in Figure 5 show that, for subject questions, both groups of children look increasingly toward the target picture. For object questions, first, the looks toward the competitor picture increase at the cost of the looks toward the target picture. Only later, the looks toward the target picture increase. The increase of looks toward the target picture is earlier and ends up higher for children with NH than for children with CIs. Also for passive questions, first looks toward the competitor picture increase, and at a later moment in time, looks toward the target picture increase. Here the lines of children with CIs and children with NH seem closer to one another. In the following, we describe the statistical model used to support similarities and differences between the two groups for subject, object, and passive questions. In a later analysis, we will look at similarities and differences with respect to type of cue for object questions.

Type of Question
For our first GAMM model, a dependent variable was made calculating the difference between the sum of looks toward the target minus the sum of looks toward the competitor picture for time bins of 200 ms. The random effect factor of ParticipantQuestion was added to the model. One predictor was made that contained all combinations of interactions between group (children with CIs vs. children with NH) and type of question (subject vs. object vs. passive) to see whether there were group differences for each type of question. Differences in gaze patterns due to hearing age or offline accuracy were also investigated by including these predictors in the model. However, this further division of the data reduced the number of data points per comparison and did not improve the model fit, so these predictors were removed from the analysis.
Difference plots and the function get_differences from the itsadug package were used to investigate whether there were significant differences between the groups. The difference plots reveal that there are differences between the gaze patterns of children with CIs and those of children with NH for object questions, but not for subject questions and passive questions (see Supplemental Material). For object questions, the looks toward the target pictures increase later and less steeply for children with CIs than for children with NH. This indicates that children with CIs needed more time to revise the incorrect interpretation and were less certain than children with NH.

Disambiguation Cue
We performed a second analysis to see whether children's gaze patterns differed with respect to disambiguation cue. The gaze patterns for object questions per type of cue are visualized for both groups of children separately (see Figure 6). Both groups of children initially show a preference for the incorrect interpretation for the AgrCa and Agr conditions (more looks toward the competitor picture than toward the target picture before the sentence offset). For children with CIs, we see a less clear but similar preference for the incorrect interpretation for the Case condition, whereas children with NH do not show a preference for the incorrect interpretation for the Case condition at all. Rather, their looks toward the target picture increase slowly but immediately.
A second GAMM model was made with the same dependent variable as in the first GAMM model, but including only the data points of the object questions. Participant was included as a random effect factor. One predictor was made containing all combinations of interactions between group (children with CIs vs. children with NH) and type of cue (Case vs. Agr vs. AgrCa) to see whether there were differences within and between the groups for each type of disambiguation cue.
Figure 5. Children with cochlear implants' (CIs; dashed line) and children with normal hearing's (NH; solid line) online gaze behavior for subject questions (left), object questions (middle), and passive questions (right). The plots show separate lines for looks toward the target picture (red lines) and looks toward the competitor picture (blue lines). The vertical lines indicate the onset of the verb, the onset of the second noun phrase (NP)/past participle (PP), and the offset of the sentence. The horizontal gray lines indicate a significant difference between children with CIs' and children with NH's gaze patterns analyzed with the statistical model described in the GAMM section.
To see whether the observed differences were significant, difference plots were made (see Supplemental Material). For children with CIs, there were no significant differences in looks between object questions disambiguated by Case, Agr, or AgrCa. In all conditions, they initially look toward the competitor picture and later toward the target picture. For children with NH, there were significant differences between object questions disambiguated by Case and the other two conditions Agr and AgrCa. Whereas for AgrCa and Agr, children with NH initially look toward the competitor picture and later toward the target picture as children with CIs do, for the Case condition, they do not look toward the competitor picture and look toward the target picture earlier. So for the Case condition, the gaze pattern of children with CIs differed significantly from that of children with NH.

Word Order
A third analysis was performed in which a comparison was made for children with CIs between subject questions and object questions with respect to the looks toward the agent-first picture. Because the gaze plots in Figure 6 suggest that children with CIs do not pick up the case disambiguation cue on the first NP, it is interesting to take a closer look at the gaze patterns of other questions that are disambiguated early in the sentence. If children ignore or do not perceive the case cue on the wh-phrase, the gaze patterns of subject questions disambiguated by nominative case (welcher) should initially be the same as the gaze patterns of object questions disambiguated by accusative case (welchen). In other words, for both types of questions, the interpretation should be driven by word order, and therefore, the same increase of looks toward the agent-first picture is expected for subject questions as for object questions. The plots in Figure 7 show the gaze data of children with CIs for subject questions and object questions per type of cue.
Again, a GAMM model was made to analyze whether the gaze patterns between subject questions and object questions were initially the same or different. For this model, only the data points of the subject questions and object questions of children with CIs were included. We were interested in looks toward the agent-first pictures, because those are the pictures at which children are expected to look if they ignore or do not perceive case morphology and base their interpretation on word order only. Therefore, as a dependent variable, the difference between the sum of looks toward the agent-first picture minus the sum of looks toward the patient-first picture for time bins of 200 ms was calculated. One predictor was made containing all combinations of interactions between type of question (subject questions vs. object questions) and type of cue (Case vs. Agr vs. AgrCa) to see whether there were early differences between subject and object questions for each type of disambiguation cue. As a random effect factor, participant was included.
In the Case condition, the looks toward the agent-first picture increase in object questions, but for a shorter period than in subject questions (see also Figure 7). For questions disambiguated by Agr and AgrCa, the lines of looks toward the agent-first picture in subject questions and object questions diverge at a later moment in the sentence, just before the offset of the sentence.
Summarizing, the online gaze data show that the interpretation of both groups of children changes from an agent-initial interpretation to a patient-initial interpretation during the processing of object questions and passive questions. Children with CIs were slower than children with NH in revising their initial interpretation for object questions, but not for passive questions. Moreover, significant differences were found with respect to disambiguation cue for children with NH. They interpreted object questions that were disambiguated by verb agreement only or also by case on the second NP initially as subject questions, but they did not do so for object questions that were disambiguated by case on the first NP. Children with CIs interpreted not only object questions disambiguated by verb agreement only or also by case on the second NP initially as subject questions but also object questions disambiguated by case on the first NP. In this respect, children with CIs' gaze patterns differ from those of children with NH. However, children with CIs did not completely ignore the case cue on the first NP, as looks toward the agent-first picture were different for subject questions than for object questions already before the onset of the second cue.
Figure 6. Children with cochlear implants' (CIs; left plot) and children with normal hearing's (NH; right plot) online gaze behavior for object questions. The plots show separate lines for looks toward the target picture (red lines) and competitor picture (blue lines) per type of cue: Case (solid lines), Agr (dashed lines), and AgrCa (dotted lines). Case = case disambiguation; Agr = agreement disambiguation; AgrCa = agreement and case disambiguation. The vertical lines indicate the onset of the verb (V), the onset of the second noun phrase (NP), and the offset of the sentence. The gray horizontal lines indicate a significant difference between the types of cues (only for children with NH significant differences were found).

Schouwenaars et al.: Processing of Wh-Questions by Children With CIs 399

Discussion
In this section, we will discuss the three research questions of this study one by one.

Perception of Morphosyntactic Cues
First we investigated whether children with CIs perceive the morphosyntactic cues of case and verb agreement. We hypothesized that children with CIs perceive morphosyntactic cues less well than children with NH, because these cues are perceptually not very salient. Case forms such as der "the-NOM" and den "the-ACC" or welcher "which-NOM" and welchen "which-ACC" have been argued to be difficult to discriminate perceptually; the same holds for agreement morphology with respect to number (e.g., the third-person singular -s morpheme) on the verb (Hennies, Penke, Rothweiler, Wimmer, & Hess, 2012).
Based on the results of the case screening task, we found that eight children with CIs made between three and six errors out of 16 items. These children with CIs failed to perceive the difference in case between two distinct items of a pair, as most of these errors were in the condition in which the pairs differed with respect to case. Based on the results of the verb agreement screening test, we found that nine children with CIs made between three and 10 errors out of 16 items. Five children failed on both screening tests, and seven failed on either the case or the verb agreement screening test. So in total, 12 out of the 33 children with CIs had problems perceiving the morphosyntactic cues.
Factors that influenced the perception of morphosyntactic cues of children with CIs were chronological age, age at implantation, and hearing aid experience. In general, the perception of morphosyntactic cues was lower for younger children than for older children. This suggests that younger children who have problems perceiving the cues may overcome these problems over time. Furthermore, the perception of children that were implanted at a younger age was better than that of children that were implanted at an older age. This is in line with previous research that found age-at-implantation effects (e.g., Kirk et al., 2002; Nicholas & Geers, 2007; Lesinski-Schiedat et al., 2004; Tomblin et al., 2005). This supports the idea that early implantation leads to better linguistic input in the sensitive period for phonological development (Ruben, 1997), improving phonological perception. But note that recent studies have shown that the sensitive period for auditory perception in children with CIs lies in the first 3-4 years, when the central auditory pathways show maximal plasticity (see Kral & Sharma, 2012, for an overview). Because the children in our study were all implanted before the age of 3 years, based on these studies, the children should have caught up. It is not unlikely that the lack of input in children with a CI not only causes a delay in auditory development but consequently also affects higher-level linguistic development.
Figure 7. Children with cochlear implants' (CIs) online gaze behavior for subject questions (dotted lines) and object questions (solid lines) that are disambiguated by case, verb agreement, or verb agreement and case (from left to right). The plots show separate lines for looks toward the agent-first picture (orange lines) and looks toward the patient-first picture (purple lines). The vertical lines indicate the onset of the verb (V), the onset of the second noun phrase (NP), and the offset of the sentence. The horizontal gray lines at the bottom of the graphs indicate a significant difference between gaze patterns of subject questions and gaze patterns of object questions analyzed with the statistical model described in the GAMM section. Case = case disambiguation; Agr = agreement disambiguation; AgrCa = agreement and case disambiguation; Subj qs = subject question; Obj qs = object question.
A third main effect was found for hearing aid experience. In the total sample of 33 children, only five children did not have hearing aid experience. On average, these five children performed lower than the children who did have hearing aid experience. This too supports the idea of a sensitive period for phonological development. However, this suggestion needs to be taken with caution, because the number of children who did not have hearing aid experience is small and because the reason why they did not receive a hearing aid is not completely clear. It is likely that they did not use a hearing aid because their perception abilities were already too low and that it is this lower hearing ability that could explain the effect.
The number of children who failed on the screening tests is considerably higher in the group of children with CIs (12 out of 33) than in the group of children with NH (2 out of 36).¹ In summary, two thirds of the children with CIs in our study could perceive morphosyntactic cues, whereas about one third of our group of children with CIs already have problems with morphosyntactic cues on a perceptual level. An impaired perception of morphosyntactic cues will likely further affect the grammatical development of these children (Caselli et al., 2012; Geers et al., 2003; Nikolopoulos et al., 2004; Schorr et al., 2008; Young & Killen, 2002).

Use of Morphosyntactic Cues for the Interpretation of Which-Questions
Children's final offline interpretations answer the question of whether children with CIs make use of morphosyntactic cues for their interpretation of which-questions to the same extent as children with NH. We hypothesized that children with CIs make less use of morphosyntactic cues than children with NH, even though this selected group can perceive these cues (as assessed with the screening tasks), because their grammatical development may be delayed or impaired due to insufficient linguistic input in terms of quantity and quality. If children make use of disambiguation cues such as case and verb agreement, they will interpret both subject questions and object questions correctly, as well as passive questions. If children do not make use of these morphosyntactic cues but instead rely on other information such as word order, we expect interpretation errors on object questions but not on subject questions, as object questions violate the agent-first bias and subject questions can be interpreted correctly with an agent-first strategy. Like object questions, passive questions also violate the agent-first bias. If children always rely on word order, we expect interpretation errors on passive questions as well. If instead children do rely on passive morphology to overwrite the agent-first bias, we expect children to correctly interpret passive questions.
Based on accuracy scores on the picture selection task, we found that children with CIs, who could perceive the morphosyntactic cues, still make less use of morphosyntactic cues than children with NH. Whereas they perform at ceiling on subject questions and passive questions, like children with NH, they show lower performance on object questions than children with NH (72% vs. 88% correct). Children with CIs therefore relied more on word order than on morphosyntactic cues when interpreting object questions. However, they did not strongly rely on word order in all question types in which the agent does not come first, as passive questions were correctly interpreted (which is in line with results reported in Ruigendijk & Friedmann, 2017).
The explanation that children with CIs misinterpret object questions because they do not perceive the relevant cues is unlikely, as these children were screened on their perception of case and verb agreement. Furthermore, the performance on the case screening task and agreement screening task did not predict the performance on the object questions. Thus, children with CIs who can perceive the morphosyntactic cues are not always able to use them successfully. Perception in phrases consisting of a determiner and a noun only, as tested in the case screening test, may, however, be crucially different from perception of the same information in a complete sentence, as in the which-questions tested in our main experiment. The screening test of verb agreement did consist of complete sentences. However, these sentences were simple declarative sentences with a canonical word order. Even if the perception of the cues per se is not impaired, this does not mean that this perception can be successfully used during sentence interpretation. Problems in role assignment affecting sentence interpretation by children with hearing impairments have been reported in previous studies for various languages (Friedmann & Szterman, 2006; Ruigendijk & Friedmann, 2017; Volpato, 2012).
Another explanation for the interpretation problems by children with CIs may be the lack of language input during a so-called sensitive period. The sensitive period for phonological development is argued to end around the end of the first year of life, whereas the sensitive period for syntactic development is argued to end around age 15 to 16 (Neville, Mills, & Lawson, 1992; Ruben, 1997; but see Friedmann & Szterman, 2011, who suggest that the first year of the child's life is crucial for syntactic development, and see Kral & Sharma, 2012, who argue that the first 3 to 4 years are crucial for auditory perception; note that they did not examine phonological development per se). Nevertheless, the periods are interdependent, in the sense that insufficient input of phonology in the very early years of life may result in impaired syntactic development (see above). Our results are in line with previous research that has found that limited exposure to language during the first years of life can have drawbacks for the acquisition of syntax by children with hearing impairments (de Villiers, de Villiers, & Hoban, 1994; Delage & Tuller, 2007; Friedmann & Szterman, 2006; Geers & Moog, 1978). Note, however, that the children in the current study were implanted at a younger age than in most of the studies mentioned above.
¹ The two children with normal hearing who did not pass the screening tests both made three errors out of 16 items in one of the screening tests.
Related to the sensitive period discussion is the age-at-implantation effect found for the performance on the screening test in the current study and in previous studies (Kirk et al., 2002; Nicholas & Geers, 2007; Lesinski-Schiedat et al., 2004; Tomblin et al., 2005). Children who are implanted after the sensitive period may have more problems with language. We did not find an age-at-implantation effect for the performance on the which-questions comprehension task. However, we did find an effect for hearing age (i.e., the chronological age minus the age at implantation), which is indirectly related to age at implantation, as the children who were implanted at a later age had a lower hearing age. Moreover, the children were selected on their performance on the screening tests, for which age at implantation was a significant factor. It is possible that this explains why we did not find an age-at-implantation effect for the performance on which-questions.
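The definition of hearing age given above (chronological age minus age at implantation) can be illustrated with a minimal sketch. It assumes ages written in the years;months notation used in this article; the function names and the two example children are hypothetical and are not taken from the study's participant data.

```python
def parse_age(age: str) -> int:
    """Convert an age written as 'years;months' (e.g. '9;07') to months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


def format_age(total_months: int) -> str:
    """Convert a number of months back to the 'years;months' notation."""
    years, months = divmod(total_months, 12)
    return f"{years};{months:02d}"


def hearing_age(chronological_age: str, age_at_implantation: str) -> str:
    """Hearing age = chronological age minus age at implantation."""
    months = parse_age(chronological_age) - parse_age(age_at_implantation)
    if months < 0:
        raise ValueError("age at implantation exceeds chronological age")
    return format_age(months)


# Hypothetical children (illustration only): the same chronological age,
# but different ages at implantation.
early_implanted = hearing_age("9;07", "1;02")
late_implanted = hearing_age("9;07", "3;01")
print(early_implanted, late_implanted)
```

For a fixed chronological age, a later implantation necessarily yields a lower hearing age, which illustrates why the two measures are related in a sample like this one.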
As briefly mentioned above, hearing age was a significant predictor for children's comprehension of which-questions. For children with NH, this means that the older the child, the better their comprehension of which-questions (see also Metz et al., 2010, for Dutch which-questions). For children with CIs, this means that the longer the exposure time to linguistic input, the better their comprehension of which-questions. Note that the children with longer exposure to linguistic input, in general, also were older. Children with a lower hearing age have had less language experience. However, the question remains whether and how this might relate to the lower accuracy scores on object questions. One explanation would be that these children have encountered fewer sentences in which the object precedes the subject or fewer sentences in which case or verb agreement carries important distinctive information for the meaning of a sentence. The effect of hearing age we found suggests that younger children or children who had fewer years of hearing experience may get a better comprehension of object questions over time. Note that effects of hearing age and age at implantation are difficult to disentangle, especially in relatively small groups. Therefore, it could be either hearing age, age at implantation, or both that explains performance on the comprehension task.
Our results suggest that children's comprehension of object questions is associated with working memory as measured by backward digit span. Children with a lower digit span misinterpreted object questions more often than children with a higher digit span. Overall, children with CIs had a lower digit span than children with NH. This group difference is in line with previous studies that have found that children with CIs score lower on working memory and other executive functioning tasks (e.g., van Wieringen & Wouters, 2015), not only on auditory verbal tasks but also on visual nonverbal tasks (Kronenberger, Pisoni, Henning, & Colson, 2013). The children with CIs may score lower on the working memory task because they may have deficits in domain-general sequential processing (Conway, Pisoni, & Kronenberger, 2009) or because their verbal representations may be underspecified, which affects general processing efficiency and speed and thus working memory (Pisoni et al., 2011). Although we found a correlation between working memory and processing object questions in children with CIs, our findings do not allow us to distinguish between a scenario in which the children's difficulty with the interpretation of object questions arises as a result of working memory limitations, or the inverse scenario in which the children's working memory limitations arise as a result of reduced access to auditory experiences (Conway et al., 2009). More research is needed to identify the exact mechanisms underlying the lower working memory scores of children with CIs.
Working memory has been argued to be important for revising an incorrect sentence interpretation (Booth, MacWhinney, & Harasaki, 2000; Roberts, Marinis, Felser, & Clahsen, 2007). German children with low digit span scores have more difficulties in comprehension of object relative clauses than children with high digit span scores (Arosio et al., 2012). Our results support the idea that working memory is related to the ability to revise an incorrect interpretation in object questions. However, a high working memory capacity is not necessary for the ability to revise an incorrect interpretation per se. Children with smaller working memory capacities do not have problems revising an incorrect interpretation in passive questions and score above 95% correct (see also the discussion below). This is an important issue for future research.
The good comprehension of passive sentence structures is in line with previous research on German-speaking children with hearing impairment (Ruigendijk & Friedmann, 2017), but in contrast with previous research on English-speaking children with hearing impairment (Nolen & Wilbur, 1985; Power & Quigley, 1973). The structure of passive sentences and the number and perceptual saliency of their cues are similar between German and English and unlikely to cause differences in comprehension. More likely, the differences in performance between the German and English studies are due to a higher degree of hearing loss of the participants in the English studies than in the German studies. Also, the English studies are much older, and since then, CI technology has been further developed, which makes it hard to directly compare these studies to recent studies.

Online Processing of Subject, Object, and Passive Questions
The third research question in this study was to what extent morphosyntactic cues affect the processing of subject, object, and passive questions by children with CIs in comparison with children with NH. We hypothesized that children with CIs, who can perceive the morphosyntactic information, rely more on word order cues than on morphosyntactic cues and therefore process object questions differently from children with NH. We expected them to show stronger preferences for the agent-first interpretation and delayed looks toward the target picture in object questions.
The eye gaze data confirm the hypothesis and show differences between children with CIs and children with NH in the processing of object questions. Whereas children with CIs process subject questions and passive questions in the same manner as children with NH, they process object questions differently from children with NH, although they could perceive the relevant morphosyntactic cues and their offline interpretation was still quite good.
One way in which they differ is the timing. In object questions in general, the looks toward the correct interpretation are later for children with CIs than for children with NH. This indicates that children with CIs need more time to process the morphosyntactic cues of case and verb agreement. This delay may again be due to insufficient quantity and quality of input, resulting in less automatized and thus slower processing. Similar effects of degraded speech on complex sentence structures have been found for the processing of speech in noise by adults with NH (Carroll & Ruigendijk, 2013). Alternatively, this delay may be caused by a reduction in working memory capacity and hence less capacity to revise an initially incorrect interpretation for an object question, which may have slowed down processing. However, it is unclear whether the reduction in working memory capacity is a cause or a consequence of language performance. This is an interesting question for further research.
Children with CIs and children with NH specifically differ from each other in their processing of object questions that are disambiguated by case at the wh-phrase. Whereas children with NH rely on the case marking cue on the sentence-initial wh-phrase and do not show a preference for the agent-first interpretation, children with CIs do not appear to rely on the case marking cue and show an initial preference for the agent-first interpretation. So unlike children with NH, children with CIs do not use the case cue on the wh-phrase immediately, although they can perceive it. Nevertheless, they do not completely ignore the early case cue, as gaze patterns of subject questions disambiguated by case diverge from those of object questions disambiguated by case already before the second case cue on the second NP. The gaze patterns suggest that children with CIs are less certain of the perceived accusative case cue on the sentence-initial wh-phrase in object questions. This uncertainty may partially be caused by a lack of perceptual saliency of the case cue, but more likely it is caused by an insufficient quality and quantity of linguistic input during language development, resulting in less automatized processing. Children with CIs thus need more time to process the initial case cue than children with NH. For object questions disambiguated by verb agreement alone or by verb agreement and additional case marking on the second NP, no differences in gaze patterns were found between children with CIs and children with NH. In these conditions, children with NH also initially misinterpret object questions and revise their interpretation later in the sentence.
In this study, we have found a poorer use of morphosyntactic cues in the comprehension and processing of which-questions by children with CIs who can perceive these cues than by children with NH, leading to problems in understanding wh-questions. Morphosyntactic cues play an important role not only in the understanding of wh-questions but also in the understanding of other complex sentence structures, for example, topicalized sentences or relative clauses. A poor understanding of complex sentence structures may hinder good communication and affect performance at school and in other everyday situations.
To conclude, children with CIs have problems in language comprehension and processing. Even if children with CIs perceive case and verb agreement, their ability to use these cues for the comprehension and processing of complex sentence structures still lags behind normal development, which is reflected in lower performance rates and longer processing times. Individual variability within the CI group can partly be explained by working memory and hearing age.
Note. CI = cochlear implant; M = male; F = female; y = yes; n = no; AB = Advanced Bionics.