Predicting Literacy Skills at 8 Years From Preschool Language Trajectories: A Population-Based Cohort Study

Purpose: This article explored the predictive values of three main language delay (LD) trajectories (i.e., persistent, late onset, and transient) across 3–5 years on poor literacy at 8 years. Additionally, the effect of gender was assessed, using both gender-neutral and gender-specific thresholds. Method: The data comprised mother-reported questionnaire data for 8,371 children in the Norwegian Mother, Father, and Child Cohort Study. Analyses were conducted using binary logistic regression in SPSS to make predictions about risk. Results: LD reported at preschool age was associated with excess risk of poor literacy at 8 years with odds ratios ranging from 3.19 to 9.75 dependent on trajectory, persistent LD being

Most young children quickly attain high efficiency in literacy at school. Nevertheless, around 5%-10% of children struggle with reading and writing despite adequate learning opportunities (Landerl & Moll, 2010). Poor literacy skills in the first school years tend to have a negative impact on children's later academic and social development (Conti-Ramsden et al., 2009, 2013), and early identification of children at risk is thus paramount to minimize subsequent poor literacy and academic problems. One important risk factor for literacy problems is delayed language development at preschool age (e.g., McLeod et al., 2019; Snowling & Hulme, 2012; van Viersen et al., 2018). Oral language skills at preschool age have been found to be powerful predictors of subsequent literacy outcomes in children with impaired language development (e.g., Bleses et al., 2016; Catts et al., 2002, 2014; Hulme et al., 2015; Johnson et al., 1999; Law et al., 2012; Morgan et al., 2011; Paul et al., 1997; Rescorla, 2002, 2005, 2009, 2013; Snowling et al., 2000, 2016; van Viersen et al., 2018). Catts et al. (2002), for example, found that oral language skills of children with language impairment, measured at kindergarten, accounted for a unique amount of variance in reading skills at Grade 2 and Grade 8 (Catts et al., 2014).
Increased risk for poor literacy in school-aged children with late-emerging language (i.e., identified between 20 and 34 months) has been reported by Paul et al. (1997) and Rescorla (2002, 2005, 2009, 2013). These studies showed that most late-talking toddlers (84%-94%) outgrew their language delay (LD). Nevertheless, late talkers with language skills in the normal range at a later age were found to still be at risk for literacy problems at school age (Rescorla, 2002, 2005, 2009, 2013). This phenomenon has been referred to as illusory recovery (Scarborough & Dobrich, 1990). Based on this longitudinal evidence, Rescorla (2002, 2005, 2009, 2013) has argued for a notion of a language endowment spectrum/dimension, which was previously suggested by Bishop and Edmundson (1987) and Leonard (1991). According to this dimensional account of language ability, children with short-lived as opposed to persistent language difficulties differ along a quantitative dimension; regardless of whether their language problems persist or not, late talkers have below-average language endowment and are therefore likely to continue to be weaker in language- and literacy-related skills than their peers without a history of LD.
a Department of Child Development, Norwegian Institute of Public Health, Oslo, Norway
A number of important questions regarding the links between language and literacy remain to be examined. Previous longitudinal studies have mainly examined the presence or absence of LD/impairment, and very few studies have taken into account the heterogeneity in language development during childhood. There is abundant evidence for distinct patterns of language development during childhood characterized by typical, transient (i.e., children with LD at an earlier age but normal language development at a later age), late-onset (i.e., normal language development at an earlier age but delayed at a later age), and persistent language problems (Henrichs et al., 2011; Law et al., 2012; McKean et al., 2015; Peyre et al., 2014; Snowling et al., 2016; Zambrana et al., 2014). Only two recent longitudinal studies have evaluated literacy outcomes in children of different developmental trajectories (Janus et al., 2019; Snowling et al., 2016). In both studies, the language trajectories were categorized from a preschool age to an age after school entry, the same time point when the literacy skills were evaluated. As such, the studies involved examining concurrent relations between language and literacy skills. To our knowledge, the impact of preschool trajectories on risk for school-aged literacy problems has not been examined. Modeling language trajectories before school entry would be informative to understand the pathway by which problems in early oral language skills lead to poor literacy. One important aim of this study was to quantify the association between LD trajectories across preschool years and later literacy skills using parent report measures in a population sample.
Parent report has proved to be a practical and effective means of obtaining information about child development (Ebert, 2017; Feldman et al., 2005; Libertus et al., 2015; Sachse & Von Suchodoletz, 2008) and is commonly used for assessment of children's language and literacy for research and screening purposes. Clearly, the literature on the relationship between oral language and literacy will benefit from consideration of the language and literacy skills derived from parent reports.
Previous studies have not considered a specific gender hypothesis regarding language and literacy links. Clear gender differences in childhood language development have been observed, and a consistent female advantage has been reported when children are very young (e.g., Baye & Monseur, 2016; Berglund et al., 2005; Eriksson et al., 2012; Simonsen et al., 2014). In research on language development of Norwegian children, a significant gender difference in favor of girls has, for example, been reported by Richter and Janson (2007) using the Norwegian version of the Ages and Stages Questionnaires (ASQ; Janson & Smith, 2003) and by Simonsen et al. (2014) using the Norwegian version of the MacArthur-Bates Communicative Development Inventories (Kristoffersen & Simonsen, 2012). Both studies indicated a need for gender-dependent norms. Findings about the effect of gender on literacy, however, are rather mixed. While a significant body of research found that girls outperformed boys on literacy measures (e.g., Baye & Monseur, 2016; Marks, 2008), other studies have reported no such gender gap (e.g., Barron et al., 2006; Siegel & Smythe, 2005; Vlachos & Papadimitriou, 2015). Additionally, many studies have reported higher incidence of reading difficulties and of language impairment in boys (e.g., Hawke et al., 2007; Rutter et al., 2004; Tomblin et al., 1997), with male-to-female ratios ranging from 1.3:1 to 5.9:1 for language impairment (Webster & Shevell, 2004), and from 2:1 to 15:1 for reading difficulties (Hawke et al., 2009). An alternative view considers the gender difference to reflect a methodological artifact arising from referral bias (e.g., Prior et al., 1995; Shaywitz et al., 2008). Against this background, it seems surprising that gender differences have been largely ignored in previous studies of the association between oral language and literacy.
To our knowledge, only two longitudinal studies have examined the gender effect on the predictive relationship between oral language and literacy (Bleses et al., 2016; Hohm et al., 2007). Both studies showed that the predictive significance was gender dependent. However, Hohm et al. (2007) found a higher predictive value in girls than in boys, while Bleses et al. (2016) found exactly the opposite. Taking into account the above findings, this study aims to assess the effect of gender on the association between oral language and literacy. In particular, we will apply both gender-neutral and gender-specific cut-points on language and literacy measures to explore whether the potential gender differences depend on cutoff values.
An additional point of consideration for the association between oral language and literacy is the inclusion of confounders. Most previous studies have not adequately adjusted for potential confounders that might explain this association. An examination of the literature indicates that the number and types of covariates adjusted for vary greatly from one study to another. NICHD Early Child Care Research Network (2005), for example, included only one control measure, namely, maternal verbal intelligence. Hohm et al. (2007), on the other hand, controlled for a comprehensive list of pre- and perinatal risk factors (e.g., prenatal complications, preterm labor, low birth weight) and psychosocial risk factors (e.g., low parental education, psychiatric disorders, single-parent family, unwanted pregnancy). It is well known that many different biological and environmental factors influence the relationship between early language skills and reading (Hayiou-Thomas et al., 2010). Based on a systematic review, Chaimay et al. (2006) claimed that the following factors should be considered as confounding factors in language development: antenatal care, Apgar scores, birth weight, premature delivery, birth order, parental education, gender of the children, and family history of specific language impairment. Furthermore, socioeconomic status and family history of dyslexia have been identified as risk factors for literacy problems (e.g., Sirin, 2005; Snowling et al., 2016); psychosocial indicators, such as temperament and behavior characteristics, have been argued to be confounded with language ability (Zubrick et al., 2007). It is therefore important to adjust for potential confounders that might contribute to the observed relationship in order to have a more accurate estimate of the predictive value of oral language skills on later literacy.

Current Study
The data utilized in this study come from the Norwegian Mother, Father, and Child Cohort Study (MoBa). MoBa is a large-scale population-based prospective pregnancy cohort study with the primary goal to detect causes of diseases through estimation of specific exposure-outcome associations among the children and their parents (Magnus et al., 2006, 2016). In addition to examining exposure-outcome associations while adjusting for a number of relevant potential confounders, the study design also provides an opportunity to explore the emergence of developmental problems (i.e., timing) and the importance of different child developmental trajectories on their literacy skills.
We first categorized different LD trajectories from 3 to 5 years and then performed binary logistic regression analyses to make predictions about risk. We also explored gender effects on the association between oral language and later literacy. Based on previous literature presented above, we hypothesize that (a) the presence of LD, whether recovering, late-onset, or persistent, at preschool age is associated with elevated risk for poor literacy at 8 years; (b) the association between LD and literacy is stronger in children with persistent LD than children with LD that are either late onset or transient during preschool age; and (c) the predictive significance is gender dependent, with higher odds for poor literacy in girls than in boys.

Participants
The sample comprised children participating in the MoBa. Pregnant women were recruited from hospitals and maternity units all over Norway from 1999 to 2009, and 41% of invited women consented to participate. Consenting women received three questionnaires during pregnancy: in Gestational Week 17, Week 22, and Week 30. They later received questionnaires after delivery, when their child was aged 6 months, 18 months, and 3, 5, 7, and 8 years (questionnaires are available at http://www.fhi.no/moba). Data collection is still ongoing. The cohort comprises 114,500 children, 95,200 mothers, and 75,200 fathers. Data are linked to information from the Medical Birth Registry of Norway (MBRN), which provides mandatory notification on all births in Norway. The establishment of MoBa and initial data collection were based on a license from the Norwegian Data Protection Authority and approval from the Regional Committees for Medical and Health Research Ethics. The MoBa cohort is regulated under the Norwegian Health Registry Act. The current study was approved by the Regional Committees for Medical and Health Research Ethics.
This study used data based on the quality-assured MoBa files released in 2015. The 8-year data collection was still ongoing at the time, and the 8-year questionnaire has three different versions: A, B, and C. The present analyses used Version C of the 8-year questionnaire (N = 11,838), the only version where the literacy measure was included. Of these, only children with returned questionnaires at ages 3 and 5 years were included, resulting in a sample size of 8,731 children. To explore potential selection bias, we compared baseline characteristics between participants included in the study sample and those excluded using t tests (all variables were treated as continuous traits). For about half of the variables compared, there were no significant differences between the two samples. Very modest differences (Cohen's ds ranging from 0.04 to 0.13) were observed for (a) child birth weight, (b) having a non-Norwegian-speaking parent, and (c) parental age and education (see Appendix).

Oral Language
At both ages 3 and 5 years, children's oral language skills were reported by their mothers on the ASQ communication scales (Squires et al., 1999). The ASQ is a parent-completed developmental screening tool with good validity, reliability, and accuracy (Squires et al., 1999). The communication scales include questions pertaining to both receptive and expressive language skills, allowing assessment of language and communication development from 1 to 60 months. The questionnaires were translated into Norwegian by Janson and Smith (2003) and have been found to be effective diagnostic tools of developmental delay for Norwegian children (Richter & Janson, 2007). The communication scales of ASQ have been used as language assessment tools to identify LDs and track language trajectories in previously published MoBa-based research (Helland et al., 2018; Schjølberg et al., 2011; Wang et al., 2014; Zambrana et al., 2014).
In the MoBa 3-year questionnaire, we used a six-item language scale composed of four items from the ASQ 3-year communication subscale (e.g., "Does your child make sentences that are three or four words long?"), one item from the ASQ 18-month scale (i.e., "Without showing him/her first, does your child point to the correct picture when you say, 'Where is the cat' or 'Where is the dog'?"), and one item from the ASQ 48-month scale (i.e., "Can your child tell you at least two things about an object he/she is familiar with?"). The two items, which were developed either for younger or older children, were included in the 3-year questionnaire to more reliably assess language development at the extreme ends of the distribution. For the 5-year questionnaire, the six original communication items from the ASQ 5-year scale were used (e.g., "Without giving your child help by pointing or repeating directions, does your child follow three directions that are unrelated to one another?"). All items from ASQ have three response options (1 = yes, 2 = sometimes, 3 = not yet), and the scores were converted according to the ASQ manual (1 = 10, 2 = 5, 3 = 0). Total ASQ scores at ages 3 and 5 years both included the full range of possible scores (0-60; M = 56.63, SD = 5.56 at age 3 years; M = 56.52, SD = 5.73 at age 5 years). The ASQ scales at both age points had excellent internal consistency measured with polychoric ordinal alphas (cf. Gadermann et al., 2012) of .93 and .91, respectively.

Literacy
Children's literacy skills at 8 years were measured by five items from the subscale on Writing within the Communication domain in the Vineland Adaptive Behaviour Scale-II (Sparrow et al., 2005). One item targets letter identification ("Identifies all lowercase printed letters and uppercase of the alphabet"), two items target reading ("Reads simple stories aloud, with ease" and "Reads and understands texts suitable for 7-8 year olds, e.g. simple children's books, cartoon"), and the remaining two items tap into writing competence ("Writes simple information/messages at least three sentences long"; "Writes reports, papers, or essays at least one page long"). Like the ASQ, all items have three response options (1 = yes; 2 = sometimes or partially; 3 = no, never). The scores were converted in the same way as the ASQ items. The literacy score included the full range of possible scores in this sample (0-50; M = 43.30, SD = 8.75). The literacy scale had excellent internal consistency with a polychoric ordinal alpha of .90.
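The item-to-score conversion described for the ASQ and Vineland items can be illustrated with a minimal sketch. The function and constant names are hypothetical and are not taken from the study's SPSS syntax; only the mapping (1 = 10, 2 = 5, 3 = 0) and the item counts (six ASQ items, five literacy items) come from the text.

```python
# Response-to-points mapping stated in the text: 1 = yes, 2 = sometimes, 3 = not yet/never.
RESPONSE_POINTS = {1: 10, 2: 5, 3: 0}


def scale_total(responses, n_items):
    """Sum the converted item scores for one child on one scale.

    `responses` holds the raw response codes (1, 2, or 3), one per item.
    `n_items` is 6 for the ASQ communication scales and 5 for the
    literacy scale (hypothetical parameter name).
    """
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    return sum(RESPONSE_POINTS[r] for r in responses)


# A child with five "yes" answers and one "sometimes" on the ASQ.
asq_example = scale_total([1, 1, 1, 1, 1, 2], n_items=6)

# A child with three "yes", one "sometimes", and one "no" on the literacy items.
literacy_example = scale_total([1, 1, 1, 2, 3], n_items=5)
```

Under this mapping the totals move in steps of 5 points, which gives the 0-60 range for the ASQ and the 0-50 range for the literacy scale.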
At age 8 years, all Norwegian children take a mandatory Norwegian reading comprehension test. It is mandatory for teachers to inform the parents if a child scores in the range of concern. Mothers reported the feedback they got from the teachers as to whether (a) their child mastered the subject well; (b) must work more on reading, but teacher was not concerned; or (c) teacher was concerned. This information was used to validate mother-reported child literacy skills at age 8 years. The correlation between the mother-reported Vineland literacy measure and teachers' feedback on children's Norwegian reading comprehension tests was quite high (r = .62), indicating considerable consistency between the two measures.

Covariates
All analyses were adjusted for a number of family- and child-related characteristics that might otherwise have confounded the statistical association between oral language and literacy. Family-related variables including parity and parental age at the child's birth were collected through MBRN. Family income and parental education were collected through a questionnaire completed in early pregnancy. Information about family history of language-related difficulties (0 = no, 1 = yes), including LD, reading and writing difficulties, and speech sound difficulties, was collected through the 5-year questionnaire.
Information about the child at birth was retrieved from the MBRN. Child variables used were gender (1 = girl, 2 = boy), birth weight (low < 2,500 g, 2,500 g ≤ middle ≤ 4,500 g, high > 4,500 g), gestational age in weeks (low ≤ 36, 37 ≤ middle ≤ 38, normal/high ≥ 39), serious congenital malformation including syndromes and neurological disorders detected at birth (1 = no, 2 = yes), and Apgar scores 5 min after birth (coded 1 for > 8, 2 for ≤ 8). Mother's report of child's reduced hearing assessed by a professional at age 5 years or earlier and parental report of having a non-Norwegian-speaking parent (1 = no, 2 = yes) were collected. The child's exact age at the completion of 3- and 5-year questionnaires was calculated. Almost all the covariates correlated significantly with the predictor and/or outcome variables, but the correlation coefficients were all very modest, ranging from .04 to .11, except for the correlation between gender and 8-year literacy, which was .21.

Statistical Analyses
In line with previous studies (e.g., Law et al., 2012;Zambrana et al., 2014), we set the cut-point for LD and poor literacy at 1.5 SDs below the mean. The children were assigned into four LD trajectory groups: no LD (i.e., no LD at either age), transient (i.e., LD at 3 years and no LD at 5 years), late-onset (i.e., no LD at 3 years and LD at 5 years), and persistent LD (i.e., LD at both ages), based on their LD status at 3 and 5 years. Binary logistic regression analysis was performed to explore the risk associated with each LD trajectory (with "no LD" as the reference group) and later literacy problems (0 = good literacy, 1 = poor literacy). All analyses were performed using SPSS Version 23.0 for Windows (IBM Inc, 2015), with a significance level of 5% in all cases.
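The cut-point rule and the four-group classification above can be sketched as follows. This is an illustration only: the function names are hypothetical, the strict "less than" comparison is an assumption (the text says only "1.5 SDs below the mean"), and the cut-points are computed from the rounded pooled means and SDs reported in the measures sections, which may differ slightly from the values used in the actual SPSS analyses.

```python
def cut_point(mean, sd, k=1.5):
    """Score k standard deviations below the mean."""
    return mean - k * sd


def below_cut(score, mean, sd, k=1.5):
    """True if the score falls below the cut-point (strict comparison assumed)."""
    return score < cut_point(mean, sd, k)


def trajectory(ld_at_3, ld_at_5):
    """Map LD status at 3 and 5 years onto the four trajectory groups."""
    if ld_at_3 and ld_at_5:
        return "persistent"
    if ld_at_3:
        return "transient"
    if ld_at_5:
        return "late-onset"
    return "no LD"


# Pooled (gender-neutral) cut-points from the reported means and SDs.
asq3_cut = cut_point(56.63, 5.56)      # 56.63 - 8.34   = 48.29
asq5_cut = cut_point(56.52, 5.73)      # 56.52 - 8.595  = 47.925
literacy_cut = cut_point(43.30, 8.75)  # 43.30 - 13.125 = 30.175

# Hypothetical child: ASQ total of 45 at 3 years and 55 at 5 years.
child_group = trajectory(
    below_cut(45, 56.63, 5.56),
    below_cut(55, 56.52, 5.73),
)
```

Because unimputed totals move in 5-point steps, these cut-points would in practice flag ASQ totals of 45 or lower and literacy totals of 30 or lower, assuming the rounded values above.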

Missing Data
Little's Missing Completely at Random test (chi-square = 3253.5, df = 1567, p < .001) revealed that our data were not missing completely at random. For the 17 language and literacy items, 12 had a missing percentage less than 1%, four had a missing percentage between 1.5% and 3.4%, and one had a missing percentage of 10.4%. Missing values for the ASQ 3-year, ASQ 5-year, and Vineland literacy scales were imputed individually per scale using the SPSS Missing Value Analysis (Expectation Maximization) imputation procedure. Data from respondents with more than three values missing on ASQ and more than two values on Vineland were excluded from the analyses. For the covariates, the numbers of missing values were small (10 ≤ N ≤ 71), so the missing values were replaced by the smallest values in the corresponding response category. For example, the 10 missing values in the congenital malformation variable were replaced by 1 (1 = no). The final sample size after imputation was 8,371, or 95.9% of the 8,731 children with returned questionnaires at 3, 5, and 8 years.

Descriptive Data
The three main language and literacy variables all correlated significantly with each other. Three- and 5-year oral language scores were positively correlated, r(8,368) = .47, p < .001. The correlation coefficients between 8-year literacy and oral language at 3 and 5 years were, respectively, .29 and .32. Table 1 shows the means and standard deviations in language and literacy measures by gender. Girls on average had higher scores than boys on all measures. The gender differences were all significant. However, they represented small or moderate effects (Cohen, 1988). Applying the deficit criteria of 1.5 SDs categorized 5.5% (N = 457) of the children as having LD at age 3 years, 6.4% (N = 533) as having LD at age 5 years, and 10.6% (N = 888) as having poor literacy at 8 years. LD at both age points and poor literacy at 8 years were more prevalent in boys than in girls (7.2% vs. 3.7% at 3 years, 7.3% vs. 5.5% at 5 years, 14.8% vs. 6.4% at 8 years). In this sample, 90.0% (N = 7,537) had no LD, 3.6% (N = 301) had transient LD, 4.5% (N = 377) had late-onset LD, and 1.9% (N = 156) had persistent LD. The boys-to-girls ratios were estimated at 0.98, 1.95, 1.14, and 2.12, respectively, for the four different trajectory groups. Table 2 shows mean ASQ scores (standard deviation) for each trajectory group by gender at both age points. An analysis of variance test with post hoc test (Games-Howell) revealed that there were significant differences in mean scores between all trajectory subgroups at each age point for boys as well as for girls.
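As a reading aid, the following sketch shows how the trajectory counts relate to the age-specific LD counts reported in this paragraph: children with LD at 3 years are the transient plus persistent groups, and children with LD at 5 years are the late-onset plus persistent groups. All counts are taken from the text; the variable and function names are illustrative.

```python
N_TOTAL = 8371  # final analytic sample

trajectory_counts = {
    "no LD": 7537,
    "transient": 301,
    "late-onset": 377,
    "persistent": 156,
}

# LD at a given age is the union of the two trajectories that include that age.
ld_at_3 = trajectory_counts["transient"] + trajectory_counts["persistent"]
ld_at_5 = trajectory_counts["late-onset"] + trajectory_counts["persistent"]


def percent(n, total=N_TOTAL):
    """Prevalence as a percentage, rounded to one decimal place."""
    return round(100 * n / total, 1)


prevalence = {group: percent(n) for group, n in trajectory_counts.items()}
```

The derived counts (457 and 533) and the rounded percentages agree with the figures reported in the paragraph above.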

Association Between LD Trajectories and Poor Literacy
The binary logistic regression analysis showed that all three LD trajectories at preschool age significantly increased the risk for later literacy problems (p < .001 for all). Odds ratio (OR) was used to determine if a given exposure is a risk factor for the outcome. An OR of 1 indicates that the exposure does not affect the odds of outcome, whereas an OR above or below 1 indicates higher/lower odds of the outcome, respectively. In order to explore whether there was a gender effect in the association between LD trajectories and later literacy, we tested a Trajectory × Gender interaction. As the gender interaction effect was significant (p = .001), we performed the logistic regression analyses separately for boys and girls, using a gender-neutral cut-point (i.e., cut-point based on group mean and standard deviation of a pooled sample of boys and girls) for all three major measures. The results revealed a clear gender difference for children with persistent LD and late-onset LD trajectories. Compared to boys in the same trajectory group, the ORs for poor literacy were more than tripled in girls with persistent LD (OR = 18.21 vs. OR = 5.30) and more than doubled in girls with late-onset LD (OR = 4.56 vs. OR = 2.27). The trend remained the same for transient LD, but only with a slightly higher OR in girls than in boys (OR = 3.13 vs. OR = 2.42). These results are summarized in Table 3.
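To make the OR interpretation concrete, the sketch below computes an unadjusted OR with a Wald-type 95% confidence interval from a 2 × 2 table. It is not the adjusted logistic regression model fitted in SPSS for this study; the counts are hypothetical and chosen only for easy arithmetic, and the function name is illustrative.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, ref_cases, ref_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    The interval is built on the log-odds scale using the standard error
    sqrt(1/a + 1/b + 1/c + 1/d) and then exponentiated.
    """
    odds_exposed = exposed_cases / exposed_noncases
    odds_reference = ref_cases / ref_noncases
    ratio = odds_exposed / odds_reference
    se_log = math.sqrt(
        1 / exposed_cases + 1 / exposed_noncases + 1 / ref_cases + 1 / ref_noncases
    )
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical counts (NOT study data): 40 of 100 children in an LD group and
# 100 of 1,000 children in the reference group have poor literacy.
ratio, lower, upper = odds_ratio(40, 60, 100, 900)
# Odds are 40/60 in the LD group and 100/900 in the reference group, so OR = 6.
```

In a logistic regression, the OR for a trajectory group is the exponentiated regression coefficient for that group's indicator variable, which is how covariate-adjusted ORs such as those in Table 3 are obtained.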
In the present sample, the gender-neutral deficit criteria of 1.5 SDs identified more boys with LD and literacy problems than girls. The lower prevalence for both the predictor and the outcome variables in girls may imply that the initial comparisons with gender-neutral cut-points are not fully valid. OR estimates depend on the threshold values decided for both the exposure and the outcome measures. Therefore, a supplementary analysis using gender-specific cut-points (i.e., cut-points based on the distribution of a variable for each gender) for LD and poor literacy was performed to check to what extent this gender disparity was dependent on thresholds. For girls, we applied the same deficit criteria of 1.5 SDs as before and the cutoff score remained the same. For boys, we chose a new cut-point in order to have approximately the same prevalence of LD and poor literacy as in girls. Ideally, we would prefer exactly the same prevalence in boys and girls, but due to the stepwise distribution of the scores, this was not possible. The prevalence rates in boys after applying the more stringent deficit criterion of 2.0 SDs were more similar to those of girls, particularly at 3 years, although still a little higher in boys than girls at 5 and 8 years (3.6% vs. 3.7% for LD at 3 years, 7.3% vs. 5.5% for LD at 5 years, and 8.6% vs. 6.4% for poor literacy). The ORs after adjustment for the covariates increased to 2.55 (95% CI [1.41, 4.59]) in boys with transient LD, to 2.57 (95% CI [1.78, 3.73]) in boys with late-onset LD, and to 9.92 (95% CI [5.96, 16.51]) in boys with persistent LD. After the prevalence differences were compensated for, there remained a trend of higher ORs for girls than for boys, but the gender difference was no longer significant, as indicated by the overlapping 95% CIs in each trajectory group.
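The idea of matching prevalence across genders on a stepwise score scale can be illustrated with a small sketch. This is one possible way to pick a cut score and is not necessarily the procedure the authors followed; the score distribution is invented for illustration, the 3.7% target is the girls' LD prevalence at 3 years reported above, and the function names are hypothetical.

```python
def prevalence_below(scores, cut):
    """Proportion of scores strictly below the cut score."""
    return sum(score < cut for score in scores) / len(scores)


def closest_cut(scores, target_prevalence, candidates):
    """Candidate cut score whose prevalence is nearest to the target."""
    return min(
        candidates,
        key=lambda cut: abs(prevalence_below(scores, cut) - target_prevalence),
    )


# Invented distribution of 100 boys' totals on a 5-point-step scale (NOT study data).
boys_scores = [60] * 70 + [55] * 15 + [50] * 7 + [45] * 4 + [40] * 2 + [35] * 2

# Target: the 3.7% LD prevalence reported for girls at 3 years.
chosen_cut = closest_cut(boys_scores, 0.037, candidates=[40, 45, 50, 55])
achieved = prevalence_below(boys_scores, chosen_cut)
```

With these invented scores, the available cut scores yield prevalences of 2%, 4%, 8%, and 15%, so the target of 3.7% can only be approximated. This mirrors the point made in the text that exact equality of prevalence was not attainable because of the stepwise score distribution.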

Discussion
This study aims to shed light on the predictive relationship between different LD trajectories across preschool age and subsequent literacy problems, as well as gender differences in this relationship. Our data should be particularly informative about this relationship due to the very large sample size, the longitudinal design tracking children from 3 through 5 to 8 years, and the broad set of relevant confounding variables obtained from national registers.
Our results revealed that language skills that were 1.5 SDs below the mean at 3 and/or 5 years significantly increased the risk for poor literacy at 8 years, supporting our first hypothesis. In accordance with our second hypothesis, persistent LD was found to be the strongest predictor of poor literacy. In contrast to the strong disadvantage for the persistent LD group, there is a relatively modest risk of literacy problems for children with transient or late-onset trajectories. However, compared to the reference group of typical language (no LD) at both time points, the odds of poor literacy were still nearly tripled given the presence of LD at either time point. Taken together, the results confirmed the notion that preschool LD and literacy problems are highly associated. In particular, the vulnerability for later literacy problems will increase substantially when LD persists from 3 to 5 years.
Our finding about excess risk of transient LD for poorer literacy at 8 years is consistent with Scarborough and Dobrich's (1990) proposal of illusory recovery and the dimensional perspective on LD (Bishop & Edmundson, 1987; Leonard, 1991). Interestingly, our results showed that the OR of transient LD was similar to that of late-onset LD, either before or after adjusting for covariates. This indicates that, in this sample, children with transient LD, who appeared to have overcome their spoken language difficulties by 5 years, had comparable risk for later literacy problems to children whose LD emerged first at age 5 years. As such, our findings are only partially in agreement with those of Snowling et al. (2016). Using a small sample of children at high risk of dyslexia (due to familial risk or preschool language impairment), their study found that children whose language impairment emerged at 8 years, or whose problems persisted from 3 to 8 years, performed worse than typically developing peers on all literacy-related measures at 8 years. By contrast, the "resolving" group (i.e., language impaired at 5 years, typical language development at 8 years) had relatively good outcomes. A similar finding about a "good" literacy outcome for the transient LD group was reported in Dale et al. (2014). Tracking a sample of children whose early LD had resolved by 4 years, they found the risk for literacy problems in these children to be no higher than in the control group who had equivalent scores in vocabulary and grammar at 4 years. Both studies noted, however, that the language skills of the transient group were in the low normal range when measured after school entry, and that this subclinical weakness could elevate the risk for later literacy problems. Further support for illusory recovery comes from Janus et al. (2019), who tracked children experiencing early speech and language pathologies, and found that children with transient speech and language pathologies were more likely to fall below standard achievement in reading and writing than controls. Overall, the pattern of the results in this study is consistent with previous research showing that preschool LD is strongly linked to poor literacy after school entry, though it could be that, in a particular sample, and by using different assessment measures and definition of LD, the difference in literacy skills between children with transient LD trajectory and the control group may or may not be significant.
Note. Certain covariate variables may be the subjects of other research, so in compliance with MoBa's policy, odds ratios for the covariates were not reported in this article. LD = language delay.
a Adjusted for family income, maternal age, paternal age,* maternal education, paternal education, parity, family history of language-related difficulties, including language delay, reading and writing difficulties,* and speech sound difficulties, child's gender,* birth weight, gestational age in weeks, serious malformation, Apgar scores 5 min after birth, child's exact age at the completion of 3- and 5-year questionnaires, mother's report of child's reduced hearing at 5 years, and having a non-Norwegian-speaking parent.* Effects of covariates marked * reached significance, p < .05.
Seemingly consistent with our hypothesis of gender-dependent predictive significance, a significant gender difference was initially observed in the predictive power of all LD trajectories and most strikingly for the persistent group. The predictive values of the LD trajectories were found to be higher in girls than in boys. This pattern of results seems to suggest that the presence of LD during the preschool years is more detrimental for literacy in girls than in boys. This observation is generally in line with Hohm et al. (2007), but contradicts Bleses et al. (2016). While the mechanisms underlying this gender difference are beyond the scope of this study, our finding suggests that a proportion of the gender difference may depend on choice of cutoff criteria. In our main analyses, a gender-neutral cutoff for all three language and literacy measures was used. As boys had lower scores on the three measures, the uniform deficit criteria of 1.5 SDs identified more boys than girls, resulting in more extreme-scoring girls being included. The much larger ORs for girls than for boys for persistent LD and late-onset LD are consistent with this fact. In our supplementary analysis where a more stringent cutoff was used for boys and prevalence differences in boys and girls were compensated for, much of the gender difference declined and was no longer significant. This suggests that the finding of significant gender differences may be due to a methodological artifact. We thus argue for similar effects of preschool LD trajectories on literacy for boys and girls. More research exploring gender differences in the association between language and literacy is warranted. Future research could further explore the effect of cutoff choice on the relative severity of the gender groups and on the gender difference in the effect of different LD trajectories.

Limitations
Despite numerous strengths, our study has some methodological limitations that should be addressed in future research. First, both the exposure measures and the outcome measure in this study were reliant on maternal report. Some mothers may tend to overreport their child's competence in both language and literacy, and others to consistently underreport it (Bavin et al., 2008). This may have produced a spurious relationship. To the extent that the error terms for 3-year oral language, 5-year oral language, and 8-year literacy are not correlated, this will attenuate the effect sizes. To the extent that they are correlated (i.e., mothers who overreport problems at 3 and 5 years also overreport problems at 8 years, and those who underreport at ages 3 and 5 years also underreport at 8 years), this will inflate the effect estimates. However, we validated the literacy measure completed by mothers against teacher-assessed reading competence and found that correlated errors cannot account for much of the observed effects. Second, independent of maternal ability to report objectively, our instruments are short (only six items for the oral language measure; five items in the literacy measure), and despite their high reliability (i.e., internal consistency), they provide only a broad measure of the complex oral language and literacy concepts, particularly at 5 years. Third, although this study sample is large (N = 8,371), it is unlikely to be representative of the whole Norwegian population. Like all longitudinal studies, the MoBa study suffers from some attrition (Biele et al., 2019; Nilsen et al., 2009). Furthermore, a recruitment bias of the study sample makes the data less representative of low-income families, stressed mothers, parents with low education, children of low birth weight, and children who were exposed to a non-Norwegian language at home.
However, earlier research has indicated that, despite relative differences in prevalence estimates between MoBa participants and the total population, associations between variables are fairly robust in the MoBa (Nilsen et al., 2009). One further weakness is that nonverbal IQ, which previous studies indicate is a potential confounding variable for the oral language and literacy association, was not controlled for in this study. Ekins and Schneider (2006) emphasized the importance of controlling for nonverbal abilities in order to ensure that the variance accounted for can be attributed to oral language skills and not to nonverbal abilities. Clearly, further research addressing these methodological limitations is warranted in the study of the association between oral language and literacy.
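
The reasoning about reporting error in the first limitation can be made concrete with a small simulation. This is a sketch under stated assumptions, not a reanalysis of the study: it uses continuous scores and Pearson correlations rather than the odds ratios reported here, and the true correlation (0.5), the error SD (0.7), and the function names pearson and simulate are all hypothetical choices made for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n=20000, true_r=0.5, error_sd=0.7, seed=1):
    """Compare the language-literacy correlation under three reporting scenarios."""
    rng = random.Random(seed)
    lang_true, lit_true = [], []
    lang_indep, lit_indep = [], []
    lang_shared, lit_shared = [], []
    for _ in range(n):
        lang = rng.gauss(0, 1)
        lit = true_r * lang + sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        lang_true.append(lang)
        lit_true.append(lit)
        # Uncorrelated errors: a separate random error on each report.
        lang_indep.append(lang + rng.gauss(0, error_sd))
        lit_indep.append(lit + rng.gauss(0, error_sd))
        # Correlated errors: the same rater bias enters both reports.
        bias = rng.gauss(0, error_sd)
        lang_shared.append(lang + bias)
        lit_shared.append(lit + bias)
    return {
        "true": pearson(lang_true, lit_true),
        "independent": pearson(lang_indep, lit_indep),
        "shared": pearson(lang_shared, lit_shared),
    }


results = simulate()
for scenario, r in results.items():
    print(f"{scenario:>11}: r = {r:.2f}")
```

In this setup, independent errors pull the observed correlation below the true value (attenuation), whereas a shared rater bias pushes it above the true value (inflation), which is the pattern described in the first limitation.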

Conclusion
Using a population-based cohort, this study confirmed the longitudinal relationship between oral language and later literacy skills. Our findings from LD trajectories showed that the presence of LD across the preschool years significantly increased the risk for later literacy problems, with persistent LD being the strongest predictor. Moreover, our findings about the elevated risk for poor literacy in the transient LD group lend support to the "illusory recovery" hypothesis. Finally, a gender effect on the relationship between LD trajectories and literacy was not supported. The significant gender difference detected when using a gender-neutral threshold was diminished to nonsignificance when a more stringent deficit criterion was used for boys. Therefore, gender appears to play a negligible role in the relationship between oral language and literacy, with the gender difference seemingly representing a methodological artifact.
In conclusion, our findings highlight the necessity of monitoring oral language trajectories across the preschool years to ensure that all children are on the right track for learning at school entry. Although the benefits of early identification of language and literacy problems followed by treatment are widely recognized in the literature (e.g., Nelson et al., 2006), it remains undetermined at what age a focus on detecting LD should be prioritized. Results from this study indicate that early identification of children at risk for poor language may be accomplished from 3 years, a point previously made by Scarborough (2005). Parents of children with delayed language at this age should seek timely intervention and support in order to prevent possible literacy problems in subsequent school years. Even if no LD is identified at 3 years, parents should pay continued attention to the strengths and weaknesses of their children's language, as delayed onset of LD may occur at a later age. The children at risk should be further followed up by teachers at school entry, as previous studies (e.g., Morgan et al., 2011; Paul et al., 1997; Rescorla, 2002; Snowling et al., 2000) indicate that the risk for poor literacy in children with LD may further increase when reading becomes more established if no intervention is provided. As a final note, the gender difference in relation to cutoff criteria suggests that it may not be advisable to use a uniform threshold for boys and girls. Practitioners should consider gender-specific cutoffs in relation to language and literacy measures.