26  Connecting the three dots: An HSB Model

Author

Luke Miratrix

This handout shows (1) a mathematical model, (2) the lmer syntax for fitting that model, and (3) the resulting output, for the model discussed in Lecture 2.4. It is designed to help translate between these three different worlds.

26.1 The mathematical model

Level 1 models: \[ \begin{aligned} y_{ij} &= \beta_{0j} + \beta_{1j} ses_{ij} + \beta_2 female_{ij} + \epsilon_{ij} \\ \epsilon_{ij} &\sim N( 0, \sigma^2_y ) \\ \end{aligned} \]

Level 2 models: \[ \begin{aligned} \beta_{0j} &= \gamma_{00} + \gamma_{01} sector_j + \gamma_{02} meanSES_j + u_{0j} \\ \beta_{1j} &= \gamma_{10} + \gamma_{11} sector_j + u_{1j} \end{aligned} \] with \[ \begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim N \begin{bmatrix} \begin{pmatrix} 0 \\ 0 \end{pmatrix}\!\!,& \begin{pmatrix} \tau_{00} & \tau_{01}\\ & \tau_{11} \\ \end{pmatrix} \end{bmatrix} . \]

The \(\tau_{01}\) is the covariance of the random intercept and random slope. We usually look at the correlation of

\[ \rho = \frac{ \tau_{01} }{ \sqrt{ \tau_{00} \tau_{11} } } . \]

The estimated \(\rho\) is what R gives us in the printed output, rather than \(\tau_{01}\).
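As a concrete illustration of this relationship, here is a short sketch of how one might recover \(\hat{\tau}_{01}\) and \(\hat{\rho}\) from a fitted model; it assumes the model object M1 that we fit in Section 26.3 below.

vc = as.data.frame( VarCorr( M1 ) )   # variance components as a data frame
vc                                    # columns: grp, var1, var2, vcov, sdcor

# Rebuild rho by hand from tau_00, tau_11, and tau_01
tau_00 = vc$vcov[ which( vc$grp == "id" & vc$var1 == "(Intercept)" & is.na( vc$var2 ) ) ]
tau_11 = vc$vcov[ which( vc$grp == "id" & vc$var1 == "ses" & is.na( vc$var2 ) ) ]
tau_01 = vc$vcov[ which( vc$grp == "id" & vc$var1 == "(Intercept)" & vc$var2 == "ses" ) ]
tau_01 / sqrt( tau_00 * tau_11 )      # should match the printed Corr (about 0.65)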

The derivation of the reduced form is:

\[ \begin{aligned} y_{ij} &= \beta_{0j} + \beta_{1j} ses_{ij} + \beta_2 female_{ij} + \epsilon_{ij}\\ &= \left( \gamma_{00} + \gamma_{01} sector_j + \gamma_{02} meanSES_j + u_{0j} \right)+ (\gamma_{10} + \gamma_{11} sector_j + u_{1j}) ses_{ij} + \beta_2 female_{ij} + \epsilon_{ij} \\ &= \gamma_{00} + \gamma_{01} sector_j + \gamma_{02} meanSES_j + u_{0j} + \gamma_{10}ses_{ij} + \gamma_{11} sector_j ses_{ij} + u_{1j} ses_{ij} + \beta_2 female_{ij} + \epsilon_{ij} \\ &= \gamma_{00} + \gamma_{01} sector_j + \gamma_{02} meanSES_j + \gamma_{10}ses_{ij} + \gamma_{11} sector_j ses_{ij} + \beta_2 female_{ij} + \left(u_{0j} + u_{1j} ses_{ij} + \epsilon_{ij} \right) \end{aligned} \] This reduced form is what we will give to lmer() in R’s formula notation.

26.2 How many parameters?

It is useful to be able to identify all the parameters being estimated, which is why I frequently ask you to count the number of parameters. Let’s do that for the above model.

There are generally two kinds of parameters: the regression coefficients and the variance parameters. The regression coefficients are all the parameters that have no letter subscripts, since they are fixed parameters that describe our entire population. All the terms with letter subscripts, e.g., \(\beta_{0j}\), are specific to some group; we would estimate those with empirical Bayes after we fit our model, but we are not estimating them directly when we first fit the model. So in the above model we would have two cluster-specific parameters for each cluster: 160 clusters, so 320 such parameters, none of which are part of our main model.
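Once we fit the model (M1 in Section 26.3), we can look at the empirical Bayes predictions of those cluster-specific terms; a quick sketch:

u_hat = ranef( M1 )$id    # empirical Bayes (BLUP) predictions of u_0j and u_1j
dim( u_hat )              # 160 clusters by 2 columns: (Intercept) and ses
head( u_hat )

These 320 numbers are the cluster-specific pieces referred to above; none of them count as model parameters.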

So in the model above we have one \(\beta_2\) at level 1 (it is the same for all the clusters) and five \(\gamma_{\cdot\cdot}\) parameters at level 2.

For the variances, we often have a level-one residual variance (unless we have a generalized model, such as logistic or Poisson, where there is no level-one variance term), and then the variances of the random effects. Each level has its own variance parameters, and how many depends on the size of the covariance matrix (a 2x2 matrix has 3 parameters, for example: 2 variances on the diagonal and 1 covariance off the diagonal).

In the case above, this would give 1 + 3 = 4 more variance parameters.

The total is therefore 1 + 5 = 6 regression coefficients and 1 + 3 = 4 variance parameters, for 10 parameters overall.
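Once M1 is fit (Section 26.3), we can check this count directly against the model object; a quick sketch:

length( fixef( M1 ) )                   # 6 regression coefficients
nrow( as.data.frame( VarCorr( M1 ) ) )  # 4 variance parameters (tau_00, tau_11, tau_01, sigma^2_y)
attr( logLik( M1 ), "df" )              # 10 parameters in total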

26.3 The lmer code

library( lme4 )   # provides lmer()
library( arm )    # provides display(), used below

M1 = lmer( mathach ~ 1 + female + ses*sector + 
             meanses + (1+ses|id),
           data = dat )

This code fits the exact same model, using the fact that ses*sector expands to ses + sector + ses:sector. I.e., the above is exactly the same as this more explicitly written R code:

M1 = lmer( mathach ~ 1 + sector + meanses + ses + sector:ses + female + (1+ses|id),
           data = dat )

Each term in the expanded formula corresponds to a math symbol in the mathematical model. The (1+ses|id) term makes our random effects, and ties to all the \(\tau\) terms. The residual variance \(\sigma^2_y\) is the only parameter not explicitly listed in the above model.
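Although \(\sigma^2_y\) has no explicit term in the formula, lmer() estimates it automatically; a sketch of the standard extractors for the variance side of the model:

sigma( M1 )      # estimated residual standard deviation, i.e., the square root of sigma^2_y
VarCorr( M1 )    # estimated SDs and correlation of the random effects, i.e., the tau terms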

26.4 The output

display( M1 )
lmer(formula = mathach ~ 1 + sector + meanses + ses + sector:ses + 
    female + (1 + ses | id), data = dat)
            coef.est coef.se
(Intercept) 12.79     0.21  
sector       1.29     0.29  
meanses      3.04     0.37  
ses          2.73     0.14  
female      -1.18     0.16  
sector:ses  -1.31     0.21  

Error terms:
 Groups   Name        Std.Dev. Corr 
 id       (Intercept) 1.45          
          ses         0.18     0.65 
 Residual             6.05          
---
number of obs: 7185, groups: id, 160
AIC = 46482.9, DIC = 46445.1
deviance = 46454.0 

Now, using this output, we have estimates for all our mathematical modeling parameters:

  • \(\gamma_{00} = 12.79\) - The overall average math achievement for a male student (female = 0) with 0 ses in a public school with 0 mean SES.
  • \(\gamma_{01} = 1.29\) - The average difference between otherwise equivalent Catholic and public schools.
  • \(\gamma_{02} = 3.04\) - The change in average achievement associated with school mean SES; higher-SES schools have higher achievement.
  • \(\gamma_{10} = 2.73\) - The average slope of ses vs. math achievement in public schools.
  • \(\beta_2 = -1.18\) - The gender gap; girls have lower math scores on average.
  • \(\gamma_{11} = -1.31\) - The difference in the ses slope between public and Catholic schools (Catholic schools have flatter slopes); see the short calculation after this list.
  • \(\tau_{00} = 1.45^2\) - The variation in the overall intercepts of schools (within sector, public or Catholic, and beyond mean SES).
  • \(\tau_{11} = 0.18^2\) - The variation in the random slopes for ses vs. math achievement.
  • \(\rho = 0.65\) - The random intercepts are positively correlated with the random slopes: higher-achieving schools have a larger gap between low- and high-ses students.
  • \(\sigma_y = 6.05\) - The unexplained student variation within school.
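To see how these estimates combine, consider the estimated ses slope in Catholic schools: \(\hat{\gamma}_{10} + \hat{\gamma}_{11} = 2.73 - 1.31 = 1.42\). A sketch of pulling the same number straight from the fitted model:

fe = fixef( M1 )
fe[ "ses" ]                        # estimated ses slope in public schools (gamma_10)
fe[ "ses" ] + fe[ "sector:ses" ]   # estimated ses slope in Catholic schools (about 1.42)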