library( lme4 )
library( foreign ) ## to load data
library( arm )
library( tidyverse )
20 Extracting information from fitted lmer models using base R
This chapter follows Chapter 19 and provides an alternate set of ways of pulling information from a fitted lmer
model. In particular, this document walks through various R code to pull information out of a multilevel model (and OLS models as well, since the methods generally work on everything). For illustration, we will use a random-slope model on the HS&B dataset with some level-1 and level-2 fixed effects.
We use the following libraries in this file:
Loading the data is simple. We read student and school level data and merge:
dat = read.spss( "data/hsb1.sav", to.data.frame=TRUE )
sdat = read.spss( "data/hsb2.sav", to.data.frame=TRUE )
re-encoding from CP1252
dat = merge( dat, sdat, by="id", all.x=TRUE )
head( dat, 3 )
    id minority female    ses mathach size sector pracad disclim himinty meanses
1 1224        0      1 -1.528   5.876  842      0   0.35   1.597       0  -0.428
2 1224        0      1 -0.588  19.708  842      0   0.35   1.597       0  -0.428
3 1224        0      0 -0.528  20.349  842      0   0.35   1.597       0  -0.428
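Since we load the tidyverse anyway, the same merge can also be written with dplyr's left_join(). This is just a sketch of the equivalent call (not run here, since dat is already merged above):

```r
# left_join() keeps every student row and attaches the school-level
# columns by id -- same behavior as merge( dat, sdat, by="id", all.x=TRUE )
dat = left_join( dat, sdat, by="id" )
```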
20.1 Fitting and viewing the model
Now we fit the random slope model with the level-2 covariates:
M1 = lmer( mathach ~ 1 + ses + meanses + (1 + ses|id), data=dat )
If we just print the object, e.g., by typing the name of the model on the console, we get minimal information:
M1
Linear mixed model fit by REML ['lmerMod']
Formula: mathach ~ 1 + ses + meanses + (1 + ses | id)
Data: dat
REML criterion at convergence: 46561.42
Random effects:
Groups Name Std.Dev. Corr
id (Intercept) 1.6417
ses 0.6731 -0.21
Residual 6.0659
Number of obs: 7185, groups: id, 160
Fixed Effects:
(Intercept) ses meanses
12.651 2.190 3.781
20.1.1 The display() method
The arm package's display() method gives an overview of what our fitted model is:
display( M1 )
lmer(formula = mathach ~ 1 + ses + meanses + (1 + ses | id),
data = dat)
coef.est coef.se
(Intercept) 12.65 0.15
ses 2.19 0.12
meanses 3.78 0.38
Error terms:
Groups Name Std.Dev. Corr
id (Intercept) 1.64
ses 0.67 -0.21
Residual 6.07
---
number of obs: 7185, groups: id, 160
AIC = 46575.4, DIC = 46552.4
deviance = 46556.9
20.1.2 The summary() method
We can also look at the messier default summary() command, which gives you more output. The real win comes if we use the lmerTest library: if we fit our model with that package loaded, our summary() is more exciting and has p-values:
library( lmerTest )
M1 = lmer( mathach ~ 1 + ses + meanses + (1 + ses|id), data=dat )
summary( M1 )
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mathach ~ 1 + ses + meanses + (1 + ses | id)
Data: dat
REML criterion at convergence: 46561.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.1671 -0.7270 0.0163 0.7547 2.9646
Random effects:
Groups Name Variance Std.Dev. Corr
id (Intercept) 2.695 1.6417
ses 0.453 0.6731 -0.21
Residual 36.796 6.0659
Number of obs: 7185, groups: id, 160
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 12.6513 0.1506 152.9599 84.000 <2e-16 ***
ses 2.1903 0.1218 178.2055 17.976 <2e-16 ***
meanses 3.7812 0.3826 181.7675 9.883 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) ses
ses -0.080
meanses -0.028 -0.256
20.2 Obtaining Fixed Effects
R thinks of all models in reduced form. Thus when we get the fixed effects we get both the level-1 and level-2 fixed effects all together:
fixef( M1 )
(Intercept) ses meanses
12.651300 2.190350 3.781218
The above is a vector of numbers. Each element is named, but we can index them like so:
fixef( M1 )[2]
ses
2.19035
We can also use [[]], which means “give me that element not as a list, but as just the element!” When in doubt, if you want one thing out of a list or vector, use [[]] instead of []:
fixef( M1 )[[2]]
[1] 2.19035
See how it gives you the number without the name here?
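You can also index these vectors by name rather than by position, which is more robust if you later reorder the terms in the model:

```r
# indexing by name: same element as position 2
fixef( M1 )["ses"]    # keeps the name attached
fixef( M1 )[["ses"]]  # just the number
```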
20.3 Obtaining Variance and Covariance estimates
We can get the Variance-Covariance matrix of the random effects with VarCorr().
VarCorr( M1 )
Groups Name Std.Dev. Corr
id (Intercept) 1.64174
ses 0.67309 -0.212
Residual 6.06594
It displays nicely if you just print it out, but inside it are covariance matrices for each random effect group. (In our model we only have one group, id.) These matrices also have correlation matrices for reference. Here is how to get these pieces:
vc = VarCorr( M1 )$id
vc
(Intercept) ses
(Intercept) 2.6953203 -0.2339045
ses -0.2339045 0.4530494
attr(,"stddev")
(Intercept) ses
1.6417431 0.6730894
attr(,"correlation")
(Intercept) ses
(Intercept) 1.0000000 -0.2116707
ses -0.2116707 1.0000000
You might be wondering what all the attr stuff is. R can “tack on” extra information to a variable via “attributes”. Attributes are not part of the variable exactly, but they follow their variable around. The attr (for attribute) method is a way to get these extra bits of information. In the above, R is tacking the correlation matrix onto the variance-covariance matrix to save you the trouble of calculating it yourself. Get it as follows:
attr( vc, "correlation" )
(Intercept) ses
(Intercept) 1.0000000 -0.2116707
ses -0.2116707 1.0000000
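If the attribute were not there, we could compute the same correlation matrix ourselves from the covariance matrix with base R's cov2cor():

```r
# convert the variance-covariance matrix of the random effects
# into a correlation matrix by hand
cov2cor( as.matrix( vc ) )
```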
You can also just use the vc object as a matrix. Here we take the diagonal of it:
diag( vc )
(Intercept) ses
2.6953203 0.4530494
If you want an element from a matrix use row-column indexing like so:
vc[ 1, 2 ]
[1] -0.2339045
for row 1 and column 2.
20.3.0.1 The sigma.hat() and sigma() methods
If you just want the variances and standard deviations of your random effects, use sigma.hat(). This also gives you the residual standard deviation. The output is a weird object: a list of things that are themselves lists. Let's examine it. First we look at what the whole thing is:
sigma.hat( M1 )
$sigma
$sigma$data
[1] 6.065939
$sigma$id
(Intercept) ses
1.6417431 0.6730894
$cors
$cors$data
[1] NA
$cors$id
(Intercept) ses
(Intercept) 1.0000000 -0.2116707
ses -0.2116707 1.0000000
names( sigma.hat( M1 ) )
[1] "sigma" "cors"
sigma.hat( M1 )$sigma
$data
[1] 6.065939
$id
(Intercept) ses
1.6417431 0.6730894
Our standard deviations of the random effects are:
sigma.hat( M1 )$sigma$id
(Intercept) ses
1.6417431 0.6730894
We can get our residual standard deviation by this weird thing (we are getting data from the sigma inside of sigma.hat( M1 )):
sigma.hat( M1 )$sigma$data
[1] 6.065939
But here is an easier way using the sigma() utility function:
sigma( M1 )
[1] 6.065939
20.4 Obtaining Empirical Bayes Estimates of the Random Effects
Random effects come out of the ranef() method. Each random effect is its own object inside the returned object. You refer to these sets of effects by name. Here our random effect is called id.
ests = ranef( M1 )$id
head( ests )
(Intercept) ses
1224 -0.26204371 0.08765385
1288 0.03805199 0.11841938
1296 -1.91525901 0.03572247
1308 0.30485682 -0.10500515
1317 -1.15834807 -0.10815301
1358 -0.98212459 0.44612877
Generally, what you get back from these calls is a new data frame with a row for each group. The rows are named with the original id codes for the groups, but if you want to connect it back to your group-level information you are going to want to merge stuff. To do this, and to keep things organized, I recommend adding the id as a column to your dataframe:
names(ests) = c( "u0", "u1" )
ests$id = rownames( ests )
head( ests )
u0 u1 id
1224 -0.26204371 0.08765385 1224
1288 0.03805199 0.11841938 1288
1296 -1.91525901 0.03572247 1296
1308 0.30485682 -0.10500515 1308
1317 -1.15834807 -0.10815301 1317
1358 -0.98212459 0.44612877 1358
We also renamed the columns of our dataframe to give them names nicer than (Intercept). You can still use the original names if you wish, however; you just need to quote them with back ticks (this code is not run):
head( ests$`(Intercept)` )
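With the id column in place, we can then attach school-level covariates by merging with the school data. A sketch (not run here; depending on how the files were read, the two id columns may first need converting to a common type):

```r
# attach school-level information (e.g., sector, meanses) to the EB estimates
ests = merge( ests, sdat, by="id" )
```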
20.4.1 The coef() method
We can also get a slightly different (but generally easier to use) version of these things through coef(). What coef() does is give you the estimated regression line for each group in your data by combining the random effects for each group with the corresponding fixed effects. Note how in the following the meanses coefficient is the same for every school, but the others vary due to the random slope and random intercept.
coefs = coef( M1 )$id
head( coefs )
(Intercept) ses meanses
1224 12.38926 2.278004 3.781218
1288 12.68935 2.308769 3.781218
1296 10.73604 2.226072 3.781218
1308 12.95616 2.085345 3.781218
1317 11.49295 2.082197 3.781218
1358 11.66918 2.636479 3.781218
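For an lmer model, coef() is simply the fixed effects plus the Empirical Bayes random effect estimates. We can check this for the intercepts:

```r
# fixed intercept plus each school's intercept residual should
# reproduce the first column of coef( M1 )$id
head( fixef( M1 )[["(Intercept)"]] + ranef( M1 )$id[ , "(Intercept)" ] )
```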
Note that if we have level-2 covariates in our model, they are not incorporated in the intercept and slope via coef(). We have to do that by hand:
names( coefs ) = c( "beta0.adj", "beta.ses", "beta.meanses" )
coefs$id = rownames( coefs )
coefs = merge( coefs, sdat, by="id" )
coefs = mutate( coefs, beta0 = beta0.adj + beta.meanses * meanses )
coefs$beta.meanses = NULL
Here we added in the impact of mean ses to the intercept (as specified by our model). Now if we look at the intercepts (the beta0 variable) they will incorporate the level-2 covariate effects. If we then plotted a line using beta0 and beta.ses for each school, we would get the estimated lines for each school, including the school-level covariate impacts.
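One way to draw those school-specific lines is with ggplot2's geom_abline(). This is a sketch; the axis limits are arbitrary choices:

```r
# one line per school, using the school-specific intercept and slope
ggplot( coefs ) +
  geom_abline( aes( intercept = beta0, slope = beta.ses ), alpha = 0.3 ) +
  xlim( -2, 2 ) + ylim( 0, 25 ) +
  labs( x = "ses", y = "predicted mathach" )
```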
20.5 Obtaining standard errors
We can get an object with all the standard errors of the coefficients, including those of the individual Empirical Bayes estimates of the random effects. This is a lot of information. We first look at the standard errors for the fixed effects, and then for the random effects. Standard errors for the variance terms are not given (these are trickier to calculate).
20.5.1 Fixed effect standard errors
ses = se.coef( M1 )
names( ses )
[1] "fixef" "id"
Our fixed effect standard errors:
ses$fixef
[1] 0.1506106 0.1218474 0.3826085
You can also get the uncertainty estimates of your fixed effects as a variance-covariance matrix:
vcov( M1 )
3 x 3 Matrix of class "dpoMatrix"
(Intercept) ses meanses
(Intercept) 0.022683560 -0.001465374 -0.001619405
ses -0.001465374 0.014846788 -0.011954182
meanses -0.001619405 -0.011954182 0.146389293
The standard errors are the square roots of the diagonal of this matrix. See how they line up?
sqrt( diag( vcov( M1 ) ) )
(Intercept) ses meanses
0.1506106 0.1218474 0.3826085
20.5.2 Random effect standard errors
Our random effect standard errors for our EB estimates:
head( ses$id )
(Intercept) ses
1224 0.7845859 0.5804186
1288 0.9819216 0.6277115
1296 0.7779963 0.5766319
1308 1.0911690 0.6556607
1317 0.8045695 0.6188535
1358 0.9163545 0.6173954
Warning: these come as a matrix, not a data frame. It is probably best to do this:
SEs = as.data.frame( se.coef( M1 )$id )
head( SEs )
(Intercept) ses
1224 0.7845859 0.5804186
1288 0.9819216 0.6277115
1296 0.7779963 0.5766319
1308 1.0911690 0.6556607
1317 0.8045695 0.6188535
1358 0.9163545 0.6173954
20.6 Generating confidence intervals
We can compute profile confidence intervals (warnings have been suppressed):
confint( M1 )
2.5 % 97.5 %
.sig01 1.4012799 1.8897548
.sig02 -0.8761947 0.1946551
.sig03 0.2165284 0.9849953
.sigma 5.9659922 6.1689341
(Intercept) 12.3559620 12.9462385
ses 1.9512025 2.4296954
meanses 3.0278220 4.5329237
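Profile intervals can be slow to compute on bigger models. If you only need intervals for the fixed effects, confint() also offers faster Wald (normal-approximation) intervals; with this method the variance-component rows come back as NA:

```r
# Wald intervals: fast, but only informative for the fixed effects
confint( M1, method = "Wald" )
```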
20.7 Obtaining fitted values
Fitted values are the predicted values for each individual given the model:
yhat = fitted( M1 )
head( yhat )
1 2 3 4 5 6
7.290105 9.431429 9.568109 9.249189 10.410971 10.821011
Residuals are the differences between observed and predicted values:
resids = resid( M1 )
head( resids )
1 2 3 4 5 6
-1.4141055 10.2765710 10.7808908 -0.4681887 7.4870293 -6.2380113
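As a sanity check, the fitted values and residuals should add back up to the observed outcomes:

```r
# fitted + residual reproduces the observed mathach for each student
all.equal( fitted( M1 ) + resid( M1 ), dat$mathach, check.attributes=FALSE )
```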
We can also predict for hypothetical new data. Here we predict the outcome for a random student with ses of -1, 0, and 1 in a school with mean ses of 0:
ndat = data.frame( ses = c( -1, 0, 1 ), meanses=c(0,0,0), id = -1 )
predict( M1, newdata=ndat, allow.new.levels=TRUE )
1 2 3
10.46095 12.65130 14.84165
The allow.new.levels=TRUE bit says to predict for a new school (our fake school id of -1 in ndat above). In this case it assumes the new school is typical, with 0s for the random effect residuals.
If we predict for a current school, the random effect estimates are incorporated:
ndat$id = 1296
predict( M1, newdata=ndat )
1 2 3
8.509969 10.736041 12.962114
20.8 Appendix: the guts of the object
When we fit our model and store it in a variable, R stores a lot of stuff. The following lists some other functions that pull out bits and pieces of that stuff.
First, to get the model matrix (otherwise called the design matrix):
mm = model.matrix( M1 )
head( mm )
(Intercept) ses meanses
1 1 -1.528 -0.428
2 1 -0.588 -0.428
3 1 -0.528 -0.428
4 1 -0.668 -0.428
5 1 -0.158 -0.428
6 1 0.022 -0.428
This can be useful for predicting individual group mean outcomes, for example.
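For example, multiplying the model matrix by the fixed effects gives predictions based on the fixed effects alone (the same as predict( M1, re.form=NA )):

```r
# fixed-effects-only predictions: X %*% beta-hat
yhat.fixed = mm %*% fixef( M1 )
head( yhat.fixed )
```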
We can also ask questions such as the number of groups and the number of observations:
ngrps( M1 )
id
160
nobs( M1 )
[1] 7185
We can list all methods for the object (merMod is a more generic version of lmerMod, and has a lot of methods we can use):
class( M1 )
[1] "lmerModLmerTest"
attr(,"package")
[1] "lmerTest"
methods(class = "lmerMod")
[1] coerce coerce<- contest contest1D contestMD display
[7] getL mcsamp se.coef show sim standardize
see '?methods' for accessing help and source code
methods(class = "merMod")
[1] anova as.function coef confint cooks.distance
[6] deviance df.residual display drop1 extractAIC
[11] extractDIC family fitted fixef formula
[16] fortify getData getL getME hatvalues
[21] influence isGLMM isLMM isNLMM isREML
[26] logLik mcsamp model.frame model.matrix ngrps
[31] nobs plot predict print profile
[36] ranef refit refitML rePCA residuals
[41] rstudent se.coef show sigma.hat sigma
[46] sim simulate standardize summary terms
[51] update VarCorr vcov weights
see '?methods' for accessing help and source code