This document demonstrates different ways of making regression tables in your reports, and discusses some wrinkles that arise when using these tools with multilevel models.
7.1 The basics of regression tables
For the basics we quickly illustrate regression tables using a subset of the Making Caring Common dataset, which we will eventually discuss in class. This dataset has a measure of emotional safety (our outcome) and we want to see, in a specific school, if this is predicted by gender and/or grade.
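As a sketch of what this single-level analysis might look like (gender, grade, and dat.school are illustrative, assumed names; the actual columns in the dataset may differ):

# Hypothetical single-school regression of emotional safety on gender and grade.
# NOTE: 'gender', 'grade', and 'dat.school' are illustrative names, not
# necessarily the actual variables in the Making Caring Common data.
M.basic <- lm( esafe ~ gender + grade, data = dat.school )
summary( M.basic )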
For our multilevel examples, we use the Making Caring Common data from Project A, fitting models to the 8th grade students only, but across all schools. We have made a High School dummy variable, HS.
The two models we use for demo purposes differ in whether they include the HS term, as sketched below:
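A sketch of the two fits (Model B's formula matches the summary output shown in the next section; Model A simply drops the HS term):

library( lme4 )

# Model A: random-intercept model with no fixed-effect predictors
modA <- lmer( esafe ~ 1 + (1|ID), data = dat.g8 )

# Model B: adds the High School dummy variable
modB <- lmer( esafe ~ 1 + HS + (1|ID), data = dat.g8 )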
In the next sections we first show how to get better summary output (according to some folks) and then we walk through making regression tables in a bit more detail than above.
7.3 Getting p-values for lmer output
The lmerTest package is a way of making R give you more complete output. We are going to load it, and then put the new lmer models into new variables so we can see how the different model fitting packages work with the regression table packages below.
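For example (a sketch re-fitting the models from above, with a ".T" suffix to mark the lmerTest versions; these are the objects used with tab_model() later):

library( lmerTest )

# Same models as before, but now fit with lmerTest's lmer()
modA.T <- lmer( esafe ~ 1 + (1|ID), data = dat.g8 )
modB.T <- lmer( esafe ~ 1 + HS + (1|ID), data = dat.g8 )

summary( modB.T )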
Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: esafe ~ 1 + HS + (1 | ID)
   Data: dat.g8

REML criterion at convergence: 2746.8

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.3883 -0.6156  0.2021  0.7628  1.7331

Random effects:
 Groups   Name        Variance Std.Dev.
 ID       (Intercept) 0.04809  0.2193
 Residual             0.46459  0.6816
Number of obs: 1305, groups:  ID, 26

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept)  3.52798    0.08637 29.91033  40.846   <2e-16 ***
HSTRUE      -0.29480    0.10787 25.77814  -2.733   0.0112 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
       (Intr)
HSTRUE -0.801
7.4 The texreg package
The texreg package has two primary functions for table making: screenreg() and texreg().
7.4.1 Using screenreg()
Screenreg is fine for MLMs. It looks a bit like raw output, but it is clear and clean. It will take models fit using lmer or lmerTest, no problem.
screenreg(list(modA,modB))
===============================================
                     Model 1      Model 2
-----------------------------------------------
(Intercept)             3.35 ***     3.53 ***
                       (0.06)       (0.09)
HSTRUE                              -0.29 **
                                    (0.11)
-----------------------------------------------
AIC                  2756.78      2754.79
BIC                  2772.30      2775.49
Log Likelihood      -1375.39     -1373.40
Num. obs.            1305         1305
Num. groups: ID        26           26
Var: ID (Intercept)     0.07         0.05
Var: Residual           0.46         0.46
===============================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Comment: Note that the number of stars is different in this display vs. the summary output! (Look at the HS coefficient, for example.) Not good, it would seem.
This is because the \(p\)-values are calculated using the normal approximation by the screenreg command, but using the \(t\)-test with approximate degrees of freedom by lmerTest. This makes a difference. Consider the following, using the \(t\) statistic for the HS variable:
2*pt( -2.733, df=25.77814 )
[1] 0.0111831
2*pnorm( -2.733 )
[1] 0.006276033
One is below 0.01, and one is not. An extra star!
7.4.2 Using texreg() and TeX
The texreg command is part of the texreg package and can be integrated with LaTeX (which you would need to install). Once you do this, the tables render properly when you compile to a PDF. In the R code chunk you need to include results="asis" to get the LaTeX to compile right, i.e., declare the chunk with {r, results="asis"}.
texreg(list(modA,modB), table=FALSE)
Note that the table=FALSE puts the table right where you want it, not at some random spot LaTeX thinks is nice. LaTeX likes to have "floating tables," where it places the table wherever there is space; this makes it easier to make the entire formatted page look nice.
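Putting it together, the chunk in your R Markdown file might look like this (a sketch; the results="asis" option is what lets the generated LaTeX pass through untouched):

```{r, results="asis"}
texreg( list( modA, modB ), table = FALSE )
```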
7.5 The stargazer package
One issue is that stargazer does not include the random effect variances, so the output is quite limited for multilevel modeling. It also has less stringent conditions for when to put down stars: one star is below 0.10, two are below 0.05, and three are below 0.01. This is quite generous. It also uses the normal approximation.
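For reference, a minimal stargazer call on the original lme4 fits might look like this (a sketch; type = "text" prints a plain-text table to the console rather than emitting LaTeX):

library( stargazer )

# Plain lme4 fits (class lmerMod) work directly
stargazer( modA, modB, type = "text" )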
7.5.1 Stargazer with lmerTest
Stargazer with lmerTest is a bit fussy. This shows how to make it work if you have loaded the lmerTest package. Recall the lmerTest package makes your lmer fits report p-values and whatnot. But this means your new lmer() output is not quite the same class as the old, and stargazer is expecting the old. You fix this by lying to R, telling it the new thing is the old thing. This basically works.
Now for stargazer, we need to tell it that our models are the right type. First note the class of an lmerTest fit, and then overwrite it:
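A sketch of the trick (the copies modA.s and modB.s are illustrative names, used here so the original lmerTest fits remain intact for tab_model() below):

class( modA.T )   # "lmerModLmerTest" -- not the class stargazer expects

# Make copies and relabel them as plain lmer fits
modA.s <- modA.T
modB.s <- modB.T
class( modA.s ) <- "lmerMod"
class( modB.s ) <- "lmerMod"

stargazer( modA.s, modB.s, type = "text" )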
Finally, the tab_model function from the sjPlot package also makes nice regression tables:
# tabulate the results of our two models
library( sjPlot )
tab_model( modA.T, modB.T )
                               esafe                           esafe
Predictors         Estimates   CI              p      Estimates   CI               p
(Intercept)        3.35        3.23 – 3.46     <0.001  3.53       3.36 – 3.70      <0.001
HSTRUE                                                -0.29       -0.51 – -0.08    0.006
Random Effects
σ²                 0.46                                0.46
τ00 (ID)           0.07                                0.05
ICC                0.12                                0.09
N (ID)             26                                  26
Observations       1305                                1305
Marginal R² / Conditional R²   0.000 / 0.125           0.028 / 0.119
7.6 Pretty ANOVA (likelihood ratio test) Tables
We now turn to how to give a nice display of likelihood ratio test results using the kable function. To show how to make such tables, we first create a fake illustrative dataset called a that has 100 observations and specifies our outcome Y as a function of two uncorrelated variables A and B:
a <- tibble( A = rnorm( 100 ),
             B = rnorm( 100 ),
             Y = A * 0.2 + B * 0.5 + rnorm( 100, 0, 1 ) )
7.6.1 Run the Models
We fit two models, one with A and B, the other with just A.
M1 <- lm( Y ~ A + B, data = a )
M2 <- lm( Y ~ A, data = a )
7.6.2 Comparing the Models
We use the anova function to compare the two models (see also the chapter on Likelihood Ratio tests). We see that B improves the model fit significantly.
aa = anova( M2, M1 )
aa
Analysis of Variance Table

Model 1: Y ~ A
Model 2: Y ~ A + B
  Res.Df    RSS Df Sum of Sq     F    Pr(>F)
1     98 130.54
2     97 109.03  1    21.503 19.13 3.074e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
aa |> tidy() |> kable()
term        df.residual   rss        df   sumsq      statistic   p.value
Y ~ A       98            130.5386   NA   NA         NA          NA
Y ~ A + B   97            109.0352   1    21.50338   19.12986    3.07e-05
7.6.3 Compare to the Significance test on B
Note that the p-value for B is identical to the ANOVA result above. Why bother with ANOVA, then? Because it can test more complex hypotheses as well (multiple coefficients, random effects, etc.); see the sketch after the table below.
M1 |> tidy() |> kable()
term          estimate    std.error   statistic   p.value
(Intercept)   0.1331894   0.1063276   1.252633    0.2133507
A             0.2320947   0.1037421   2.237229    0.0275605
B             0.4375799   0.1000464   4.373769    0.0000307
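As an illustration of the "more complex hypotheses" point, the same anova() machinery applies to the multilevel fits from earlier in the chapter (a sketch; for lmer models, anova() refits REML fits with maximum likelihood before running the likelihood ratio test):

# Likelihood ratio test for the HS term in the multilevel model
anova( modA, modB )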
7.6.4 Acknowledgements
The ANOVA portion of this chapter was initially drafted by Josh Gilbert.