library(tidyverse)
library(knitr)
library(broom)
8 Pretty ANOVA Tables with kable
8.1 R Setup
We load the tidyverse, knitr, and broom. The kable function from knitr makes our tables look nice, and the tidy function from broom turns model output into a data frame that kable can print.
8.2 Create fake data
We create a data set called a that has 100 observations and specifies our outcome Y as a function of two uncorrelated variables, A and B.
a <- tibble( A = rnorm( 100 ),
             B = rnorm( 100 ),
             Y = A * 0.2 + B * 0.5 + rnorm( 100, 0, 1 ) )
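Because rnorm draws fresh values on every run, the numbers printed in this chapter will not reproduce exactly. A minimal sketch, assuming an arbitrary seed (the value 1234 is our choice, not the one behind the output shown in this chapter), fixes the draws and checks that A and B are only weakly correlated in the sample:

```r
library(tibble)

set.seed(1234)  # arbitrary seed (an assumption), not the seed behind this chapter's output

n <- 100
a <- tibble( A = rnorm( n ),
             B = rnorm( n ),
             Y = A * 0.2 + B * 0.5 + rnorm( n, 0, 1 ) )

# A and B are drawn independently, so their sample correlation should be small,
# though not exactly zero in a sample of 100
r_AB <- cor( a$A, a$B )
r_AB
```

With a seed in place, rerunning the chapter gives the same coefficient estimates and ANOVA table each time.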
8.3 Run the Models
We fit two models, one with A and B, the other with just A.
M1 <- lm( Y ~ A + B, data = a )
M2 <- lm( Y ~ A, data = a )
8.4 Comparing the Models
We use the anova function to compare the two models (see also the chapter on Likelihood Ratio tests). Adding B improves the model fit significantly (F = 40.67 on 1 and 97 degrees of freedom, p < 0.001).
aa = anova( M2, M1 )
aa
Analysis of Variance Table
Model 1: Y ~ A
Model 2: Y ~ A + B
  Res.Df     RSS Df Sum of Sq     F    Pr(>F)
1     98 133.925
2     97  94.361  1    39.563 40.67 6.125e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
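The F statistic in this table can be recovered from the two residual sums of squares: it is the drop in RSS per added parameter, divided by the residual variance of the larger model. A small sketch using the rounded values printed above (the variable names are ours, chosen for illustration):

```r
# Values copied from the ANOVA table above
rss_small <- 133.925   # RSS of Y ~ A
rss_full  <- 94.361    # RSS of Y ~ A + B
df_full   <- 97        # residual degrees of freedom of the larger model
df_diff   <- 1         # number of extra parameters in the larger model

# F = (drop in RSS per extra parameter) / (residual variance of the larger model)
F_stat <- ( ( rss_small - rss_full ) / df_diff ) / ( rss_full / df_full )

# The upper tail of the F distribution gives the p-value
p_val <- pf( F_stat, df_diff, df_full, lower.tail = FALSE )

round( F_stat, 2 )
signif( p_val, 3 )
```

Up to rounding of the inputs, both values should match the F and Pr(>F) columns of the table.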
aa |>
  tidy() |>
  kable()
term | df.residual | rss | df | sumsq | statistic | p.value |
---|---|---|---|---|---|---|
Y ~ A | 98 | 133.92466 | NA | NA | NA | NA |
Y ~ A + B | 97 | 94.36139 | 1 | 39.56327 | 40.66957 | 0 |
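The p-value prints as 0 in this table only because it is rounded for display; the console output above shows it is 6.125e-09. One option, offered as a sketch rather than as the chapter's own approach, is to convert the column to text with base R's format.pval before calling kable. To keep the example self-contained, the data frame below is typed in by hand from the output above instead of being produced by tidy:

```r
library(knitr)

# Hand-copied version of the tidy() output above (an illustration, not new results)
tab <- data.frame(
  term        = c( "Y ~ A", "Y ~ A + B" ),
  df.residual = c( 98, 97 ),
  rss         = c( 133.92466, 94.36139 ),
  df          = c( NA, 1 ),
  sumsq       = c( NA, 39.56327 ),
  statistic   = c( NA, 40.66957 ),
  p.value     = c( NA, 6.125e-09 )
)

# format.pval() reports anything below eps as "<eps" instead of rounding it to 0
tab$p.value <- format.pval( tab$p.value, digits = 3, eps = 0.001 )

# digits = 2 rounds the remaining numeric columns
kable( tab, digits = 2 )
```

In a real pipeline the same formatting step could sit between tidy and kable.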
8.5 Compare to the Significance test on B
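Note that the p-value for B is identical to the ANOVA result above. When two models differ by a single coefficient, the F statistic is the square of that coefficient's t statistic (here 6.377² ≈ 40.67), so the two tests agree. Why bother with ANOVA? It can also test more complex hypotheses, such as several coefficients at once or random effects.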
M1 |>
  tidy() |>
  kable()
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 0.0029046 | 0.0993348 | 0.0292405 | 0.9767329 |
A | 0.2019888 | 0.0969365 | 2.0837227 | 0.0398145 |
B | 0.7062818 | 0.1107499 | 6.3772698 | 0.0000000 |
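Both points of this section can be checked on simulated data: for one added coefficient the F statistic equals the squared t statistic, and for several added coefficients anova gives a single joint test. This is a sketch under our own assumptions: the seed is arbitrary, and C is a hypothetical third predictor (with no true effect) that does not appear elsewhere in this chapter.

```r
set.seed(42)  # arbitrary seed (an assumption)

n <- 100
d <- data.frame( A = rnorm( n ), B = rnorm( n ), C = rnorm( n ) )  # C is hypothetical
d$Y <- 0.2 * d$A + 0.5 * d$B + rnorm( n )

M_small <- lm( Y ~ A, data = d )
M_mid   <- lm( Y ~ A + B, data = d )
M_big   <- lm( Y ~ A + B + C, data = d )

# One added coefficient: F equals the squared t statistic for B
F_one <- anova( M_small, M_mid )$F[2]
t_B   <- summary( M_mid )$coefficients[ "B", "t value" ]
all.equal( F_one, t_B^2 )

# Two added coefficients: one joint F test on 2 degrees of freedom,
# which no single row of the coefficient table provides
anova( M_small, M_big )
```

The last call asks whether B and C together improve on the model with A alone, which is the kind of hypothesis the coefficient table cannot answer directly.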