29 Likelihood Ratio Tests

Author

Josh Gilbert

In this chapter we give an overview of using likelihood ratio tests to assess whether a more complex model is justified by the data, as compared to a simpler model.

We will use our old friend HS&B to illustrate. We first load the data:

# load libraries
library(tidyverse)
library(lme4)
library(haven)
library(sjPlot)

# load HSB data
hsb <- read_dta("data/hsb.dta") |> 
  select(mathach, ses, schoolid)

29.1 Why LR Tests?

Our fixed effects coefficients have SEs, z-statistics, and p-values, which allow us to easily test the null hypothesis that the slopes are 0 in the population. No such quantities, however, are provided for the random effects of our model. We can use LR tests to address this issue and test the statistical significance of the various random portions of our model.

We can also use LR tests on fixed effects or sets of fixed effects (like a nested F-test in OLS), but 99.9% of the time, the conclusion will be the same as using the z-statistics.

LR tests require that the models are nested, meaning that they use the same data, and one model can be expressed as a constrained version of the other.

29.2 HSB Example

We fit 3 models:

Random intercept model
Random slope model with no correlation between intercepts and slopes (you probably have not seen this before, but we will use it to test whether the slopes are correlated with the intercepts).
Random slope model

We can see from the model output that the point estimates for the random slope variance \(\tau_{11}\) and the correlation \(\rho_{01}\) are non-zero, but how can we get p-values for these quantities?

m1 <- lmer(mathach ~ ses + (1|schoolid), hsb)
m2 <- lmer(mathach ~ ses + (1|schoolid)+(0+ses|schoolid), hsb)
m3 <- lmer(mathach ~ ses + (1+ses|schoolid), hsb)

tab_model(m1, m2, m3,
          p.style = "stars",
          show.se = TRUE,
          show.ci = FALSE,
          dv.labels = c("RI", "No Rho", "RS"))

	RI		No Rho		RS
Predictors	Estimates	std. Error	Estimates	std. Error	Estimates	std. Error
(Intercept)	12.66 ^***	0.19	12.65 ^***	0.19	12.67 ^***	0.19
ses	2.39 ^***	0.11	2.40 ^***	0.12	2.39 ^***	0.12
Random Effects
σ²	37.03		36.82		36.83
τ₀₀	4.77 _schoolid		4.85 _schoolid		4.83 _schoolid
τ₁₁			0.42 _schoolid.ses		0.41 _schoolid.ses
ρ₀₁					-0.11 _schoolid
ICC	0.11		0.12		0.12
N	160 _schoolid		160 _schoolid		160 _schoolid
Observations	7185		7185		7185
Marginal R² / Conditional R²	0.077 / 0.182		0.077 / 0.185		0.077 / 0.189
* p<0.05 p<0.01 * p<0.001

29.2.1 Are random `ses` slopes necessary?

We use anova to perform the LR test comparing m1 and m3 to see if we need random slopes. We see that the random slopes are not statistically significant.

anova(m1, m3)

Data: hsb
Models:
m1: mathach ~ ses + (1 | schoolid)
m3: mathach ~ ses + (1 + ses | schoolid)
   npar   AIC   BIC logLik deviance  Chisq Df Pr(>Chisq)
m1    4 46649 46677 -23320    46641                     
m3    6 46648 46690 -23318    46636 4.5354  2     0.1035

29.2.2 Is there a correlation between the random intercept and slope for `ses`?

We next can compare model 2 and model 3 to see if the correlation is needed, given the random slope model. We see it is not:

anova(m2, m3)

Data: hsb
Models:
m2: mathach ~ ses + (1 | schoolid) + (0 + ses | schoolid)
m3: mathach ~ ses + (1 + ses | schoolid)
   npar   AIC   BIC logLik deviance  Chisq Df Pr(>Chisq)
m2    5 46647 46681 -23318    46637                     
m3    6 46648 46690 -23318    46636 0.2762  1     0.5992

29.3 Technical Notes

TL/DR: The traditional LR test provided by anova is likely to be conservative for testing the significance of variance components. For the purposes of this course, it is fine.

There is a lot of multilevel literature arguing that testing a null hypothesis on variance components with LR tests is not the best approach. The reason is that variances cannot be negative, so the null hypothesis exists on the “boundary of the parameter space” and therefore is “likely to be conservative” (to use the warning that Stata gives you, i.e., the p-values are too high). The true distribution of a 0 variance component is not a normal distribution, but a mixture distribution with half of the probability mass at 0 and the other half \(\chi^2\). When you’re testing the significance of the random intercepts model, you can divide the p-value by 2 to get the right answer (Stata default), though for more complex models it’s not so simple. There are some R packages that use simulation-based approaches to provide more robust results such as pbkrtest::PBmodcomp, but we won’t go into them here. See RH&S pp. 88-89 for a more thorough discussion of this issue. Despite this, the standard LR test remains common in practice.

29.1 Why LR Tests?

29.2 HSB Example

29.2.1 Are random ses slopes necessary?

29.2.2 Is there a correlation between the random intercept and slope for ses?

29.3 Technical Notes

29.2.1 Are random `ses` slopes necessary?

29.2.2 Is there a correlation between the random intercept and slope for `ses`?