This brief handout will walk through fixed effects and cluster robust standard errors, with a stop at aggregation and heteroskedastic standard errors. We do regression using lm_robust() from the estimatr package.
Consider the following research question as our motiviating question:
RQ: Are there differences in math achievement for Catholic vs public schools, controlling for differences in SES?
36.1 Aggregation
One way forward is to aggregate our HS&B data and merge it into our school-level data, and then analyze the result.
We aggregate as so:
col.dat = dat %>%group_by( id ) %>%summarize( per.fem =mean(female),per.min =mean(minority),mean.ses =mean(ses),mean.ach =mean(mathach),n.stud =n() )# combine our school-level variables (ours and theirs) into one data.framesdat =merge( sdat, col.dat, by="id", all=TRUE )head( sdat )
The lm_robust() method gives heteroskedastic robust standard errors that take into account possible heteroskedasticity due to, for example, some school outcomes being based on smaller numbers of students (and thus having more variation) than other school outcomes.
In this regression we are controlling for school mean SES, not student SES. If anything is going on within school between SES and math achievement, in a way that could be different for different sectors, we might be missing it.
36.2 Student level regression and cluster robust SES
Instead of using our aggregated data, we can merge our school-level variables into the student data and run a student level regression:
The merge brings in level 2 variables, repeating them for each student in a school:
And now our regression (without handling our clustering, so this is giving us wrong standard errors):
Mstud =lm( mathach ~1+ sector + ses, data = dat )summary( Mstud )
Call:
lm(formula = mathach ~ 1 + sector + ses, data = dat)
Residuals:
Min 1Q Median 3Q Max
-20.1339 -4.6785 0.1504 4.8894 16.5847
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.79325 0.10610 111.15 <2e-16 ***
sector 1.93501 0.15249 12.69 <2e-16 ***
ses 2.94856 0.09783 30.14 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 6.345 on 7182 degrees of freedom
Multiple R-squared: 0.1492, Adjusted R-squared: 0.149
F-statistic: 629.8 on 2 and 7182 DF, p-value: < 2.2e-16
36.2.1 Getting the right standard errors
The standard errors for the above regression, however, is wrong: we are not taking the clustering into account. We can fix this with cluster-robust standard errors. The lm_robust() method comes to the rescue:
Call:
lm_robust(formula = mathach ~ 1 + sector + ses, data = dat, clusters = dat$id)
Standard error type: CR2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 11.793 0.2038 57.854 1.608e-69 11.388 12.199 84.12
sector 1.935 0.3185 6.076 1.082e-08 1.305 2.565 141.46
ses 2.949 0.1285 22.951 4.478e-48 2.694 3.203 132.91
Multiple R-squared: 0.1492 , Adjusted R-squared: 0.149
F-statistic: 351.6 on 2 and 159 DF, p-value: < 2.2e-16
We specify the clustering and lm_robust() does the rest; note that we would normally not even run the original lm() command. This replaces it.
For our research question, we see that Catholic schools score 2 points higher than Public, on average, beyond individual level SES.
We can further control for school mean SES, like with aggregation:
Mstud2 =lm_robust( mathach ~1+ sector + ses + meanses, data = dat,cluster = dat$id )summary( Mstud2 )
Call:
lm_robust(formula = mathach ~ 1 + sector + ses + meanses, data = dat,
clusters = dat$id)
Standard error type: CR2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 12.098 0.1712 70.654 6.309e-75 11.7573 12.439 81.55
sector 1.280 0.3022 4.237 5.240e-05 0.6804 1.880 95.16
ses 2.191 0.1299 16.874 1.246e-35 1.9344 2.448 140.83
meanses 2.973 0.3683 8.072 8.367e-12 2.2391 3.706 75.50
Multiple R-squared: 0.17 , Adjusted R-squared: 0.1696
F-statistic: 246.2 on 3 and 159 DF, p-value: < 2.2e-16
The contextual value of school mean SES is explaining some of the difference between Catholic and public schools, here: note the reduction of the coefficient for sector. That being said, and still accounting for clustering, sector is still quite significant. The lm_robust() function is also giving us confidence intevals, which is nice: we see anything between 0.7 and 1.9 is possible.
Relative to the overall standard deviation of math achievement we have:
The difference between Catholic and public schools is somewhere between 0.10 and 0.27 standard deviations, beyond what can be explained by ses. This is a fairly sizable effect, in education.
36.3 And fixed effects?
We can combine fixed effects and cluster robust standard errors quite easily, but we cannot combine fixed effects and level 2 covariates at all. We next look at this latter problem, and then see what combining these options looks like when asking questions that do not rely on level 2 variables for main effects.
36.3.1 The problem of fixed effects and dummy variables
Fixed effects cannot be used to take into account school differences if we are interested in level 2 variables, because the fixed effects and level 2 variables are co-linear. Put another way, if we let each school have its own mean outcome (represented by the coefficient for a dummy variable for that school), then we can’t have a variable like sector to measure how Catholic schools are different from public schools, conditioned on all the school mean outcomes. There is nothing left to explain as, by construction, there are no differences in school mean outcomes once we “control for” the individual school mean outcomes via fixed effects!
What R will do when you give colinear variables is drop the extra ones. Here is a mini-example fake dataset of 4 schools with 3 students in each school:
And our regression model with fixed effects for school plus our school-level ses gives this:
lm( mathach ~0+ id + ses + sector, data = fake )
Call:
lm(formula = mathach ~ 0 + id + ses + sector, data = fake)
Coefficients:
id1 id2 id3 id4 ses sector
0.04004 0.25226 -0.41593 0.25026 0.58576 NA
Note the NA for sector! We cannot estimate it due to colinearity, so it got dropped.
36.3.2 Using fixed effects to handle clustering
That being said, fixed effects are an excellent way to control for school differences when looking at within-school relationships. For example, we can ask how math relates to SES within schools, controlling for systematic differences across schools.
Here is the no fixed effect regression, and the fixed effect regression:
For our fixed effect model, we will have lots of coefficients because we have a fixed effect for each school; the head() command is just showing us the first few. We also had to explicitly make our id variable a factor (categorical variable), so R doesn’t think it is a continuous covariate.
For our standard errors, etc., we can further account for clustering of our residuals above and beyond what can be explained by our fixed effects (even if we subtract out the mean outcome, we might still have dependencies between students within a given school). So we use our cluster-robust standard errors as so:
==================================================
No FE FE FE + CRVE
--------------------------------------------------
(Intercept) 12.75 ***
(0.08)
ses 3.18 *** 2.19 *** 2.19 ***
(0.10) (0.11) (0.13)
--------------------------------------------------
R^2 0.13 0.83 0.83
Adj. R^2 0.13 0.82 0.82
Num. obs. 7185 7185 7185
RMSE 6.08
N Clusters 160
==================================================
*** p < 0.001; ** p < 0.01; * p < 0.05
A few things to note:
Not having fixed effects means we are getting an estimate of the math-ses relationship including school level context. Note the higher point estimate. Often we want to focus on within-school relationships. Fixed effects does this.
The standard errors are larger once we include fixed effects; the fixed effects are partially accounting for clustering.
The standard errors are even larger when we include CRVE. It is more fully accounting for the clustering, and the fact that the clusters themselves could vary. In general, one should typically use CRVE in addition to fixed effects, if one wants to view the clusters as representative of a larger population (in this case a larger population of schools).
36.3.3 Bonus: Interactions with level-2 variables are OK, even with fixed effects
If we want to see if the relationship of math and SES is different between schools, we can get tricky like so:
Mstud4 =lm_robust( mathach ~0+ ses + ses:sector + id, data=dat,cluster = id )head( coef( Mstud4 ) )