Event-study DiD with staggered treatment design
August 2, 2020
This is another post in my series of attempts to learn the data.table package for R and to get more comfortable with base R graphics. Today, I reimplement the data generating process and one of the models in this excellent blog post by Andrew Baker: How to create relative time indicators. I also strongly recommend watching Andrew’s talk on Difference-in-Differences staggered treatment designs.
For this exercise, we will only use two libraries:
As in the original post, the data generating process includes unit and time fixed effects, as well as unit-specific treatment effects.
$$y_{it}=\alpha_i + \alpha_t + \tau_{it} + \varepsilon_{it}$$ $$\alpha_i,\alpha_t \sim N(0,1)$$ $$\varepsilon_{it} \sim N(0,0.5)$$ $$\mu_{it} \sim N(0.3,0.2^2)$$
where $\tau_{it}$ is the cumulative sum of the firm-specific $\mu_{it}$ over every treated period up to and including $t$, and is zero before treatment starts.
We have 1000 firms split into 4 cohorts which receive treatment in 1986, 1992, 1998, and 2004.
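To make the accumulation concrete, here is the sum written out under two assumptions of mine: $g_i$ (a symbol the original does not use) denotes the first treated year of firm $i$, and $\mu$ does not vary over time within a firm, which is how the `mu` column behaves in the printed data further down.

$$\tau_{it} = \sum_{s=g_i}^{t} \mu_i = \mu_i\,(t - g_i + 1) \quad \text{for } t \ge g_i, \qquad \tau_{it} = 0 \quad \text{for } t < g_i$$

For example, a firm with $\mu_i = 0.2791671$ that is 13 years past its first treated year has accumulated $14 \times 0.2791671 \approx 3.908340$, which matches the `tau_cum` value printed for unit 1000 in 1999.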
{
# unit fixed effects
unit =
# year fixed effects
year =
# treatment groups by state
treat_taus =
# full interaction of unit X year
out =
out =
out =
out =
# error term, treatment indicator, and treatment effects
out
# cumulative treatment effects
out
# dependent variable
out
return(out)
}
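Here is a minimal sketch of a data generating process that follows the commented steps and the equations above. Several details are my assumptions and not taken from the text: the function name `make_data`, the 40 states, the 1980 to 2010 window, reading 0.5 as the standard deviation of the error, and the equal number of states per cohort.

```r
library(data.table)

# Hypothetical name and defaults: 40 states and 1980:2010 are assumptions
make_data = function(n_units = 1000, n_states = 40, years = 1980:2010) {
    # unit fixed effects, state membership, and firm-specific effect mu
    unit = data.table(
        unit = 1:n_units,
        unit_fe = rnorm(n_units, 0, 1),
        state = sample(1:n_states, n_units, replace = TRUE),
        mu = rnorm(n_units, 0.3, 0.2))
    # year fixed effects
    year = data.table(year = years, year_fe = rnorm(length(years), 0, 1))
    # treatment groups by state: same number of states in each cohort
    treat_taus = data.table(
        state = sample(1:n_states),
        cohort_year = rep(c(1986, 1992, 1998, 2004), each = n_states / 4))
    # full interaction of unit X year
    out = CJ(unit = 1:n_units, year = years)
    out = merge(out, unit, by = "unit")
    out = merge(out, year, by = "year")
    out = merge(out, treat_taus, by = "state")
    # error term, treatment indicator, and treatment effects
    out[, error := rnorm(.N, 0, 0.5)]
    out[, treat := fifelse(year >= cohort_year, 1, 0)]
    out[, tau := fifelse(treat == 1, mu, 0)]
    # cumulative treatment effects (sorted by unit and year first)
    setkey(out, unit, year)
    out[, tau_cum := cumsum(tau), by = unit]
    # dependent variable
    out[, dep_var := unit_fe + year_fe + tau_cum + error]
    return(out)
}

set.seed(1)
dat = make_data()
```

Assigning cohorts at the state level and then merging them onto firms means cohort sizes are only roughly equal, since firms are spread over states at random.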
Plot data from a single simulation
dat =
# empty plot window
# one line per firm
for (i in 1:1000) {
}
# means by treatment cohort
cohorts = dat
years <-
for (y in years) {
}
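The plotting steps can be sketched in base R graphics as follows. To keep the example self-contained, it draws on a small toy panel of 40 firms that I made up for illustration (cohorts assigned by firm, no year effects); the colors, line widths, and the dashed markers at each treatment date are also my choices, not necessarily those of the original figure.

```r
library(data.table)
set.seed(1)

# Toy stand-in for the simulated panel (hypothetical): 40 firms, 4 cohorts
dat = CJ(unit = 1:40, year = 1980:2010)
dat[, cohort_year := c(1986, 1992, 1998, 2004)[(unit - 1) %% 4 + 1]]
dat[, dep_var := rnorm(1) + 0.3 * pmax(0, year - cohort_year + 1) +
        rnorm(.N, 0, 0.5), by = unit]

# empty plot window
plot(NULL, xlim = range(dat$year), ylim = range(dat$dep_var),
     xlab = "Year", ylab = "Outcome")

# one thin grey line per firm
for (u in unique(dat$unit)) {
    lines(dat[unit == u, year], dat[unit == u, dep_var], col = "grey80")
}

# means by treatment cohort: one thick line and one treatment marker each
cohorts = dat[, .(dep_var = mean(dep_var)), by = .(cohort_year, year)]
years = sort(unique(cohorts$cohort_year))
for (k in seq_along(years)) {
    lines(cohorts[cohort_year == years[k], year],
          cohorts[cohort_year == years[k], dep_var], col = k, lwd = 3)
    abline(v = years[k], col = k, lty = 2)
}
```

Drawing the firm-level lines first and the cohort means last keeps the means visible on top of the grey background.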
# simulate data
dat =
# drop years from 2004 onward, so the 2004 cohort is never treated in the sample
dat = dat
# years since/to treatment
dat
>        state year unit    unit_fe        mu     year_fe cohort_year
>     1:    17 1980    1 -0.8350648 0.3592835  0.58993428        2004
>     2:    17 1981    1 -0.8350648 0.3592835 -0.05174812        2004
>     3:    17 1982    1 -0.8350648 0.3592835  0.61722359        2004
>     4:    17 1983    1 -0.8350648 0.3592835 -1.02623512        2004
>     5:    17 1984    1 -0.8350648 0.3592835 -0.04833219        2004
>    ---
> 23996:     2 1999 1000 -0.6987034 0.2791671  0.70941625        1986
> 23997:     2 2000 1000 -0.6987034 0.2791671 -1.32852478        1986
> 23998:     2 2001 1000 -0.6987034 0.2791671 -0.04398506        1986
> 23999:     2 2002 1000 -0.6987034 0.2791671  1.01043319        1986
> 24000:     2 2003 1000 -0.6987034 0.2791671  0.10484482        1986
>             error treat       tau  tau_cum     dep_var rel_year
>     1:  0.1527800     0 0.0000000 0.000000 -0.09235047      -24
>     2: -0.4113322     0 0.0000000 0.000000 -1.29814509      -23
>     3:  0.8993860     0 0.0000000 0.000000  0.68154483      -22
>     4: -0.6836207     0 0.0000000 0.000000 -2.54492061      -21
>     5: -0.2962390     0 0.0000000 0.000000 -1.17963595      -20
>    ---
> 23996:  1.0399516     1 0.2791671 3.908340  4.95900410       13
> 23997:  0.5727518     1 0.2791671 4.187507  2.73303045       14
> 23998:  0.2524643     1 0.2791671 4.466674  3.97644978       15
> 23999:  0.2064968     1 0.2791671 4.745841  5.26406762       16
> 24000: -0.1569725     1 0.2791671 5.025008  4.27417715       17
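The relative-time variable in the last column is just the calendar year minus the cohort's first treated year. Here is a small sketch on a toy panel with one firm per cohort (the toy data are hypothetical; the 2003 cutoff is read off the printed output, where the last year is 2003).

```r
library(data.table)

# Toy panel (hypothetical): one firm per cohort, years 1980 to 2010
dat = CJ(unit = 1:4, year = 1980:2010)
dat[, cohort_year := c(1986, 1992, 1998, 2004)[unit]]

# keep years up to 2003, so the 2004 cohort stays untreated throughout
dat = dat[year <= 2003]

# years since (>= 0) or until (< 0) the start of treatment
dat[, rel_year := year - cohort_year]

# range of relative years by cohort
dat[, .(first = min(rel_year), last = max(rel_year)), by = cohort_year]
```

The overall range runs from -24 (the 2004 cohort in 1980) to 17 (the 1986 cohort in 2003), the same endpoints that appear in the printed data.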
min_year = dat
max_year = dat
# identification requires dropping two dummies
# dat[, rel_year := fifelse(!rel_year %in% c(min_year, -1), as.character(rel_year), "omitted")][
# , rel_year := factor(rel_year)][
# , rel_year := relevel(rel_year, ref = "omitted")]
dat
# regression model
f = dep_var ~ rel_year | unit + year
mod =
# clean results
out =
out = out
out = out
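For reference, here is one way the estimation step could look. I am assuming `fixest::feols`, whose formula syntax matches `f` above, but the original library call is not shown, so treat that as a guess. The panel is a compact, hypothetical stand-in for the simulation (cohorts assigned by firm, not by state), and the two dropped dummies are handled with the factor recoding from the commented-out lines.

```r
library(data.table)
library(fixest)  # assumption: the estimator behind `mod` is not shown above
set.seed(1)

# Compact stand-in for the simulated panel (hypothetical helper code)
dat = CJ(unit = 1:1000, year = 1980:2003)
dat[, cohort_year := c(1986, 1992, 1998, 2004)[(unit - 1) %% 4 + 1]]
dat[, year_fe := rnorm(1), by = year]
dat[, `:=`(unit_fe = rnorm(1), mu = rnorm(1, 0.3, 0.2)), by = unit]
dat[, rel_year := year - cohort_year]
dat[, dep_var := unit_fe + year_fe + mu * pmax(0, rel_year + 1) +
        rnorm(.N, 0, 0.5)]

# identification requires dropping two dummies: pool the earliest relative
# year and the year before treatment into a single reference level
min_year = dat[, min(rel_year)]
dat[, rel_year := fifelse(rel_year %in% c(min_year, -1),
                          "omitted", as.character(rel_year))]
dat[, rel_year := relevel(factor(rel_year), ref = "omitted")]

# regression model: relative-year dummies with unit and year fixed effects
f = dep_var ~ rel_year | unit + year
mod = feols(f, data = dat)

# clean results: one row per relative year, sorted
out = data.table(term = names(coef(mod)), estimate = unname(coef(mod)))
out[, term := as.integer(sub("rel_year", "", term, fixed = TRUE))]
out = out[order(term)]
out[term %in% -5:5]
```

Because the reference level pools two pre-treatment periods, each reported coefficient is measured relative to a baseline where the true effect is zero.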
Plot results
# event study estimates
# truth
truth =
truth
>     term estimate
>  1:   -5      0.0
>  2:   -4      0.0
>  3:   -3      0.0
>  4:   -2      0.0
>  5:   -1      0.0
>  6:    0      0.3
>  7:    1      0.6
>  8:    2      0.9
>  9:    3      1.2
> 10:    4      1.5
> 11:    5      1.8
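The true path in this table follows from the data generating process: each treated year adds an effect with mean 0.3, so the expected cumulative effect $k$ years after treatment starts is $0.3 \times (k + 1)$, and it is zero before treatment. Here is a short sketch that rebuilds the table, with the -5 to 5 window taken from the printed output.

```r
library(data.table)

# expected cumulative effect by relative year: 0.3 per treated year so far
truth = data.table(term = -5:5)
truth[, estimate := fifelse(term >= 0, 0.3 * (term + 1), 0)]
truth
```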