From 48016f8ecb1783ec82c67a856f1b2a965fbd4bef Mon Sep 17 00:00:00 2001
From: grantmcdermott
Date: Fri, 1 Dec 2023 21:12:01 +0000
Subject: [PATCH] =?UTF-8?q?Deploying=20to=20gh-pages=20from=20@=20grantmcd?=
 =?UTF-8?q?ermott/etwfe@6e006a6b43d705c31cad5b976cf74b9a6b42fe78=20?=
 =?UTF-8?q?=F0=9F=9A=80?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 articles/etwfe.html | 638 ++++++++++++++++++++++----------------------
 news/index.html     |   2 +-
 pkgdown.yml         |   2 +-
 search.json         |   2 +-
 4 files changed, 322 insertions(+), 322 deletions(-)

diff --git a/articles/etwfe.html b/articles/etwfe.html
index f7d5cf1..5337458 100644
--- a/articles/etwfe.html
+++ b/articles/etwfe.html
@@ -351,23 +351,23 @@

Presentation title = "Event study", notes = "Std. errors are clustered at the county level" ) -
- @@ -972,23 +972,23 @@

Heterogeneous treatment effects gof_map = NA, title = "Comparing the ATT on GLS and non-GLS counties" )

-
- @@ -1574,23 +1574,23 @@

Performance tips title = "Event study", notes = "Std. errors are clustered at the county level" )

-
-
@@ -2108,7 +2108,7 @@ Manual implementation
 mod$fml_all
 #> $linear
 #> lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003)/lpop_dm
-#> <environment: 0x559a99a58478>
+#> <environment: 0x5642bf413670>
 #>
 #> $fixef
 #> ~first.treat + first.treat[[lpop]] + year + year[[lpop]]

@@ -2145,23 +2145,23 @@

Manual implementation list("etwfe" = mod, "manual" = mod2), gof_map = NA # drop all goodness-of-fit info for brevity ) -
- @@ -2842,23 +2842,23 @@

Regarding fixed effects) modelsummary(mods, gof_map = NA)

-
- @@ -3863,23 +3863,23 @@

Regarding fixed effects coef_rename = rename_fn, gof_omit = "Adj|Within|IC|RMSE" )

-
- diff --git a/news/index.html b/news/index.html index cc6887c..eba2ab4 100644 --- a/news/index.html +++ b/news/index.html @@ -48,7 +48,7 @@
-

etwfe 0.3.5

+

etwfe 0.3.5

CRAN release: 2023-12-01

Internal

  • Update tests to match upstream changes to fixest.
  • diff --git a/pkgdown.yml b/pkgdown.yml index 0610861..28d92d1 100644 --- a/pkgdown.yml +++ b/pkgdown.yml @@ -3,7 +3,7 @@ pkgdown: 2.0.7 pkgdown_sha: ~ articles: etwfe: etwfe.html -last_built: 2023-12-01T07:45Z +last_built: 2023-12-01T21:11Z urls: reference: http://grantmcdermott.com/etwfe/reference article: http://grantmcdermott.com/etwfe/articles diff --git a/search.json b/search.json index 2571183..f679b21 100644 --- a/search.json +++ b/search.json @@ -1 +1 @@ -[{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"background","dir":"Articles","previous_headings":"","what":"Background","title":"Introduction to etwfe","text":"canonical research design social scientists -called “differences--differences” () design.1 classic 2x2 case (two units, two periods), simple interaction effect two dummy variables suffices recover treatment effect. base R might look something like: resulting coefficient Dtreated_unitTRUE:Dpost_treatmentTRUE interaction term represents treatment effect. Rather manually specify interaction term, practice researchers often use equivalent formulation known two-way fixed effects (TWFE). core idea TWFE can subsume interaction term previous code chunk adding unit time fixed effects. single treatment dummy can used capture effect treatment directly. TWFE regression base R might look follows: treatment effect now captured coefficient Dtreat dummy. TWFE shortcut especially nice complicated panel data settings multiple units multiple times periods. Speaking , prefer use dedicated fixed effects / panel data package like fixest, also estimate previous regression like : TWFE regressions easy run intuitive, long time everyone happy. good last. cottage industry clever research now demonstrates things quite simple. Among things, standard TWFE formulation can impose strange (negative) weighting conditions key parts estimation procedure. 
One implication risk high probability estimate bias presence staggered treatment rollouts, common real-life applications. Fortunately, just econometricians taking away one favourite tools, kind enough replace new ones. Among , proposed approach Wooldridge (2021, 2022) noteworthy. idea might paraphrased stating problem TWFE first place. Rather, ’s weren’t enough. Instead including single treatment × time interaction, Wooldridge recommends saturate model possible interactions treatment time variables, including treatment cohorts, well covariates. goes show approach actually draws equivalence different types estimators (pooled OLS, twoway Mundlak regression, etc.) ’s entirely clear call . Wooldridge refers general idea extended TWFE—, ETWFE—rather like package takes name. Wooldridge ETWFE solution intuitive elegant. also rather tedious error prone code manually. correctly specify possible interactions, demean control variables within groups, recover treatment effects interest via appropriate marginal effect aggregation. etwfe package aims simplify process providing convenience functions work .","code":"lm(y ~ Dtreated_unit * Dpost_treatment, data = somedata) lm(y ~ Dtreat + factor(id) + factor(period), data = somedata) library(fixest) feols(y ~ Dtreat | id + period, data = somedata)"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"dataset","dir":"Articles","previous_headings":"","what":"Dataset","title":"Introduction to etwfe","text":"demonstrate core functionality etwfe, ’ll use mpdta dataset US teen employment package (’ll need install separately). “Treatment” dataset refers increase minimum wage rate. examples follow, goal estimate effect minimum wage treatment (treat) log teen employment (lemp). Notice panel ID county level (countyreal), treatment staggered across cohorts (first.treat) group counties treated time. 
addition staggered treatment effects, also observe log population (lpop) potential control variable.","code":"# install.packages(\"did\") data(\"mpdta\", package = \"did\") head(mpdta) #> year countyreal lpop lemp first.treat treat #> 866 2003 8001 5.896761 8.461469 2007 1 #> 841 2004 8001 5.896761 8.336870 2007 1 #> 842 2005 8001 5.896761 8.340217 2007 1 #> 819 2006 8001 5.896761 8.378161 2007 1 #> 827 2007 8001 5.896761 8.487352 2007 1 #> 937 2003 8019 2.232377 4.997212 2007 1"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"basic-usage","dir":"Articles","previous_headings":"","what":"Basic usage","title":"Introduction to etwfe","text":"Let’s load etwfe work basic functionality. ’ll see, core workflow package involves two consecutive function calls: 1) etwfe() 2) emfx().","code":""},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"etwfe","dir":"Articles","previous_headings":"Basic usage","what":"etwfe","title":"Introduction to etwfe","text":"Given package name, won’t surprise learn key estimating function etwfe(). ’s look example dataset. things say etwfe() argument choices function options, ’ll leave details aside bit later. Right now, just know arguments required except vcov (though generally recommend , since probably want cluster standard errors individual unit level). Let’s take look model object. etwfe() done underneath hood construct treatment dummy variable .Dtreat saturated together variables interest set multiway interaction terms.2 may noticed etwfe() call returns standard fixest object, since uses perform underlying estimation. associated methods functions fixest package thus compatible model object. example, plot raw regression coefficients fixest::coefplot(), print nice regression table fixest::etable(). However, raw coefficients etwfe() estimation particularly meaningful . Recall complex, multiway interaction terms probably hard interpret . 
insight leads us next key function…","code":"library(etwfe) mod = etwfe( fml = lemp ~ lpop, # outcome ~ controls tvar = year, # time variable gvar = first.treat, # group variable data = mpdta, # dataset vcov = ~countyreal # vcov adjustment (here: clustered) ) mod #> OLS estimation, Dep. Var.: lemp #> Observations: 2,500 #> Fixed-effects: first.treat: 4, year: 5 #> Varying slopes: lpop (first.treat: 4), lpop (year: 5) #> Standard-errors: Clustered (countyreal) #> Estimate Std. Error t value #> .Dtreat:first.treat::2004:year::2004 -0.021248 0.021728 -0.977890 #> .Dtreat:first.treat::2004:year::2005 -0.081850 0.027375 -2.989963 #> .Dtreat:first.treat::2004:year::2006 -0.137870 0.030795 -4.477097 #> .Dtreat:first.treat::2004:year::2007 -0.109539 0.032322 -3.389024 #> .Dtreat:first.treat::2006:year::2006 0.002537 0.018883 0.134344 #> .Dtreat:first.treat::2006:year::2007 -0.045093 0.021987 -2.050907 #> .Dtreat:first.treat::2007:year::2007 -0.045955 0.017975 -2.556568 #> .Dtreat:first.treat::2004:year::2004:lpop_dm 0.004628 0.017584 0.263184 #> .Dtreat:first.treat::2004:year::2005:lpop_dm 0.025113 0.017904 1.402661 #> .Dtreat:first.treat::2004:year::2006:lpop_dm 0.050735 0.021070 2.407884 #> .Dtreat:first.treat::2004:year::2007:lpop_dm 0.011250 0.026617 0.422648 #> .Dtreat:first.treat::2006:year::2006:lpop_dm 0.038935 0.016472 2.363731 #> .Dtreat:first.treat::2006:year::2007:lpop_dm 0.038060 0.022477 1.693276 #> .Dtreat:first.treat::2007:year::2007:lpop_dm -0.019835 0.016198 -1.224528 #> Pr(>|t|) #> .Dtreat:first.treat::2004:year::2004 3.2860e-01 #> .Dtreat:first.treat::2004:year::2005 2.9279e-03 ** #> .Dtreat:first.treat::2004:year::2006 9.3851e-06 *** #> .Dtreat:first.treat::2004:year::2007 7.5694e-04 *** #> .Dtreat:first.treat::2006:year::2006 8.9318e-01 #> .Dtreat:first.treat::2006:year::2007 4.0798e-02 * #> .Dtreat:first.treat::2007:year::2007 1.0866e-02 * #> .Dtreat:first.treat::2004:year::2004:lpop_dm 7.9252e-01 #> .Dtreat:first.treat::2004:year::2005:lpop_dm 
1.6134e-01 #> .Dtreat:first.treat::2004:year::2006:lpop_dm 1.6407e-02 * #> .Dtreat:first.treat::2004:year::2007:lpop_dm 6.7273e-01 #> .Dtreat:first.treat::2006:year::2006:lpop_dm 1.8474e-02 * #> .Dtreat:first.treat::2006:year::2007:lpop_dm 9.1027e-02 . #> .Dtreat:first.treat::2007:year::2007:lpop_dm 2.2133e-01 #> ... 10 variables were removed because of collinearity (.Dtreat:first.treat::2006:year::2004, .Dtreat:first.treat::2006:year::2005 and 8 others [full set in $collin.var]) #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> RMSE: 0.537131 Adj. R2: 0.87167 #> Within R2: 8.449e-4"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"emfx","dir":"Articles","previous_headings":"Basic usage","what":"emfx","title":"Introduction to etwfe","text":"raw etwfe coefficients aren’t particularly useful , can instead? Well, probably want aggregate along dimension interest (e.g., groups time, event study). natural way perform aggregations recovering appropriate marginal effects. etwfe package provides another convenience function : emfx(), thin(ish) wrapper around marginaleffects::slopes(). example, can recover average treatment effect treated (ATT) follows. words, model telling us increase minimum wage leads approximate 5 percent decrease teen employment. Beyond simple ATTs, emfx() also supports types aggregations via type argument. example, can use type = \"calendar\" get ATTs period, type = \"group\" get ATTs cohort groups. option probably useful people type = \"event\", recover dynamic treatment effects la event study. Let’s try save resulting object, since plan reuse moment. event study suggests teen disemployment effect minimum wage hike fairly modest first (3%), increases next years (>10%). next section, ’ll look ways communicate kind finding audience.","code":"emfx(mod) #> #> Term Contrast .Dtreat Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) TRUE -0.0506 0.0125 -4.05 <0.001 #> S 2.5 % 97.5 % #> 14.3 -0.0751 -0.0261 #> #> Columns: term, contrast, .Dtreat, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response mod_es = emfx(mod, type = \"event\") mod_es #> #> Term Contrast event Estimate Std. Error z Pr(>|z|) S #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 #> 2.5 % 97.5 % #> -0.0594 -0.00701 #> -0.0910 -0.02373 #> -0.1982 -0.07751 #> -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"presentation","dir":"Articles","previous_headings":"Basic usage","what":"Presentation","title":"Introduction to etwfe","text":"Since emfx() produces standard marginaleffects object, can pass supported methods packages. example, can pass modelsummary get nice regression table event study coefficients. Note use shape coef_rename arguments ; optional help make output look bit nicer. Event study visualization, can pass preferred plotting method. example: Note emfx reports post-treatment effects. pre-treatment effects swept estimation way ETWFE set . fact, pre-treatment effects mechanistically set zero. means ETWFE used interrogating pre-treatment fit (say, visual inspection parallel pre-trends). Still, can get zero pre-treatment effects changing post_only argument. 
emphasize strictly performative—, pre-treatment effects zero estimation design—might make event study plot aesthetically pleasing.","code":"library(modelsummary) # Quick renaming function to replace \".Dtreat\" with something more meaningful rename_fn = function(old_names) { new_names = gsub(\".Dtreat\", \"Years post treatment =\", old_names) setNames(new_names, old_names) } modelsummary( mod_es, shape = term:event:statistic ~ model, coef_rename = rename_fn, gof_omit = \"Adj|Within|IC|RMSE\", title = \"Event study\", notes = \"Std. errors are clustered at the county level\" ) library(ggplot2) theme_set( theme_minimal() + theme(panel.grid.minor = element_blank()) ) ggplot(mod_es, aes(x = event, y = estimate, ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0) + geom_pointrange(col = \"darkcyan\") + labs(x = \"Years post treatment\", y = \"Effect on log teen employment\") # Use post_only = FALSE to get the \"zero\" pre-treatment effects mod_es2 = emfx(mod, type = \"event\", post_only = FALSE) ggplot(mod_es2, aes(x = event, y = estimate, ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0) + geom_vline(xintercept = -1, lty = 2) + geom_pointrange(col = \"darkcyan\") + labs( x = \"Years post treatment\", y = \"Effect on log teen employment\", caption = \"Note: Zero pre-treatment effects for illustrative purposes only.\" ) #> Warning: Removed 4 rows containing missing values (`geom_segment()`)."},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"heterogeneous-treatment-effects","dir":"Articles","previous_headings":"","what":"Heterogeneous treatment effects","title":"Introduction to etwfe","text":"far ’ve limited homogeneous treatment effects, impact treatment (.e., minimum wage hike) averaged across US counties dataset. However, many research problems require us estimate treatment effects separately across groups , potentially, test differences . 
example, might want test whether efficacy new vaccine differs across age groups, whether marketing campaign equally successful across genders. ETWFE framework naturally lends kinds heterogeneous treatment effects. Consider following example, first create logical dummy variable US counties eight Great Lake states (GLS). Now imagine interested estimating separate treatment effects GLS versus non-GLS counties. simply invoking optional xvar argument part etwfe() call.3 subsequent emfx() call object automatically recognize want recover treatment effects two distinct groups. point estimates might tempt us think minimum wage hikes caused less teen disemployment GLS counties rest US average. However, test formally can invoke powerful hypothesis infrastructure underlying marginaleffects package. Probably easiest way using b[]-style positional arguments, “[]” denotes row emfx() return object. Thus, specifying hypothesis = \"b1 = b2\", can test whether ATTs row 1 (non-GLS) row 2 (GLS) different one another. see actually statistical difference average disemployment effect GLS non-GLS counties. One final aside can easily display results heterogeneous treatment effects plot table form. ’s example latter, make use modelsummary(..., shape = ...) argument. Comparing ATT GLS non-GLS counties simple example limited binary comparison group ATTs, note logic carries richer settings. can use exact workflow estimate heterogeneous treatment effects different aggregations (e.g., event studies) across groups many levels.","code":"gls_fips = c(\"IL\" = 17, \"IN\" = 18, \"MI\" = 26, \"MN\" = 27, \"NY\" = 36, \"OH\" = 39, \"PA\" = 42, \"WI\" = 55) mpdta$gls = substr(mpdta$countyreal, 1, 2) %in% gls_fips hmod = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, xvar = gls ## <= het. TEs by gls ) # Heterogeneous ATTs (could also specify `type = \"event\"`, etc.) emfx(hmod) #> #> Term Contrast .Dtreat gls Estimate Std. 
Error z #> .Dtreat mean(TRUE) - mean(FALSE) TRUE FALSE -0.0637 0.0376 -1.69 #> .Dtreat mean(TRUE) - mean(FALSE) TRUE TRUE -0.0472 0.0271 -1.74 #> Pr(>|z|) S 2.5 % 97.5 % #> 0.0906 3.5 -0.137 0.01007 #> 0.0817 3.6 -0.100 0.00594 #> #> Columns: term, contrast, .Dtreat, gls, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response emfx(hmod, hypothesis = \"b1 = b2\") #> #> Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 % #> b1=b2 -0.0164 0.0559 -0.294 0.769 0.4 -0.126 0.093 #> #> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high #> Type: response modelsummary( models = list(\"GLS county\" = emfx(hmod)), shape = term + statistic ~ model + gls, # add xvar variable (here: gls) coef_map = c(\".Dtreat\" = \"ATT\"), gof_map = NA, title = \"Comparing the ATT on GLS and non-GLS counties\" )"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"other-families","dir":"Articles","previous_headings":"","what":"Other families","title":"Introduction to etwfe","text":"Another key feature ETWFE approach—one sets apart advanced implementations extensions—supports nonlinear model (distribution / link) families. Users need simply invoke family argument. ’s brief example, recast earlier event-study Poisson regression.","code":"mpdta$emp = exp(mpdta$lemp) etwfe( emp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, family = \"poisson\" ) |> emfx(\"event\") #> #> Term Contrast event Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) 0 -25.35 15.9 -1.5942 0.111 #> .Dtreat mean(TRUE) - mean(FALSE) 1 1.09 41.8 0.0261 0.979 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -75.12 22.3 -3.3696 <0.001 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -101.82 28.1 -3.6234 <0.001 #> S 2.5 % 97.5 % #> 3.2 -56.5 5.82 #> 0.0 -80.9 83.09 #> 10.4 -118.8 -31.43 #> 11.7 -156.9 -46.75 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"performance-tips","dir":"Articles","previous_headings":"","what":"Performance tips","title":"Introduction to etwfe","text":"Thinking etwfe workflow pair consecutive functional calls, first etwfe() stage tends fast. ’re leveraging incredible performance fixest also taking shortcuts avoid wasting time nuisance parameters. See Regarding fixed effects section details . part, second emfx() stage also tends pretty performant. data less 100k rows, ’s unlikely ’ll wait seconds obtain results. However, emfx’s computation time tend scale non-linearly size original data, well number interactions underlying etwfe model object. Without getting deep weeds, relying numerical delta method (excellent) marginaleffects package underneath hood recover ATTs interest. method requires estimating two prediction models coefficient model computing standard errors. ’s potentially expensive operation can push computation time large datasets (> 1m rows) several minutes longer. Fortunately, two complementary strategies can use speed things . first turn expensive part whole procedure—standard error calculation—calling emfx(..., vcov = FALSE). bring estimation time back seconds less, even datasets excess million rows. course, loss standard errors might acceptable trade-projects statistical inference critical. 
good news first strategy can still combined second strategy: turns collapsing data groups prior estimating marginal effects can yield substantial speed gains . Users can invoking emfx(..., collapse = TRUE) argument. effect dramatic first strategy, collapsing data virtue retaining information standard errors. trade-time, however, collapsing data lead loss accuracy estimated parameters. hand, testing suggests loss accuracy tends relatively minor, results equivalent 1st 2nd significant decimal place (even better). Summarizing, quick plan attack try worried estimation time large datasets models: Estimate mod = etwfe(...) per usual. Run emfx(mod, vcov = FALSE, ...). Run emfx(mod, vcov = FALSE, collapse = TRUE, ...). Compare point estimates steps 1 2. similar enough satisfaction, get approximate standard errors running emfx(mod, collapse = TRUE, ...). ’s bit performance art, since examples vignette complete quickly anyway. reworking earlier event study example demonstrate performance-conscious workflow. put fine point , can can compare original event study collapsed estimates see results indeed similar. Event study","code":"# Step 0 already complete: using the same `mod` object from earlier... 
# Step 1 emfx(mod, type = \"event\", vcov = FALSE) #> #> Term Contrast event Estimate #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 #> #> Columns: term, contrast, event, estimate, predicted_lo, predicted_hi, predicted #> Type: response # Step 2 emfx(mod, type = \"event\", vcov = FALSE, collapse = TRUE) #> #> Term Contrast event Estimate #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0216 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0635 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 #> #> Columns: term, contrast, event, estimate, predicted_lo, predicted_hi, predicted #> Type: response # Step 3: Results from 1 and 2 are similar enough, so get approx. SEs mod_es2 = emfx(mod, type = \"event\", collapse = TRUE) modelsummary( list(\"Original\" = mod_es, \"Collapsed\" = mod_es2), shape = term:event:statistic ~ model, coef_rename = rename_fn, gof_omit = \"Adj|Within|IC|RMSE\", title = \"Event study\", notes = \"Std. errors are clustered at the county level\" )"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"under-the-hood","dir":"Articles","previous_headings":"","what":"Under the hood","title":"Introduction to etwfe","text":"Now ’ve seen etwfe action, let’s circle back package hood. section isn’t necessary use package; feel free skip . review internal details help optimize different scenarios also give better understanding etwfe’s default choices.","code":""},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"manual-implementation","dir":"Articles","previous_headings":"Under the hood","what":"Manual implementation","title":"Introduction to etwfe","text":"keep reiterating, ETWFE approach basically involves saturating regression interaction effects. can easily grab formula estimated model see . point, however, may notice things. 
first formula references several variables aren’t original dataset. obvious one .Dtreat treatment dummy. subtle one lpop_dm, demeaned (.e., group-centered) version lpop control variable. control variables demeaned interacted ETWFE setting. ’s constructed dataset ahead time estimated ETWFE regression manually: can confirm manual approach yields output original etwfe regression. ’ll use modelsummary , since ’ve already loaded .4. transform raw coefficients meaningful ATT counterparts, just need perform appropriate marginal effects operation. example, ’s can get simple ATTs event-study ATTs earlier. emfx() behind scenes.","code":"mod$fml_all #> $linear #> lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003)/lpop_dm #> #> #> $fixef #> ~first.treat + first.treat[[lpop]] + year + year[[lpop]] # First construct the dataset mpdta2 = mpdta |> transform( .Dtreat = year >= first.treat & first.treat != 0, lpop_dm = ave(lpop, first.treat, year, FUN = \\(x) x - mean(x, na.rm = TRUE)) ) # Then estimate the manual version of etwfe mod2 = fixest::feols( lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003) / lpop_dm | first.treat[lpop] + year[lpop], data = mpdta2, vcov = ~countyreal ) modelsummary( list(\"etwfe\" = mod, \"manual\" = mod2), gof_map = NA # drop all goodness-of-fit info for brevity ) library(marginaleffects) # Simple ATT slopes( mod2, newdata = subset(mpdta2, .Dtreat), # we only want rows where .Dtreat is TRUE variables = \".Dtreat\", by = \".Dtreat\" ) #> #> Term Contrast .Dtreat Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) TRUE -0.0506 0.0125 -4.05 <0.001 #> S 2.5 % 97.5 % #> 14.3 -0.0751 -0.0261 #> #> Columns: term, contrast, .Dtreat, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response # Event study slopes( mod2, newdata = transform(subset(mpdta2, .Dtreat), event = year - first.treat), variables = \".Dtreat\", by = \"event\" ) #> #> Term Contrast event Estimate Std. Error z Pr(>|z|) S #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 #> 2.5 % 97.5 % #> -0.0594 -0.00701 #> -0.0910 -0.02373 #> -0.1982 -0.07751 #> -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"regarding-fixed-effects","dir":"Articles","previous_headings":"Under the hood","what":"Regarding fixed effects","title":"Introduction to etwfe","text":"Let’s switch gears talk fixed effects quickly. regular fixest user, may noticed ’ve invoking varying slopes syntax fixed effect slot (.e., first.treat[lpop] year[lpop]). reason part practical, part philosophical. practical perspective, factor_var[numeric_var] equivalent base R’s factor_var/numeric_var “nesting” syntax much faster high-dimensional factors.5 philosophical perspective, etwfe tries limit amount extraneous information reports users. interaction effects ETWFE framework just acting controls. relegating fixed effects slot, can avoid polluting user’s console host extra coefficients. Nonetheless, can control behaviour fe (“fixed effects”) argument. Consider following options manual equivalents. 
’ll leave pass models emfx confirm give correct aggregated treatment effects. can quickly demonstrate regression table return raw coefficients. final point note fixed effects etwfe defaults using group-level (.e., cohort-level) fixed effects like first.treat, rather unit-level fixed effects like countyreal. design decision reflects neat ancillary result Wooldridge (2021), proves equivalence two types fixed effects linear cases. Group-level effects virtue faster estimate, since fewer factor levels. Moreover, required nonlinear model families like Poisson per underlying ETWFE theory. Still, can specify unit-level fixed effects linear case ivar argument. , can easily confirm yields estimated treatment effects group-level default (although standard errors slightly different).","code":"# fe = \"feo\" (fixed effects only) mod_feo = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, fe = \"feo\" ) # ... which is equivalent to the manual regression mod_feo2 = fixest::feols( lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003) / lpop_dm + lpop + i(first.treat, lpop, ref = 0) + i(year, lpop, ref = 2003) | first.treat + year, data = mpdta2, vcov = ~countyreal ) # fe = \"none\" mod_none = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, fe = \"none\" ) # ... 
which is equivalent to the manual regression mod_none2 = fixest::feols( lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003) / lpop_dm + lpop + i(first.treat, lpop, ref = 0) + i(year, lpop, ref = 2003) + i(first.treat, ref = 0) + i(year, ref = 2003), data = mpdta2, vcov = ~countyreal ) mods = list( \"etwfe\" = mod, \"manual\" = mod2, \"etwfe (feo)\" = mod_feo, \"manual (feo)\" = mod_feo2, \"etwfe (none)\" = mod_none, \"manual (none)\" = mod_none2 ) modelsummary(mods, gof_map = NA) mod_es_i = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, ivar = countyreal # NEW: Use unit-level (county) FEs ) |> emfx(\"event\") modelsummary( list(\"Group-level FEs (default)\" = mod_es, \"Unit-level FEs\" = mod_es_i), shape = term:event:statistic ~ model, coef_rename = rename_fn, gof_omit = \"Adj|Within|IC|RMSE\" )"},{"path":"http://grantmcdermott.com/etwfe/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Grant McDermott. Author, maintainer. Frederic Kluser. Contributor.","code":""},{"path":"http://grantmcdermott.com/etwfe/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"McDermott G (2023). etwfe: Extended Two-Way Fixed Effects. R package version 0.3.5, https://grantmcdermott.com/etwfe/.","code":"@Manual{, title = {etwfe: Extended Two-Way Fixed Effects}, author = {Grant McDermott}, year = {2023}, note = {R package version 0.3.5}, url = {https://grantmcdermott.com/etwfe/}, }"},{"path":"http://grantmcdermott.com/etwfe/index.html","id":"extended-two-way-fixed-effects-etwfe","dir":"","previous_headings":"","what":"Extended Two-Way Fixed Effects","title":"Extended Two-Way Fixed Effects","text":"goal etwfe estimate extended two-way fixed effects la Wooldridge (2021, 2022). Briefly, Wooldridge proposes set saturated interaction effects overcome potential bias problems vanilla TWFE difference--differences designs. 
Wooldridge solution intuitive elegant, rather tedious error prone code manually. etwfe package aims simplify process providing convenience functions work . Documentation available package homepage.","code":""},{"path":"http://grantmcdermott.com/etwfe/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Extended Two-Way Fixed Effects","text":"can install etwfe CRAN. , can grab development version R-universe.","code":"install.packages(\"etwfe\") install.packages(\"etwfe\", repos = \"https://grantmcdermott.r-universe.dev\")"},{"path":"http://grantmcdermott.com/etwfe/index.html","id":"quickstart-example","dir":"","previous_headings":"","what":"Quickstart example","title":"Extended Two-Way Fixed Effects","text":"detailed walkthrough etwfe provided introductory vignette (available online, typing vignette(\"etwfe\") R console). ’s quickstart example demonstrate basic syntax.","code":"library(etwfe) # install.packages(\"did\") data(\"mpdta\", package = \"did\") head(mpdta) #> year countyreal lpop lemp first.treat treat #> 866 2003 8001 5.896761 8.461469 2007 1 #> 841 2004 8001 5.896761 8.336870 2007 1 #> 842 2005 8001 5.896761 8.340217 2007 1 #> 819 2006 8001 5.896761 8.378161 2007 1 #> 827 2007 8001 5.896761 8.487352 2007 1 #> 937 2003 8019 2.232377 4.997212 2007 1 # Estimate the model mod = etwfe( fml = lemp ~ lpop, # outcome ~ controls tvar = year, # time variable gvar = first.treat, # group variable data = mpdta, # dataset vcov = ~countyreal # vcov adjustment (here: clustered) ) # This gives us a regression model with fully saturated interactions mod #> OLS estimation, Dep. Var.: lemp #> Observations: 2,500 #> Fixed-effects: first.treat: 4, year: 5 #> Varying slopes: lpop (first.treat: 4), lpop (year: 5) #> Standard-errors: Clustered (countyreal) #> Estimate Std. 
Error t value Pr(>|t|) #> .Dtreat:first.treat::2004:year::2004 -0.021248 0.021728 -0.977890 3.2860e-01 #> .Dtreat:first.treat::2004:year::2005 -0.081850 0.027375 -2.989963 2.9279e-03 ** #> .Dtreat:first.treat::2004:year::2006 -0.137870 0.030795 -4.477097 9.3851e-06 *** #> .Dtreat:first.treat::2004:year::2007 -0.109539 0.032322 -3.389024 7.5694e-04 *** #> .Dtreat:first.treat::2006:year::2006 0.002537 0.018883 0.134344 8.9318e-01 #> .Dtreat:first.treat::2006:year::2007 -0.045093 0.021987 -2.050907 4.0798e-02 * #> .Dtreat:first.treat::2007:year::2007 -0.045955 0.017975 -2.556568 1.0866e-02 * #> .Dtreat:first.treat::2004:year::2004:lpop_dm 0.004628 0.017584 0.263184 7.9252e-01 #> .Dtreat:first.treat::2004:year::2005:lpop_dm 0.025113 0.017904 1.402661 1.6134e-01 #> .Dtreat:first.treat::2004:year::2006:lpop_dm 0.050735 0.021070 2.407884 1.6407e-02 * #> .Dtreat:first.treat::2004:year::2007:lpop_dm 0.011250 0.026617 0.422648 6.7273e-01 #> .Dtreat:first.treat::2006:year::2006:lpop_dm 0.038935 0.016472 2.363731 1.8474e-02 * #> .Dtreat:first.treat::2006:year::2007:lpop_dm 0.038060 0.022477 1.693276 9.1027e-02 . #> .Dtreat:first.treat::2007:year::2007:lpop_dm -0.019835 0.016198 -1.224528 2.2133e-01 #> ... 10 variables were removed because of collinearity (.Dtreat:first.treat::2006:year::2004, .Dtreat:first.treat::2006:year::2005 and 8 others [full set in $collin.var]) #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> RMSE: 0.537131 Adj. R2: 0.87167 #> Within R2: 8.449e-4 # Pass to emfx() to recover the ATTs of interest. Here's an event-study example. emfx(mod, type = \"event\") #> #> Term Contrast event Estimate Std. 
Error z Pr(>|z|) S 2.5 % 97.5 % #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 -0.0594 -0.00701 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 -0.0910 -0.02373 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 -0.1982 -0.07751 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response"},{"path":"http://grantmcdermott.com/etwfe/index.html","id":"acknowledgements","dir":"","previous_headings":"","what":"Acknowledgements","title":"Extended Two-Way Fixed Effects","text":"Jeffrey Wooldridge underlying ETWFE theory. Laurent Bergé (fixest) Vincent Arel-Bundock (marginaleffects) maintaining two wonderful R packages heavy lifting hood . Fernando Rios-Avila JWDID Stata module, provided welcome foil unit testing whose elegant design helped inform choices R equivalent.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":null,"dir":"Reference","previous_headings":"","what":"Post-estimation treatment effects for an ETWFE regressions. — emfx","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"Post-estimation treatment effects ETWFE regressions.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"","code":"emfx( object, type = c(\"simple\", \"group\", \"calendar\", \"event\"), by_xvar = \"auto\", collapse = \"auto\", post_only = TRUE, ... )"},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Post-estimation treatment effects for an ETWFE regressions. 
— emfx","text":"object `etwfe` model object. type Character. desired type post-estimation aggregation. by_xvar Logical. results account heterogeneous treatment effects? relevant preceding `etwfe` call included specified `xvar` argument, .e. interacted categorical covariate. default behaviour (\"auto\") automatically estimate heterogeneous treatment effects level `xvar` detected part underlying `etwfe` model object. Users can override setting either FALSE TRUE. See section Heterogeneous treatment effects . collapse Logical. Collapse data (period cohort) groups calculating marginal effects? trades loss estimate accuracy (typically around 1st 2nd significant decimal point) substantial improvement estimation time large datasets. default behaviour (\"auto\") automatically collapse original dataset 500,000 rows. Users can override setting either FALSE TRUE. Note collapsing group valid preceding `etwfe` call run \"ivar = NULL\" (default). See section Performance tips . post_only Logical. keep post-treatment effects. pre-treatment effects zero mechanical result ETWFE's estimation setup, default drop nuisance rows dataset. may want keep presentation reasons (e.g., plotting event-study); though warned strictly performative. argument evaluated `type = \"event\"`. ... Additional arguments passed [`marginaleffects::marginaleffects`]. example, can pass `vcov = FALSE` dramatically speed estimation times main marginal effects (cost getting information standard errors; see Performance tips ). Another potentially useful application testing whether heterogeneous treatment effects (.e. levels `xvar` covariate) equal invoking `hypothesis` argument, e.g. `hypothesis = \"b1 = b2\"`.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Post-estimation treatment effects for an ETWFE regressions. 
— emfx","text":"`slopes` object `marginaleffects` package.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"performance-tips","dir":"Reference","previous_headings":"","what":"Performance tips","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"situations, `etwfe` complete quickly. part, `emfx` quite performant take seconds less datasets 100k rows. However, `emfx`'s computation time tend scale non-linearly size original data, well number interactions underlying `etwfe` model. Without getting deep weeds, numerical delta method used recover ATEs interest estimate two prediction models ** coefficient model compute standard errors. , potentially expensive operation can push computation time large datasets (> 1m rows) several minutes longer. Fortunately, two complementary strategies can use speed things . first turn expensive part whole procedure---standard error calculation---calling `emfx(..., vcov = FALSE)`. bring estimation time back seconds less, even datasets excess million rows. loss standard errors might acceptable trade-projects statistical inference critical, good news first strategy can still combined second strategy. turns collapsing data groups prior estimating marginal effects can yield substantial speed gains . Users can invoking `emfx(..., collapse = TRUE)` argument. effect dramatic first strategy, second strategy virtue retaining information standard errors. trade- time, however, collapsing data lead loss accuracy estimated parameters. hand, testing suggests loss accuracy tends relatively minor, results equivalent 1st 2nd significant decimal place (even better). Summarizing, quick plan attack try worried estimation time large datasets models: 0. Estimate `mod = etwfe(...)` per usual. 1. Run `emfx(mod, vcov = FALSE, ...)`. 2. Run `emfx(mod, vcov = FALSE, collapse = TRUE, ...)`. 3. Compare point estimates steps 1 2. 
similar enough satisfaction, get approximate standard errors running `emfx(mod, collapse = TRUE, ...)`.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"heterogeneous-treatment-effects","dir":"Reference","previous_headings":"","what":"Heterogeneous treatment effects","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"Specifying `etwfe(..., xvar = )` generate interaction effects levels `` part main regression model. reason useful (opposed regular, non-interacted covariate formula RHS) allows us estimate heterogeneous treatment effects part larger ETWFE framework. Specifically, can recover heterogeneous treatment effects level `` passing resulting `etwfe` model object `emfx()`. example, imagine categorical variable called \"age\" dataset, two distinct levels \"adult\" \"child\". Running `emfx(etwfe(..., xvar = age))` tell us efficacy treatment varies across adults children. can also leverage -built hypothesis testing infrastructure `marginaleffects` test whether treatment effect statistically different across two age groups; see Examples . Note principles carry categorical variables multiple levels, even continuous variables (although continuous variables well supported yet).","code":""},{"path":[]},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"","code":"# \\dontrun{ # We’ll use the mpdta dataset from the did package (which you’ll need to # install separately). # install.packages(\"did\") data(\"mpdta\", package = \"did\") # # Basic example # # The basic ETWFE workflow involves two steps: # 1) Estimate the main regression model with etwfe(). 
mod = etwfe( fml = lemp ~ lpop, # outcome ~ controls (use 0 or 1 if none) tvar = year, # time variable gvar = first.treat, # group variable data = mpdta, # dataset vcov = ~countyreal # vcov adjustment (here: clustered by county) ) # mod ## A fixest model object with fully saturated interaction effects. # 2) Recover the treatment effects of interest with emfx(). emfx(mod, type = \"event\") # dynamic ATE a la an event study #> #> Term Contrast event Estimate Std. Error z Pr(>|z|) S #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 #> 2.5 % 97.5 % #> -0.0594 -0.00701 #> -0.0910 -0.02373 #> -0.1982 -0.07751 #> -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # Etc. Other aggregation type options are \"simple\" (the default), \"group\" # and \"calendar\" # # Heterogeneous treatment effects # # Example where we estimate heterogeneous treatment effects for counties # within the 8 US Great Lake states (versus all other counties). gls = c(\"IL\" = 17, \"IN\" = 18, \"MI\" = 26, \"MN\" = 27, \"NY\" = 36, \"OH\" = 39, \"PA\" = 42, \"WI\" = 55) mpdta$gls = substr(mpdta$countyreal, 1, 2) %in% gls hmod = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, xvar = gls ## <= het. TEs by gls ) # Heterogeneous ATEs (could also specify \"event\", etc.) emfx(hmod) #> #> Term Contrast .Dtreat gls Estimate Std. 
Error z #> .Dtreat mean(TRUE) - mean(FALSE) TRUE FALSE -0.0637 0.0376 -1.69 #> .Dtreat mean(TRUE) - mean(FALSE) TRUE TRUE -0.0472 0.0271 -1.74 #> Pr(>|z|) S 2.5 % 97.5 % #> 0.0906 3.5 -0.137 0.01007 #> 0.0817 3.6 -0.100 0.00594 #> #> Columns: term, contrast, .Dtreat, gls, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # To test whether the ATEs across these two groups (non-GLS vs GLS) are # statistically different, simply pass an appropriate \"hypothesis\" argument. emfx(hmod, hypothesis = \"b1 = b2\") #> #> Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 % #> b1=b2 -0.0164 0.0559 -0.294 0.769 0.4 -0.126 0.093 #> #> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high #> Type: response #> # # Nonlinear model (distribution / link) families # # Poisson example mpdta$emp = exp(mpdta$lemp) etwfe( emp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, family = \"poisson\" ## <= family arg for nonlinear options ) |> emfx(\"event\") #> The variables '.Dtreat:first.treat::2006:year::2004', '.Dtreat:first.treat::2006:year::2005' and eight others have been removed because of collinearity (see $collin.var). #> #> Term Contrast event Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) 0 -25.35 15.9 -1.5942 0.111 #> .Dtreat mean(TRUE) - mean(FALSE) 1 1.09 41.8 0.0261 0.979 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -75.12 22.3 -3.3696 <0.001 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -101.82 28.1 -3.6234 <0.001 #> S 2.5 % 97.5 % #> 3.2 -56.5 5.82 #> 0.0 -80.9 83.09 #> 10.4 -118.8 -31.43 #> 11.7 -156.9 -46.75 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # }"},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":null,"dir":"Reference","previous_headings":"","what":"Extended two-way fixed effects — etwfe","title":"Extended two-way fixed effects — etwfe","text":"Extended two-way fixed effects","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extended two-way fixed effects — etwfe","text":"","code":"etwfe( fml = NULL, tvar = NULL, gvar = NULL, data = NULL, ivar = NULL, xvar = NULL, tref = NULL, gref = NULL, cgroup = c(\"notyet\", \"never\"), fe = c(\"vs\", \"feo\", \"none\"), family = NULL, ... )"},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extended two-way fixed effects — etwfe","text":"fml two-side formula representing outcome (lhs) control variables (rhs), e.g. `y ~ x1 + x2`. controls required, rhs must take value 0 1, e.g. `y ~ 0`. tvar Time variable. Can string (e.g., \"year\") expression (e.g., year). gvar Group variable. Can either string (e.g., \"first_treated\") expression (e.g., first_treated). staggered treatment setting, group variable typically denotes treatment cohort. data data frame want run ETWFE . ivar Optional index variable. Can string (e.g., \"country\") expression (e.g., country). 
Leaving NULL (default) result group-level fixed effects used, efficient necessary nonlinear models (see `family` argument ). However, may still want cluster standard errors index variable `vcov` argument. See Examples . xvar Optional interacted categorical covariate estimating heterogeneous treatment effects. Enables recovery marginal treatment effect distinct levels `xvar`, e.g. \"child\", \"teenager\", \"adult\". Note \"x\" prefix \"xvar\" represents covariate *interacted* treatment, opposed regular control variable. tref Optional reference value `tvar`. Defaults minimum value (.e., first time period observed dataset). gref Optional reference value `gvar`. need provide `gvar` variable well specified. providing explicit reference value can useful/necessary desired control group takes unusual value. cgroup control group wish use estimating treatment effects. Either \"notyet\" treated (default) \"never\" treated. fe level fixed effects used? Defaults \"vs\" (varying slopes), efficient terms estimation terseness return model object. two options, \"feo\" (fixed effects ) \"none\" (fixed effects whatsoever), trade efficiency additional information (nuisance) model parameters. Note primary treatment parameters interest remain unchanged regardless choice. family [`family`] use estimation. Defaults NULL, case [`fixest::feols`] used. Otherwise passed [`fixest::feglm`], valid entries include \"logit\", \"poisson\", \"negbin\". Note non-NULL family entry detected, `ivar` automatically set NULL. ... Additional arguments passed [`fixest::feols`] ([`fixest::feglm`]). 
common example `vcov` argument.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extended two-way fixed effects — etwfe","text":"fixest object fully saturated interaction effects.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"heterogeneous-treatment-effects","dir":"Reference","previous_headings":"","what":"Heterogeneous treatment effects","title":"Extended two-way fixed effects — etwfe","text":"Specifying `etwfe(..., xvar = )` generate interaction effects levels `` part main regression model. reason useful (opposed regular, non-interacted covariate formula RHS) allows us estimate heterogeneous treatment effects part larger ETWFE framework. Specifically, can recover heterogeneous treatment effects level `` passing resulting `etwfe` model object `emfx()`. example, imagine categorical variable called \"age\" dataset, two distinct levels \"adult\" \"child\". Running `emfx(etwfe(..., xvar = age))` tell us efficacy treatment varies across adults children. can also leverage -built hypothesis testing infrastructure `marginaleffects` test whether treatment effect statistically different across two age groups; see Examples . Note principles carry categorical variables multiple levels, even continuous variables (although continuous variables well supported yet).","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"performance-tips","dir":"Reference","previous_headings":"","what":"Performance tips","title":"Extended two-way fixed effects — etwfe","text":"situations, `etwfe` complete quickly. part, `emfx` quite performant take seconds less datasets 100k rows. However, `emfx`'s computation time tend scale non-linearly size original data, well number interactions underlying `etwfe` model. 
Without getting deep weeds, numerical delta method used recover ATEs interest estimate two prediction models ** coefficient model compute standard errors. , potentially expensive operation can push computation time large datasets (> 1m rows) several minutes longer. Fortunately, two complementary strategies can use speed things . first turn expensive part whole procedure---standard error calculation---calling `emfx(..., vcov = FALSE)`. bring estimation time back seconds less, even datasets excess million rows. loss standard errors might acceptable trade-projects statistical inference critical, good news first strategy can still combined second strategy. turns collapsing data groups prior estimating marginal effects can yield substantial speed gains . Users can invoking `emfx(..., collapse = TRUE)` argument. effect dramatic first strategy, second strategy virtue retaining information standard errors. trade- time, however, collapsing data lead loss accuracy estimated parameters. hand, testing suggests loss accuracy tends relatively minor, results equivalent 1st 2nd significant decimal place (even better). Summarizing, quick plan attack try worried estimation time large datasets models: 0. Estimate `mod = etwfe(...)` per usual. 1. Run `emfx(mod, vcov = FALSE, ...)`. 2. Run `emfx(mod, vcov = FALSE, collapse = TRUE, ...)`. 3. Compare point estimates steps 1 2. similar enough satisfaction, get approximate standard errors running `emfx(mod, collapse = TRUE, ...)`.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Extended two-way fixed effects — etwfe","text":"Wooldridge, Jeffrey M. (2021). Two-Way Fixed Effects, Two-Way Mundlak Regression, Difference--Differences Estimators. Working paper (version: August 16, 2021). Available: http://dx.doi.org/10.2139/ssrn.3906345 Wooldridge, Jeffrey M. (2022). Simple Approaches Nonlinear Difference--Differences Panel Data. 
Econometrics Journal (forthcoming). Available: http://dx.doi.org/10.2139/ssrn.4183726","code":""},{"path":[]},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extended two-way fixed effects — etwfe","text":"","code":"# \\dontrun{ # We’ll use the mpdta dataset from the did package (which you’ll need to # install separately). # install.packages(\"did\") data(\"mpdta\", package = \"did\") # # Basic example # # The basic ETWFE workflow involves two steps: # 1) Estimate the main regression model with etwfe(). mod = etwfe( fml = lemp ~ lpop, # outcome ~ controls (use 0 or 1 if none) tvar = year, # time variable gvar = first.treat, # group variable data = mpdta, # dataset vcov = ~countyreal # vcov adjustment (here: clustered by county) ) # mod ## A fixest model object with fully saturated interaction effects. # 2) Recover the treatment effects of interest with emfx(). emfx(mod, type = \"event\") # dynamic ATE a la an event study #> #> Term Contrast event Estimate Std. Error z Pr(>|z|) S #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 #> 2.5 % 97.5 % #> -0.0594 -0.00701 #> -0.0910 -0.02373 #> -0.1982 -0.07751 #> -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # Etc. Other aggregation type options are \"simple\" (the default), \"group\" # and \"calendar\" # # Heterogeneous treatment effects # # Example where we estimate heterogeneous treatment effects for counties # within the 8 US Great Lake states (versus all other counties). 
gls = c(\"IL\" = 17, \"IN\" = 18, \"MI\" = 26, \"MN\" = 27, \"NY\" = 36, \"OH\" = 39, \"PA\" = 42, \"WI\" = 55) mpdta$gls = substr(mpdta$countyreal, 1, 2) %in% gls hmod = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, xvar = gls ## <= het. TEs by gls ) # Heterogeneous ATEs (could also specify \"event\", etc.) emfx(hmod) #> #> Term Contrast .Dtreat gls Estimate Std. Error z #> .Dtreat mean(TRUE) - mean(FALSE) TRUE FALSE -0.0637 0.0376 -1.69 #> .Dtreat mean(TRUE) - mean(FALSE) TRUE TRUE -0.0472 0.0271 -1.74 #> Pr(>|z|) S 2.5 % 97.5 % #> 0.0906 3.5 -0.137 0.01007 #> 0.0817 3.6 -0.100 0.00594 #> #> Columns: term, contrast, .Dtreat, gls, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # To test whether the ATEs across these two groups (non-GLS vs GLS) are # statistically different, simply pass an appropriate \"hypothesis\" argument. emfx(hmod, hypothesis = \"b1 = b2\") #> #> Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 % #> b1=b2 -0.0164 0.0559 -0.294 0.769 0.4 -0.126 0.093 #> #> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high #> Type: response #> # # Nonlinear model (distribution / link) families # # Poisson example mpdta$emp = exp(mpdta$lemp) etwfe( emp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, family = \"poisson\" ## <= family arg for nonlinear options ) |> emfx(\"event\") #> The variables '.Dtreat:first.treat::2006:year::2004', '.Dtreat:first.treat::2006:year::2005' and eight others have been removed because of collinearity (see $collin.var). #> #> Term Contrast event Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) 0 -25.35 15.9 -1.5942 0.111 #> .Dtreat mean(TRUE) - mean(FALSE) 1 1.09 41.8 0.0261 0.979 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -75.12 22.3 -3.3696 <0.001 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -101.82 28.1 -3.6234 <0.001 #> S 2.5 % 97.5 % #> 3.2 -56.5 5.82 #> 0.0 -80.9 83.09 #> 10.4 -118.8 -31.43 #> 11.7 -156.9 -46.75 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # }"},{"path":[]},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"internal-0-3-5","dir":"Changelog","previous_headings":"","what":"Internal","title":"etwfe 0.3.5","text":"Update tests match upstream changes fixest. Update maintainer email address.","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-034","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.4","title":"etwfe 0.3.4","text":"CRAN release: 2023-06-19","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"internal-0-3-4","dir":"Changelog","previous_headings":"","what":"Internal","title":"etwfe 0.3.4","text":"Update tests match upstream changes marginaleffects (#36 @vincentarelbundock).","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-033","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.3","title":"etwfe 0.3.3","text":"CRAN release: 2023-05-27","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"internal-0-3-3","dir":"Changelog","previous_headings":"","what":"Internal","title":"etwfe 0.3.3","text":"Rejigged internal tests following upstream changes marginaleffects (#35 @vincentarelbundock).","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-032","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.2","title":"etwfe 0.3.2","text":"CRAN release: 
2023-05-02","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"bug-fixes-0-3-2","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"etwfe 0.3.2","text":"Fixed internal centering procedure handling multiple covariate levels (#30, #31). fixes impact main ATT estimates (.e., typical use package). may lead differences heterogeneous ATTs—.e., via xvar arg—incorrectly estimated cases. Thanks @PhilipCarthy flagging @frederickluser helpful discussions Fixed internal upstream bug causing model offsets error (#28, thanks @mariofiorini initial report several others helpful discussion). Note fix requires insight >= 0.19.1.8, development version time writing. information available : https://github.com/easystats/insight/pull/759","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-031","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.1","title":"etwfe 0.3.1","text":"CRAN release: 2023-02-28","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"internal-0-3-1","dir":"Changelog","previous_headings":"","what":"Internal","title":"etwfe 0.3.1","text":"Minor updates internal code unit tests match forthcoming updates marginaleffects 0.10.0. latter update also brings notable performance improvements emfx().","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"other-0-3-1","dir":"Changelog","previous_headings":"","what":"Other","title":"etwfe 0.3.1","text":"documentation improvements. Examples wrapped \\dontrun avoid triggering CRAN NOTEs Windows exceeding 5 seconds execution time. 
Note package homepage still runs Examples users want inspect output online.","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-030","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.0","title":"etwfe 0.3.0","text":"CRAN release: 2023-02-08","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"new-features-and-enhancements-0-3-0","dir":"Changelog","previous_headings":"","what":"New features and enhancements","title":"etwfe 0.3.0","text":"Support estimating heterogeneous treatment effects via new etwfe(..., xvar = argument (#16, thanks @frederickluser). Automatically extends emfx() via latter’s by_xvar argument (#21). details provided dedicated “Heterogeneous treatment effects” section vignette help documentation new emfx(..., collapse = TRUE) argument can substantially reduce estimation times large datasets (#19, thanks @frederickluser). performance boost trade loss estimate accuracy. testing suggests difference relatively minor typical use cases (.e., results equivalent 1st 2nd significant decimal place, sometimes even better). Please let us know find edge cases true. details available dedicated “Performance tips” section vignette help documentation, including advice combining collapsing emfx(..., vcov = FALSE) (yields even dramatic speed boost cost reporting standard errors). Users can now use 1 fml RHS indicate control variables part etwfe call, e.g. etwfe(y ~ 1, ...). provides second way indicating controls, alongside existing 0 option, e.g. etwfe(y ~ 0, ...)","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"bug-fixes-0-3-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"etwfe 0.3.0","text":"Internal code tests updated account upstream breaking changes marginaleffects 0.9.0 (#20, thanks @vincentarelbundock). user side, notable changes longer call summary() emfx objects pretty printing, (former) “dydx” column resulting object now named “estimate”. 
changes reflected updated documentation.","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"other-0-3-0","dir":"Changelog","previous_headings":"","what":"Other","title":"etwfe 0.3.0","text":"Various documentation improvements. example, aforementioned sections Heterogeneous TEs Performance tips. also removed warnings use time-varying controls (#17). truth, can’t quite recall included warnings first place testing confirms appear pose problem ETWFE framework. Thanks Felix Pretis prompting revisit implicit restriction, including forwarding relevant correspondence Prof. Wooldridge. data.table added Imports thus becomes direct dependency. already indirect dependency marginaleffects. ’s now possible install development version package R-universe. Details provided README.","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-020","dir":"Changelog","previous_headings":"","what":"etwfe 0.2.0","title":"etwfe 0.2.0","text":"CRAN release: 2023-01-11","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"bug-fixes-and-breaking-changes-0-2-0","dir":"Changelog","previous_headings":"","what":"Bug fixes and breaking changes","title":"etwfe 0.2.0","text":".Dtreat indicator variable created etwfe call now logical instead integer (#14). fix yields slightly different effect sizes emfx output applied non-linear model families (e.g., etwfe(..., family = \"poisson\"). reason now implicitly calling marginaleffects::comparisons hood rather marginaleffects::marginaleffects. Note main etwfe coefficients (family) unaffected, also true emfx applied linear model (.e., default). (optional) ivar argument etwfe() moved argument order list second position fifth (.e., data argument). 
means four required arguments function now occupy top positions, enable shorter, unnamed notation like etwfe(y ~ x, year, cohort, dat).","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"new-features-and-enhancements-0-2-0","dir":"Changelog","previous_headings":"","what":"New features and enhancements","title":"etwfe 0.2.0","text":"emfx now allows (time-invariant) interacted control variables fml RHS. emfx now post_only logical argument, may useful plotting aesthetics (inference). See example introductory vignette. Various improvements documentation (restructuring, fixed typos, etc.)","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-010","dir":"Changelog","previous_headings":"","what":"etwfe 0.1.0","title":"etwfe 0.1.0","text":"CRAN release: 2022-12-14 Initial release.","code":""}] +[{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"background","dir":"Articles","previous_headings":"","what":"Background","title":"Introduction to etwfe","text":"canonical research design social scientists -called “differences--differences” () design.1 classic 2x2 case (two units, two periods), simple interaction effect two dummy variables suffices recover treatment effect. base R might look something like: resulting coefficient Dtreated_unitTRUE:Dpost_treatmentTRUE interaction term represents treatment effect. Rather manually specify interaction term, practice researchers often use equivalent formulation known two-way fixed effects (TWFE). core idea TWFE can subsume interaction term previous code chunk adding unit time fixed effects. single treatment dummy can used capture effect treatment directly. TWFE regression base R might look follows: treatment effect now captured coefficient Dtreat dummy. TWFE shortcut especially nice complicated panel data settings multiple units multiple times periods. 
Speaking , prefer use dedicated fixed effects / panel data package like fixest, also estimate previous regression like : TWFE regressions easy run intuitive, long time everyone happy. good last. cottage industry clever research now demonstrates things quite simple. Among things, standard TWFE formulation can impose strange (negative) weighting conditions key parts estimation procedure. One implication risk high probability estimate bias presence staggered treatment rollouts, common real-life applications. Fortunately, just econometricians taking away one favourite tools, kind enough replace new ones. Among , proposed approach Wooldridge (2021, 2022) noteworthy. idea might paraphrased stating problem TWFE first place. Rather, ’s weren’t enough. Instead including single treatment × time interaction, Wooldridge recommends saturate model possible interactions treatment time variables, including treatment cohorts, well covariates. goes show approach actually draws equivalence different types estimators (pooled OLS, twoway Mundlak regression, etc.) ’s entirely clear call . Wooldridge refers general idea extended TWFE—, ETWFE—rather like package takes name. Wooldridge ETWFE solution intuitive elegant. also rather tedious error prone code manually. correctly specify possible interactions, demean control variables within groups, recover treatment effects interest via appropriate marginal effect aggregation. etwfe package aims simplify process providing convenience functions work .","code":"lm(y ~ Dtreated_unit * Dpost_treatment, data = somedata) lm(y ~ Dtreat + factor(id) + factor(period), data = somedata) library(fixest) feols(y ~ Dtreat | id + period, data = somedata)"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"dataset","dir":"Articles","previous_headings":"","what":"Dataset","title":"Introduction to etwfe","text":"demonstrate core functionality etwfe, ’ll use mpdta dataset US teen employment package (’ll need install separately). 
“Treatment” dataset refers increase minimum wage rate. examples follow, goal estimate effect minimum wage treatment (treat) log teen employment (lemp). Notice panel ID county level (countyreal), treatment staggered across cohorts (first.treat) group counties treated time. addition staggered treatment effects, also observe log population (lpop) potential control variable.","code":"# install.packages(\"did\") data(\"mpdta\", package = \"did\") head(mpdta) #> year countyreal lpop lemp first.treat treat #> 866 2003 8001 5.896761 8.461469 2007 1 #> 841 2004 8001 5.896761 8.336870 2007 1 #> 842 2005 8001 5.896761 8.340217 2007 1 #> 819 2006 8001 5.896761 8.378161 2007 1 #> 827 2007 8001 5.896761 8.487352 2007 1 #> 937 2003 8019 2.232377 4.997212 2007 1"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"basic-usage","dir":"Articles","previous_headings":"","what":"Basic usage","title":"Introduction to etwfe","text":"Let’s load etwfe work basic functionality. ’ll see, core workflow package involves two consecutive function calls: 1) etwfe() 2) emfx().","code":""},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"etwfe","dir":"Articles","previous_headings":"Basic usage","what":"etwfe","title":"Introduction to etwfe","text":"Given package name, won’t surprise learn key estimating function etwfe(). ’s look example dataset. things say etwfe() argument choices function options, ’ll leave details aside bit later. Right now, just know arguments required except vcov (though generally recommend , since probably want cluster standard errors individual unit level). Let’s take look model object. etwfe() done underneath hood construct treatment dummy variable .Dtreat saturated together variables interest set multiway interaction terms.2 may noticed etwfe() call returns standard fixest object, since uses perform underlying estimation. associated methods functions fixest package thus compatible model object. 
example, plot raw regression coefficients fixest::coefplot(), print nice regression table fixest::etable(). However, raw coefficients etwfe() estimation particularly meaningful . Recall complex, multiway interaction terms probably hard interpret . insight leads us next key function…","code":"library(etwfe) mod = etwfe( fml = lemp ~ lpop, # outcome ~ controls tvar = year, # time variable gvar = first.treat, # group variable data = mpdta, # dataset vcov = ~countyreal # vcov adjustment (here: clustered) ) mod #> OLS estimation, Dep. Var.: lemp #> Observations: 2,500 #> Fixed-effects: first.treat: 4, year: 5 #> Varying slopes: lpop (first.treat: 4), lpop (year: 5) #> Standard-errors: Clustered (countyreal) #> Estimate Std. Error t value #> .Dtreat:first.treat::2004:year::2004 -0.021248 0.021728 -0.977890 #> .Dtreat:first.treat::2004:year::2005 -0.081850 0.027375 -2.989963 #> .Dtreat:first.treat::2004:year::2006 -0.137870 0.030795 -4.477097 #> .Dtreat:first.treat::2004:year::2007 -0.109539 0.032322 -3.389024 #> .Dtreat:first.treat::2006:year::2006 0.002537 0.018883 0.134344 #> .Dtreat:first.treat::2006:year::2007 -0.045093 0.021987 -2.050907 #> .Dtreat:first.treat::2007:year::2007 -0.045955 0.017975 -2.556568 #> .Dtreat:first.treat::2004:year::2004:lpop_dm 0.004628 0.017584 0.263184 #> .Dtreat:first.treat::2004:year::2005:lpop_dm 0.025113 0.017904 1.402661 #> .Dtreat:first.treat::2004:year::2006:lpop_dm 0.050735 0.021070 2.407884 #> .Dtreat:first.treat::2004:year::2007:lpop_dm 0.011250 0.026617 0.422648 #> .Dtreat:first.treat::2006:year::2006:lpop_dm 0.038935 0.016472 2.363731 #> .Dtreat:first.treat::2006:year::2007:lpop_dm 0.038060 0.022477 1.693276 #> .Dtreat:first.treat::2007:year::2007:lpop_dm -0.019835 0.016198 -1.224528 #> Pr(>|t|) #> .Dtreat:first.treat::2004:year::2004 3.2860e-01 #> .Dtreat:first.treat::2004:year::2005 2.9279e-03 ** #> .Dtreat:first.treat::2004:year::2006 9.3851e-06 *** #> .Dtreat:first.treat::2004:year::2007 7.5694e-04 *** #> 
.Dtreat:first.treat::2006:year::2006 8.9318e-01 #> .Dtreat:first.treat::2006:year::2007 4.0798e-02 * #> .Dtreat:first.treat::2007:year::2007 1.0866e-02 * #> .Dtreat:first.treat::2004:year::2004:lpop_dm 7.9252e-01 #> .Dtreat:first.treat::2004:year::2005:lpop_dm 1.6134e-01 #> .Dtreat:first.treat::2004:year::2006:lpop_dm 1.6407e-02 * #> .Dtreat:first.treat::2004:year::2007:lpop_dm 6.7273e-01 #> .Dtreat:first.treat::2006:year::2006:lpop_dm 1.8474e-02 * #> .Dtreat:first.treat::2006:year::2007:lpop_dm 9.1027e-02 . #> .Dtreat:first.treat::2007:year::2007:lpop_dm 2.2133e-01 #> ... 10 variables were removed because of collinearity (.Dtreat:first.treat::2006:year::2004, .Dtreat:first.treat::2006:year::2005 and 8 others [full set in $collin.var]) #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> RMSE: 0.537131 Adj. R2: 0.87167 #> Within R2: 8.449e-4"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"emfx","dir":"Articles","previous_headings":"Basic usage","what":"emfx","title":"Introduction to etwfe","text":"raw etwfe coefficients aren’t particularly useful , can instead? Well, probably want aggregate along dimension interest (e.g., groups time, event study). natural way perform aggregations recovering appropriate marginal effects. etwfe package provides another convenience function : emfx(), thin(ish) wrapper around marginaleffects::slopes(). example, can recover average treatment effect treated (ATT) follows. words, model telling us increase minimum wage leads approximate 5 percent decrease teen employment. Beyond simple ATTs, emfx() also supports types aggregations via type argument. example, can use type = \"calendar\" get ATTs period, type = \"group\" get ATTs cohort groups. option probably useful people type = \"event\", recover dynamic treatment effects la event study. Let’s try save resulting object, since plan reuse moment. 
event study suggests teen disemployment effect minimum wage hike fairly modest first (3%), increases next years (>10%). next section, ’ll look ways communicate kind finding audience.","code":"emfx(mod) #> #> Term Contrast .Dtreat Estimate Std. Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) TRUE -0.0506 0.0125 -4.05 <0.001 #> S 2.5 % 97.5 % #> 14.3 -0.0751 -0.0261 #> #> Columns: term, contrast, .Dtreat, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response mod_es = emfx(mod, type = \"event\") mod_es #> #> Term Contrast event Estimate Std. Error z Pr(>|z|) S #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 #> 2.5 % 97.5 % #> -0.0594 -0.00701 #> -0.0910 -0.02373 #> -0.1982 -0.07751 #> -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"presentation","dir":"Articles","previous_headings":"Basic usage","what":"Presentation","title":"Introduction to etwfe","text":"Since emfx() produces standard marginaleffects object, can pass supported methods packages. example, can pass modelsummary get nice regression table event study coefficients. Note use shape coef_rename arguments ; optional help make output look bit nicer. Event study visualization, can pass preferred plotting method. example: Note emfx reports post-treatment effects. pre-treatment effects swept estimation way ETWFE set . fact, pre-treatment effects mechanistically set zero. means ETWFE used interrogating pre-treatment fit (say, visual inspection parallel pre-trends). 
Still, can get zero pre-treatment effects changing post_only argument. emphasize strictly performative—, pre-treatment effects zero estimation design—might make event study plot aesthetically pleasing.","code":"library(modelsummary) # Quick renaming function to replace \".Dtreat\" with something more meaningful rename_fn = function(old_names) { new_names = gsub(\".Dtreat\", \"Years post treatment =\", old_names) setNames(new_names, old_names) } modelsummary( mod_es, shape = term:event:statistic ~ model, coef_rename = rename_fn, gof_omit = \"Adj|Within|IC|RMSE\", title = \"Event study\", notes = \"Std. errors are clustered at the county level\" ) library(ggplot2) theme_set( theme_minimal() + theme(panel.grid.minor = element_blank()) ) ggplot(mod_es, aes(x = event, y = estimate, ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0) + geom_pointrange(col = \"darkcyan\") + labs(x = \"Years post treatment\", y = \"Effect on log teen employment\") # Use post_only = FALSE to get the \"zero\" pre-treatment effects mod_es2 = emfx(mod, type = \"event\", post_only = FALSE) ggplot(mod_es2, aes(x = event, y = estimate, ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0) + geom_vline(xintercept = -1, lty = 2) + geom_pointrange(col = \"darkcyan\") + labs( x = \"Years post treatment\", y = \"Effect on log teen employment\", caption = \"Note: Zero pre-treatment effects for illustrative purposes only.\" ) #> Warning: Removed 4 rows containing missing values (`geom_segment()`)."},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"heterogeneous-treatment-effects","dir":"Articles","previous_headings":"","what":"Heterogeneous treatment effects","title":"Introduction to etwfe","text":"far ’ve limited homogeneous treatment effects, impact treatment (.e., minimum wage hike) averaged across US counties dataset. However, many research problems require us estimate treatment effects separately across groups , potentially, test differences . 
example, might want test whether efficacy new vaccine differs across age groups, whether marketing campaign equally successful across genders. ETWFE framework naturally lends kinds heterogeneous treatment effects. Consider following example, first create logical dummy variable US counties eight Great Lake states (GLS). Now imagine interested estimating separate treatment effects GLS versus non-GLS counties. simply invoking optional xvar argument part etwfe() call.3 subsequent emfx() call object automatically recognize want recover treatment effects two distinct groups. point estimates might tempt us think minimum wage hikes caused less teen disemployment GLS counties rest US average. However, test formally can invoke powerful hypothesis infrastructure underlying marginaleffects package. Probably easiest way using b[]-style positional arguments, “[]” denotes row emfx() return object. Thus, specifying hypothesis = \"b1 = b2\", can test whether ATTs row 1 (non-GLS) row 2 (GLS) different one another. see actually statistical difference average disemployment effect GLS non-GLS counties. One final aside can easily display results heterogeneous treatment effects plot table form. ’s example latter, make use modelsummary(..., shape = ...) argument. Comparing ATT GLS non-GLS counties simple example limited binary comparison group ATTs, note logic carries richer settings. can use exact workflow estimate heterogeneous treatment effects different aggregations (e.g., event studies) across groups many levels.","code":"gls_fips = c(\"IL\" = 17, \"IN\" = 18, \"MI\" = 26, \"MN\" = 27, \"NY\" = 36, \"OH\" = 39, \"PA\" = 42, \"WI\" = 55) mpdta$gls = substr(mpdta$countyreal, 1, 2) %in% gls_fips hmod = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, xvar = gls ## <= het. TEs by gls ) # Heterogeneous ATTs (could also specify `type = \"event\"`, etc.) emfx(hmod) #> #> Term Contrast .Dtreat gls Estimate Std. 
Error z #> .Dtreat mean(TRUE) - mean(FALSE) TRUE FALSE -0.0637 0.0376 -1.69 #> .Dtreat mean(TRUE) - mean(FALSE) TRUE TRUE -0.0472 0.0271 -1.74 #> Pr(>|z|) S 2.5 % 97.5 % #> 0.0906 3.5 -0.137 0.01007 #> 0.0817 3.6 -0.100 0.00594 #> #> Columns: term, contrast, .Dtreat, gls, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response emfx(hmod, hypothesis = \"b1 = b2\") #> #> Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 % #> b1=b2 -0.0164 0.0559 -0.294 0.769 0.4 -0.126 0.093 #> #> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high #> Type: response modelsummary( models = list(\"GLS county\" = emfx(hmod)), shape = term + statistic ~ model + gls, # add xvar variable (here: gls) coef_map = c(\".Dtreat\" = \"ATT\"), gof_map = NA, title = \"Comparing the ATT on GLS and non-GLS counties\" )"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"other-families","dir":"Articles","previous_headings":"","what":"Other families","title":"Introduction to etwfe","text":"Another key feature ETWFE approach—one sets apart advanced implementations extensions—supports nonlinear model (distribution / link) families. Users need simply invoke family argument. ’s brief example, recast earlier event-study Poisson regression.","code":"mpdta$emp = exp(mpdta$lemp) etwfe( emp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, family = \"poisson\" ) |> emfx(\"event\") #> #> Term Contrast event Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) 0 -25.35 15.9 -1.5942 0.111 #> .Dtreat mean(TRUE) - mean(FALSE) 1 1.09 41.8 0.0261 0.979 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -75.12 22.3 -3.3696 <0.001 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -101.82 28.1 -3.6234 <0.001 #> S 2.5 % 97.5 % #> 3.2 -56.5 5.82 #> 0.0 -80.9 83.09 #> 10.4 -118.8 -31.43 #> 11.7 -156.9 -46.75 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"performance-tips","dir":"Articles","previous_headings":"","what":"Performance tips","title":"Introduction to etwfe","text":"Thinking etwfe workflow pair consecutive functional calls, first etwfe() stage tends fast. ’re leveraging incredible performance fixest also taking shortcuts avoid wasting time nuisance parameters. See Regarding fixed effects section details . part, second emfx() stage also tends pretty performant. data less 100k rows, ’s unlikely ’ll wait seconds obtain results. However, emfx’s computation time tend scale non-linearly size original data, well number interactions underlying etwfe model object. Without getting deep weeds, relying numerical delta method (excellent) marginaleffects package underneath hood recover ATTs interest. method requires estimating two prediction models coefficient model computing standard errors. ’s potentially expensive operation can push computation time large datasets (> 1m rows) several minutes longer. Fortunately, two complementary strategies can use speed things . first turn expensive part whole procedure—standard error calculation—calling emfx(..., vcov = FALSE). bring estimation time back seconds less, even datasets excess million rows. course, loss standard errors might acceptable trade-projects statistical inference critical. 
good news first strategy can still combined second strategy: turns collapsing data groups prior estimating marginal effects can yield substantial speed gains . Users can invoking emfx(..., collapse = TRUE) argument. effect dramatic first strategy, collapsing data virtue retaining information standard errors. trade-time, however, collapsing data lead loss accuracy estimated parameters. hand, testing suggests loss accuracy tends relatively minor, results equivalent 1st 2nd significant decimal place (even better). Summarizing, quick plan attack try worried estimation time large datasets models: Estimate mod = etwfe(...) per usual. Run emfx(mod, vcov = FALSE, ...). Run emfx(mod, vcov = FALSE, collapse = TRUE, ...). Compare point estimates steps 1 2. similar enough satisfaction, get approximate standard errors running emfx(mod, collapse = TRUE, ...). ’s bit performance art, since examples vignette complete quickly anyway. reworking earlier event study example demonstrate performance-conscious workflow. put fine point , can can compare original event study collapsed estimates see results indeed similar. Event study","code":"# Step 0 already complete: using the same `mod` object from earlier... 
# Step 1 emfx(mod, type = \"event\", vcov = FALSE) #> #> Term Contrast event Estimate #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 #> #> Columns: term, contrast, event, estimate, predicted_lo, predicted_hi, predicted #> Type: response # Step 2 emfx(mod, type = \"event\", vcov = FALSE, collapse = TRUE) #> #> Term Contrast event Estimate #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0216 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0635 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 #> #> Columns: term, contrast, event, estimate, predicted_lo, predicted_hi, predicted #> Type: response # Step 3: Results from 1 and 2 are similar enough, so get approx. SEs mod_es2 = emfx(mod, type = \"event\", collapse = TRUE) modelsummary( list(\"Original\" = mod_es, \"Collapsed\" = mod_es2), shape = term:event:statistic ~ model, coef_rename = rename_fn, gof_omit = \"Adj|Within|IC|RMSE\", title = \"Event study\", notes = \"Std. errors are clustered at the county level\" )"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"under-the-hood","dir":"Articles","previous_headings":"","what":"Under the hood","title":"Introduction to etwfe","text":"Now ’ve seen etwfe action, let’s circle back package hood. section isn’t necessary use package; feel free skip . review internal details help optimize different scenarios also give better understanding etwfe’s default choices.","code":""},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"manual-implementation","dir":"Articles","previous_headings":"Under the hood","what":"Manual implementation","title":"Introduction to etwfe","text":"keep reiterating, ETWFE approach basically involves saturating regression interaction effects. can easily grab formula estimated model see . point, however, may notice things. 
first formula references several variables aren’t original dataset. obvious one .Dtreat treatment dummy. subtle one lpop_dm, demeaned (.e., group-centered) version lpop control variable. control variables demeaned interacted ETWFE setting. ’s constructed dataset ahead time estimated ETWFE regression manually: can confirm manual approach yields output original etwfe regression. ’ll use modelsummary , since ’ve already loaded .4. transform raw coefficients meaningful ATT counterparts, just need perform appropriate marginal effects operation. example, ’s can get simple ATTs event-study ATTs earlier. emfx() behind scenes.","code":"mod$fml_all #> $linear #> lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003)/lpop_dm #> #> #> $fixef #> ~first.treat + first.treat[[lpop]] + year + year[[lpop]] # First construct the dataset mpdta2 = mpdta |> transform( .Dtreat = year >= first.treat & first.treat != 0, lpop_dm = ave(lpop, first.treat, year, FUN = \\(x) x - mean(x, na.rm = TRUE)) ) # Then estimate the manual version of etwfe mod2 = fixest::feols( lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003) / lpop_dm | first.treat[lpop] + year[lpop], data = mpdta2, vcov = ~countyreal ) modelsummary( list(\"etwfe\" = mod, \"manual\" = mod2), gof_map = NA # drop all goodness-of-fit info for brevity ) library(marginaleffects) # Simple ATT slopes( mod2, newdata = subset(mpdta2, .Dtreat), # we only want rows where .Dtreat is TRUE variables = \".Dtreat\", by = \".Dtreat\" ) #> #> Term Contrast .Dtreat Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) TRUE -0.0506 0.0125 -4.05 <0.001 #> S 2.5 % 97.5 % #> 14.3 -0.0751 -0.0261 #> #> Columns: term, contrast, .Dtreat, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response # Event study slopes( mod2, newdata = transform(subset(mpdta2, .Dtreat), event = year - first.treat), variables = \".Dtreat\", by = \"event\" ) #> #> Term Contrast event Estimate Std. Error z Pr(>|z|) S #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 #> 2.5 % 97.5 % #> -0.0594 -0.00701 #> -0.0910 -0.02373 #> -0.1982 -0.07751 #> -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response"},{"path":"http://grantmcdermott.com/etwfe/articles/etwfe.html","id":"regarding-fixed-effects","dir":"Articles","previous_headings":"Under the hood","what":"Regarding fixed effects","title":"Introduction to etwfe","text":"Let’s switch gears talk fixed effects quickly. regular fixest user, may noticed ’ve invoking varying slopes syntax fixed effect slot (.e., first.treat[lpop] year[lpop]). reason part practical, part philosophical. practical perspective, factor_var[numeric_var] equivalent base R’s factor_var/numeric_var “nesting” syntax much faster high-dimensional factors.5 philosophical perspective, etwfe tries limit amount extraneous information reports users. interaction effects ETWFE framework just acting controls. relegating fixed effects slot, can avoid polluting user’s console host extra coefficients. Nonetheless, can control behaviour fe (“fixed effects”) argument. Consider following options manual equivalents. 
’ll leave pass models emfx confirm give correct aggregated treatment effects. can quickly demonstrate regression table return raw coefficients. final point note fixed effects etwfe defaults using group-level (.e., cohort-level) fixed effects like first.treat, rather unit-level fixed effects like countyreal. design decision reflects neat ancillary result Wooldridge (2021), proves equivalence two types fixed effects linear cases. Group-level effects virtue faster estimate, since fewer factor levels. Moreover, required nonlinear model families like Poisson per underlying ETWFE theory. Still, can specify unit-level fixed effects linear case ivar argument. , can easily confirm yields estimated treatment effects group-level default (although standard errors slightly different).","code":"# fe = \"feo\" (fixed effects only) mod_feo = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, fe = \"feo\" ) # ... which is equivalent to the manual regression mod_feo2 = fixest::feols( lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003) / lpop_dm + lpop + i(first.treat, lpop, ref = 0) + i(year, lpop, ref = 2003) | first.treat + year, data = mpdta2, vcov = ~countyreal ) # fe = \"none\" mod_none = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, fe = \"none\" ) # ... 
which is equivalent to the manual regression mod_none2 = fixest::feols( lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003) / lpop_dm + lpop + i(first.treat, lpop, ref = 0) + i(year, lpop, ref = 2003) + i(first.treat, ref = 0) + i(year, ref = 2003), data = mpdta2, vcov = ~countyreal ) mods = list( \"etwfe\" = mod, \"manual\" = mod2, \"etwfe (feo)\" = mod_feo, \"manual (feo)\" = mod_feo2, \"etwfe (none)\" = mod_none, \"manual (none)\" = mod_none2 ) modelsummary(mods, gof_map = NA) mod_es_i = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, ivar = countyreal # NEW: Use unit-level (county) FEs ) |> emfx(\"event\") modelsummary( list(\"Group-level FEs (default)\" = mod_es, \"Unit-level FEs\" = mod_es_i), shape = term:event:statistic ~ model, coef_rename = rename_fn, gof_omit = \"Adj|Within|IC|RMSE\" )"},{"path":"http://grantmcdermott.com/etwfe/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Grant McDermott. Author, maintainer. Frederic Kluser. Contributor.","code":""},{"path":"http://grantmcdermott.com/etwfe/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"McDermott G (2023). etwfe: Extended Two-Way Fixed Effects. R package version 0.3.5, https://grantmcdermott.com/etwfe/.","code":"@Manual{, title = {etwfe: Extended Two-Way Fixed Effects}, author = {Grant McDermott}, year = {2023}, note = {R package version 0.3.5}, url = {https://grantmcdermott.com/etwfe/}, }"},{"path":"http://grantmcdermott.com/etwfe/index.html","id":"extended-two-way-fixed-effects-etwfe","dir":"","previous_headings":"","what":"Extended Two-Way Fixed Effects","title":"Extended Two-Way Fixed Effects","text":"goal etwfe estimate extended two-way fixed effects la Wooldridge (2021, 2022). Briefly, Wooldridge proposes set saturated interaction effects overcome potential bias problems vanilla TWFE difference--differences designs. 
Wooldridge solution intuitive elegant, rather tedious error prone code manually. etwfe package aims simplify process providing convenience functions work . Documentation available package homepage.","code":""},{"path":"http://grantmcdermott.com/etwfe/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Extended Two-Way Fixed Effects","text":"can install etwfe CRAN. , can grab development version R-universe.","code":"install.packages(\"etwfe\") install.packages(\"etwfe\", repos = \"https://grantmcdermott.r-universe.dev\")"},{"path":"http://grantmcdermott.com/etwfe/index.html","id":"quickstart-example","dir":"","previous_headings":"","what":"Quickstart example","title":"Extended Two-Way Fixed Effects","text":"detailed walkthrough etwfe provided introductory vignette (available online, typing vignette(\"etwfe\") R console). ’s quickstart example demonstrate basic syntax.","code":"library(etwfe) # install.packages(\"did\") data(\"mpdta\", package = \"did\") head(mpdta) #> year countyreal lpop lemp first.treat treat #> 866 2003 8001 5.896761 8.461469 2007 1 #> 841 2004 8001 5.896761 8.336870 2007 1 #> 842 2005 8001 5.896761 8.340217 2007 1 #> 819 2006 8001 5.896761 8.378161 2007 1 #> 827 2007 8001 5.896761 8.487352 2007 1 #> 937 2003 8019 2.232377 4.997212 2007 1 # Estimate the model mod = etwfe( fml = lemp ~ lpop, # outcome ~ controls tvar = year, # time variable gvar = first.treat, # group variable data = mpdta, # dataset vcov = ~countyreal # vcov adjustment (here: clustered) ) # This gives us a regression model with fully saturated interactions mod #> OLS estimation, Dep. Var.: lemp #> Observations: 2,500 #> Fixed-effects: first.treat: 4, year: 5 #> Varying slopes: lpop (first.treat: 4), lpop (year: 5) #> Standard-errors: Clustered (countyreal) #> Estimate Std. 
Error t value Pr(>|t|) #> .Dtreat:first.treat::2004:year::2004 -0.021248 0.021728 -0.977890 3.2860e-01 #> .Dtreat:first.treat::2004:year::2005 -0.081850 0.027375 -2.989963 2.9279e-03 ** #> .Dtreat:first.treat::2004:year::2006 -0.137870 0.030795 -4.477097 9.3851e-06 *** #> .Dtreat:first.treat::2004:year::2007 -0.109539 0.032322 -3.389024 7.5694e-04 *** #> .Dtreat:first.treat::2006:year::2006 0.002537 0.018883 0.134344 8.9318e-01 #> .Dtreat:first.treat::2006:year::2007 -0.045093 0.021987 -2.050907 4.0798e-02 * #> .Dtreat:first.treat::2007:year::2007 -0.045955 0.017975 -2.556568 1.0866e-02 * #> .Dtreat:first.treat::2004:year::2004:lpop_dm 0.004628 0.017584 0.263184 7.9252e-01 #> .Dtreat:first.treat::2004:year::2005:lpop_dm 0.025113 0.017904 1.402661 1.6134e-01 #> .Dtreat:first.treat::2004:year::2006:lpop_dm 0.050735 0.021070 2.407884 1.6407e-02 * #> .Dtreat:first.treat::2004:year::2007:lpop_dm 0.011250 0.026617 0.422648 6.7273e-01 #> .Dtreat:first.treat::2006:year::2006:lpop_dm 0.038935 0.016472 2.363731 1.8474e-02 * #> .Dtreat:first.treat::2006:year::2007:lpop_dm 0.038060 0.022477 1.693276 9.1027e-02 . #> .Dtreat:first.treat::2007:year::2007:lpop_dm -0.019835 0.016198 -1.224528 2.2133e-01 #> ... 10 variables were removed because of collinearity (.Dtreat:first.treat::2006:year::2004, .Dtreat:first.treat::2006:year::2005 and 8 others [full set in $collin.var]) #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> RMSE: 0.537131 Adj. R2: 0.87167 #> Within R2: 8.449e-4 # Pass to emfx() to recover the ATTs of interest. Here's an event-study example. emfx(mod, type = \"event\") #> #> Term Contrast event Estimate Std. 
Error z Pr(>|z|) S 2.5 % 97.5 % #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 -0.0594 -0.00701 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 -0.0910 -0.02373 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 -0.1982 -0.07751 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response"},{"path":"http://grantmcdermott.com/etwfe/index.html","id":"acknowledgements","dir":"","previous_headings":"","what":"Acknowledgements","title":"Extended Two-Way Fixed Effects","text":"Jeffrey Wooldridge underlying ETWFE theory. Laurent Bergé (fixest) Vincent Arel-Bundock (marginaleffects) maintaining two wonderful R packages heavy lifting hood . Fernando Rios-Avila JWDID Stata module, provided welcome foil unit testing whose elegant design helped inform choices R equivalent.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":null,"dir":"Reference","previous_headings":"","what":"Post-estimation treatment effects for an ETWFE regressions. — emfx","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"Post-estimation treatment effects ETWFE regressions.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"","code":"emfx( object, type = c(\"simple\", \"group\", \"calendar\", \"event\"), by_xvar = \"auto\", collapse = \"auto\", post_only = TRUE, ... )"},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Post-estimation treatment effects for an ETWFE regressions. 
— emfx","text":"object `etwfe` model object. type Character. desired type post-estimation aggregation. by_xvar Logical. results account heterogeneous treatment effects? relevant preceding `etwfe` call included specified `xvar` argument, .e. interacted categorical covariate. default behaviour (\"auto\") automatically estimate heterogeneous treatment effects level `xvar` detected part underlying `etwfe` model object. Users can override setting either FALSE TRUE. See section Heterogeneous treatment effects . collapse Logical. Collapse data (period cohort) groups calculating marginal effects? trades loss estimate accuracy (typically around 1st 2nd significant decimal point) substantial improvement estimation time large datasets. default behaviour (\"auto\") automatically collapse original dataset 500,000 rows. Users can override setting either FALSE TRUE. Note collapsing group valid preceding `etwfe` call run \"ivar = NULL\" (default). See section Performance tips . post_only Logical. keep post-treatment effects. pre-treatment effects zero mechanical result ETWFE's estimation setup, default drop nuisance rows dataset. may want keep presentation reasons (e.g., plotting event-study); though warned strictly performative. argument evaluated `type = \"event\"`. ... Additional arguments passed [`marginaleffects::marginaleffects`]. example, can pass `vcov = FALSE` dramatically speed estimation times main marginal effects (cost getting information standard errors; see Performance tips ). Another potentially useful application testing whether heterogeneous treatment effects (.e. levels `xvar` covariate) equal invoking `hypothesis` argument, e.g. `hypothesis = \"b1 = b2\"`.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Post-estimation treatment effects for an ETWFE regressions. 
— emfx","text":"`slopes` object `marginaleffects` package.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"performance-tips","dir":"Reference","previous_headings":"","what":"Performance tips","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"situations, `etwfe` complete quickly. part, `emfx` quite performant take seconds less datasets 100k rows. However, `emfx`'s computation time tend scale non-linearly size original data, well number interactions underlying `etwfe` model. Without getting deep weeds, numerical delta method used recover ATEs interest estimate two prediction models ** coefficient model compute standard errors. , potentially expensive operation can push computation time large datasets (> 1m rows) several minutes longer. Fortunately, two complementary strategies can use speed things . first turn expensive part whole procedure---standard error calculation---calling `emfx(..., vcov = FALSE)`. bring estimation time back seconds less, even datasets excess million rows. loss standard errors might acceptable trade-projects statistical inference critical, good news first strategy can still combined second strategy. turns collapsing data groups prior estimating marginal effects can yield substantial speed gains . Users can invoking `emfx(..., collapse = TRUE)` argument. effect dramatic first strategy, second strategy virtue retaining information standard errors. trade- time, however, collapsing data lead loss accuracy estimated parameters. hand, testing suggests loss accuracy tends relatively minor, results equivalent 1st 2nd significant decimal place (even better). Summarizing, quick plan attack try worried estimation time large datasets models: 0. Estimate `mod = etwfe(...)` per usual. 1. Run `emfx(mod, vcov = FALSE, ...)`. 2. Run `emfx(mod, vcov = FALSE, collapse = TRUE, ...)`. 3. Compare point estimates steps 1 2. 
similar enough satisfaction, get approximate standard errors running `emfx(mod, collapse = TRUE, ...)`.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"heterogeneous-treatment-effects","dir":"Reference","previous_headings":"","what":"Heterogeneous treatment effects","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"Specifying `etwfe(..., xvar = )` generate interaction effects levels `` part main regression model. reason useful (opposed regular, non-interacted covariate formula RHS) allows us estimate heterogeneous treatment effects part larger ETWFE framework. Specifically, can recover heterogeneous treatment effects level `` passing resulting `etwfe` model object `emfx()`. example, imagine categorical variable called \"age\" dataset, two distinct levels \"adult\" \"child\". Running `emfx(etwfe(..., xvar = age))` tell us efficacy treatment varies across adults children. can also leverage -built hypothesis testing infrastructure `marginaleffects` test whether treatment effect statistically different across two age groups; see Examples . Note principles carry categorical variables multiple levels, even continuous variables (although continuous variables well supported yet).","code":""},{"path":[]},{"path":"http://grantmcdermott.com/etwfe/reference/emfx.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Post-estimation treatment effects for an ETWFE regressions. — emfx","text":"","code":"# \\dontrun{ # We’ll use the mpdta dataset from the did package (which you’ll need to # install separately). # install.packages(\"did\") data(\"mpdta\", package = \"did\") # # Basic example # # The basic ETWFE workflow involves two steps: # 1) Estimate the main regression model with etwfe(). 
mod = etwfe( fml = lemp ~ lpop, # outcome ~ controls (use 0 or 1 if none) tvar = year, # time variable gvar = first.treat, # group variable data = mpdta, # dataset vcov = ~countyreal # vcov adjustment (here: clustered by county) ) # mod ## A fixest model object with fully saturated interaction effects. # 2) Recover the treatment effects of interest with emfx(). emfx(mod, type = \"event\") # dynamic ATE a la an event study #> #> Term Contrast event Estimate Std. Error z Pr(>|z|) S #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 #> 2.5 % 97.5 % #> -0.0594 -0.00701 #> -0.0910 -0.02373 #> -0.1982 -0.07751 #> -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # Etc. Other aggregation type options are \"simple\" (the default), \"group\" # and \"calendar\" # # Heterogeneous treatment effects # # Example where we estimate heterogeneous treatment effects for counties # within the 8 US Great Lake states (versus all other counties). gls = c(\"IL\" = 17, \"IN\" = 18, \"MI\" = 26, \"MN\" = 27, \"NY\" = 36, \"OH\" = 39, \"PA\" = 42, \"WI\" = 55) mpdta$gls = substr(mpdta$countyreal, 1, 2) %in% gls hmod = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, xvar = gls ## <= het. TEs by gls ) # Heterogeneous ATEs (could also specify \"event\", etc.) emfx(hmod) #> #> Term Contrast .Dtreat gls Estimate Std. 
Error z #> .Dtreat mean(TRUE) - mean(FALSE) TRUE FALSE -0.0637 0.0376 -1.69 #> .Dtreat mean(TRUE) - mean(FALSE) TRUE TRUE -0.0472 0.0271 -1.74 #> Pr(>|z|) S 2.5 % 97.5 % #> 0.0906 3.5 -0.137 0.01007 #> 0.0817 3.6 -0.100 0.00594 #> #> Columns: term, contrast, .Dtreat, gls, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # To test whether the ATEs across these two groups (non-GLS vs GLS) are # statistically different, simply pass an appropriate \"hypothesis\" argument. emfx(hmod, hypothesis = \"b1 = b2\") #> #> Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 % #> b1=b2 -0.0164 0.0559 -0.294 0.769 0.4 -0.126 0.093 #> #> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high #> Type: response #> # # Nonlinear model (distribution / link) families # # Poisson example mpdta$emp = exp(mpdta$lemp) etwfe( emp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, family = \"poisson\" ## <= family arg for nonlinear options ) |> emfx(\"event\") #> The variables '.Dtreat:first.treat::2006:year::2004', '.Dtreat:first.treat::2006:year::2005' and eight others have been removed because of collinearity (see $collin.var). #> #> Term Contrast event Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) 0 -25.35 15.9 -1.5942 0.111 #> .Dtreat mean(TRUE) - mean(FALSE) 1 1.09 41.8 0.0261 0.979 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -75.12 22.3 -3.3696 <0.001 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -101.82 28.1 -3.6234 <0.001 #> S 2.5 % 97.5 % #> 3.2 -56.5 5.82 #> 0.0 -80.9 83.09 #> 10.4 -118.8 -31.43 #> 11.7 -156.9 -46.75 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # }"},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":null,"dir":"Reference","previous_headings":"","what":"Extended two-way fixed effects — etwfe","title":"Extended two-way fixed effects — etwfe","text":"Extended two-way fixed effects","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extended two-way fixed effects — etwfe","text":"","code":"etwfe( fml = NULL, tvar = NULL, gvar = NULL, data = NULL, ivar = NULL, xvar = NULL, tref = NULL, gref = NULL, cgroup = c(\"notyet\", \"never\"), fe = c(\"vs\", \"feo\", \"none\"), family = NULL, ... )"},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extended two-way fixed effects — etwfe","text":"fml two-side formula representing outcome (lhs) control variables (rhs), e.g. `y ~ x1 + x2`. controls required, rhs must take value 0 1, e.g. `y ~ 0`. tvar Time variable. Can string (e.g., \"year\") expression (e.g., year). gvar Group variable. Can either string (e.g., \"first_treated\") expression (e.g., first_treated). staggered treatment setting, group variable typically denotes treatment cohort. data data frame want run ETWFE . ivar Optional index variable. Can string (e.g., \"country\") expression (e.g., country). 
Leaving NULL (default) result group-level fixed effects used, efficient necessary nonlinear models (see `family` argument ). However, may still want cluster standard errors index variable `vcov` argument. See Examples . xvar Optional interacted categorical covariate estimating heterogeneous treatment effects. Enables recovery marginal treatment effect distinct levels `xvar`, e.g. \"child\", \"teenager\", \"adult\". Note \"x\" prefix \"xvar\" represents covariate *interacted* treatment, opposed regular control variable. tref Optional reference value `tvar`. Defaults minimum value (.e., first time period observed dataset). gref Optional reference value `gvar`. need provide `gvar` variable well specified. providing explicit reference value can useful/necessary desired control group takes unusual value. cgroup control group wish use estimating treatment effects. Either \"notyet\" treated (default) \"never\" treated. fe level fixed effects used? Defaults \"vs\" (varying slopes), efficient terms estimation terseness return model object. two options, \"feo\" (fixed effects ) \"none\" (fixed effects whatsoever), trade efficiency additional information (nuisance) model parameters. Note primary treatment parameters interest remain unchanged regardless choice. family [`family`] use estimation. Defaults NULL, case [`fixest::feols`] used. Otherwise passed [`fixest::feglm`], valid entries include \"logit\", \"poisson\", \"negbin\". Note non-NULL family entry detected, `ivar` automatically set NULL. ... Additional arguments passed [`fixest::feols`] ([`fixest::feglm`]). 
common example `vcov` argument.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extended two-way fixed effects — etwfe","text":"fixest object fully saturated interaction effects.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"heterogeneous-treatment-effects","dir":"Reference","previous_headings":"","what":"Heterogeneous treatment effects","title":"Extended two-way fixed effects — etwfe","text":"Specifying `etwfe(..., xvar = )` generate interaction effects levels `` part main regression model. reason useful (opposed regular, non-interacted covariate formula RHS) allows us estimate heterogeneous treatment effects part larger ETWFE framework. Specifically, can recover heterogeneous treatment effects level `` passing resulting `etwfe` model object `emfx()`. example, imagine categorical variable called \"age\" dataset, two distinct levels \"adult\" \"child\". Running `emfx(etwfe(..., xvar = age))` tell us efficacy treatment varies across adults children. can also leverage -built hypothesis testing infrastructure `marginaleffects` test whether treatment effect statistically different across two age groups; see Examples . Note principles carry categorical variables multiple levels, even continuous variables (although continuous variables well supported yet).","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"performance-tips","dir":"Reference","previous_headings":"","what":"Performance tips","title":"Extended two-way fixed effects — etwfe","text":"situations, `etwfe` complete quickly. part, `emfx` quite performant take seconds less datasets 100k rows. However, `emfx`'s computation time tend scale non-linearly size original data, well number interactions underlying `etwfe` model. 
Without getting deep weeds, numerical delta method used recover ATEs interest estimate two prediction models ** coefficient model compute standard errors. , potentially expensive operation can push computation time large datasets (> 1m rows) several minutes longer. Fortunately, two complementary strategies can use speed things . first turn expensive part whole procedure---standard error calculation---calling `emfx(..., vcov = FALSE)`. bring estimation time back seconds less, even datasets excess million rows. loss standard errors might acceptable trade-projects statistical inference critical, good news first strategy can still combined second strategy. turns collapsing data groups prior estimating marginal effects can yield substantial speed gains . Users can invoking `emfx(..., collapse = TRUE)` argument. effect dramatic first strategy, second strategy virtue retaining information standard errors. trade- time, however, collapsing data lead loss accuracy estimated parameters. hand, testing suggests loss accuracy tends relatively minor, results equivalent 1st 2nd significant decimal place (even better). Summarizing, quick plan attack try worried estimation time large datasets models: 0. Estimate `mod = etwfe(...)` per usual. 1. Run `emfx(mod, vcov = FALSE, ...)`. 2. Run `emfx(mod, vcov = FALSE, collapse = TRUE, ...)`. 3. Compare point estimates steps 1 2. similar enough satisfaction, get approximate standard errors running `emfx(mod, collapse = TRUE, ...)`.","code":""},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Extended two-way fixed effects — etwfe","text":"Wooldridge, Jeffrey M. (2021). Two-Way Fixed Effects, Two-Way Mundlak Regression, Difference--Differences Estimators. Working paper (version: August 16, 2021). Available: http://dx.doi.org/10.2139/ssrn.3906345 Wooldridge, Jeffrey M. (2022). Simple Approaches Nonlinear Difference--Differences Panel Data. 
Econometrics Journal (forthcoming). Available: http://dx.doi.org/10.2139/ssrn.4183726","code":""},{"path":[]},{"path":"http://grantmcdermott.com/etwfe/reference/etwfe.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extended two-way fixed effects — etwfe","text":"","code":"# \\dontrun{ # We’ll use the mpdta dataset from the did package (which you’ll need to # install separately). # install.packages(\"did\") data(\"mpdta\", package = \"did\") # # Basic example # # The basic ETWFE workflow involves two steps: # 1) Estimate the main regression model with etwfe(). mod = etwfe( fml = lemp ~ lpop, # outcome ~ controls (use 0 or 1 if none) tvar = year, # time variable gvar = first.treat, # group variable data = mpdta, # dataset vcov = ~countyreal # vcov adjustment (here: clustered by county) ) # mod ## A fixest model object with fully saturated interaction effects. # 2) Recover the treatment effects of interest with emfx(). emfx(mod, type = \"event\") # dynamic ATE a la an event study #> #> Term Contrast event Estimate Std. Error z Pr(>|z|) S #> .Dtreat mean(TRUE) - mean(FALSE) 0 -0.0332 0.0134 -2.48 0.013 6.3 #> .Dtreat mean(TRUE) - mean(FALSE) 1 -0.0573 0.0172 -3.34 <0.001 10.2 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -0.1379 0.0308 -4.48 <0.001 17.0 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -0.1095 0.0323 -3.39 <0.001 10.5 #> 2.5 % 97.5 % #> -0.0594 -0.00701 #> -0.0910 -0.02373 #> -0.1982 -0.07751 #> -0.1729 -0.04619 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # Etc. Other aggregation type options are \"simple\" (the default), \"group\" # and \"calendar\" # # Heterogeneous treatment effects # # Example where we estimate heterogeneous treatment effects for counties # within the 8 US Great Lake states (versus all other counties). 
gls = c(\"IL\" = 17, \"IN\" = 18, \"MI\" = 26, \"MN\" = 27, \"NY\" = 36, \"OH\" = 39, \"PA\" = 42, \"WI\" = 55) mpdta$gls = substr(mpdta$countyreal, 1, 2) %in% gls hmod = etwfe( lemp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, xvar = gls ## <= het. TEs by gls ) # Heterogeneous ATEs (could also specify \"event\", etc.) emfx(hmod) #> #> Term Contrast .Dtreat gls Estimate Std. Error z #> .Dtreat mean(TRUE) - mean(FALSE) TRUE FALSE -0.0637 0.0376 -1.69 #> .Dtreat mean(TRUE) - mean(FALSE) TRUE TRUE -0.0472 0.0271 -1.74 #> Pr(>|z|) S 2.5 % 97.5 % #> 0.0906 3.5 -0.137 0.01007 #> 0.0817 3.6 -0.100 0.00594 #> #> Columns: term, contrast, .Dtreat, gls, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # To test whether the ATEs across these two groups (non-GLS vs GLS) are # statistically different, simply pass an appropriate \"hypothesis\" argument. emfx(hmod, hypothesis = \"b1 = b2\") #> #> Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 % #> b1=b2 -0.0164 0.0559 -0.294 0.769 0.4 -0.126 0.093 #> #> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high #> Type: response #> # # Nonlinear model (distribution / link) families # # Poisson example mpdta$emp = exp(mpdta$lemp) etwfe( emp ~ lpop, tvar = year, gvar = first.treat, data = mpdta, vcov = ~countyreal, family = \"poisson\" ## <= family arg for nonlinear options ) |> emfx(\"event\") #> The variables '.Dtreat:first.treat::2006:year::2004', '.Dtreat:first.treat::2006:year::2005' and eight others have been removed because of collinearity (see $collin.var). #> #> Term Contrast event Estimate Std. 
Error z Pr(>|z|) #> .Dtreat mean(TRUE) - mean(FALSE) 0 -25.35 15.9 -1.5942 0.111 #> .Dtreat mean(TRUE) - mean(FALSE) 1 1.09 41.8 0.0261 0.979 #> .Dtreat mean(TRUE) - mean(FALSE) 2 -75.12 22.3 -3.3696 <0.001 #> .Dtreat mean(TRUE) - mean(FALSE) 3 -101.82 28.1 -3.6234 <0.001 #> S 2.5 % 97.5 % #> 3.2 -56.5 5.82 #> 0.0 -80.9 83.09 #> 10.4 -118.8 -31.43 #> 11.7 -156.9 -46.75 #> #> Columns: term, contrast, event, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted #> Type: response #> # }"},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-035","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.5","title":"etwfe 0.3.5","text":"CRAN release: 2023-12-01","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"internal-0-3-5","dir":"Changelog","previous_headings":"","what":"Internal","title":"etwfe 0.3.5","text":"Update tests match upstream changes fixest. Update maintainer email address.","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-034","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.4","title":"etwfe 0.3.4","text":"CRAN release: 2023-06-19","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"internal-0-3-4","dir":"Changelog","previous_headings":"","what":"Internal","title":"etwfe 0.3.4","text":"Update tests match upstream changes marginaleffects (#36 @vincentarelbundock).","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-033","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.3","title":"etwfe 0.3.3","text":"CRAN release: 2023-05-27","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"internal-0-3-3","dir":"Changelog","previous_headings":"","what":"Internal","title":"etwfe 0.3.3","text":"Rejigged internal tests following upstream changes marginaleffects (#35 
@vincentarelbundock).","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-032","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.2","title":"etwfe 0.3.2","text":"CRAN release: 2023-05-02","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"bug-fixes-0-3-2","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"etwfe 0.3.2","text":"Fixed internal centering procedure handling multiple covariate levels (#30, #31). fixes impact main ATT estimates (.e., typical use package). may lead differences heterogeneous ATTs—.e., via xvar arg—incorrectly estimated cases. Thanks @PhilipCarthy flagging @frederickluser helpful discussions Fixed internal upstream bug causing model offsets error (#28, thanks @mariofiorini initial report several others helpful discussion). Note fix requires insight >= 0.19.1.8, development version time writing. information available : https://github.com/easystats/insight/pull/759","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-031","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.1","title":"etwfe 0.3.1","text":"CRAN release: 2023-02-28","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"internal-0-3-1","dir":"Changelog","previous_headings":"","what":"Internal","title":"etwfe 0.3.1","text":"Minor updates internal code unit tests match forthcoming updates marginaleffects 0.10.0. latter update also brings notable performance improvements emfx().","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"other-0-3-1","dir":"Changelog","previous_headings":"","what":"Other","title":"etwfe 0.3.1","text":"documentation improvements. Examples wrapped \\dontrun avoid triggering CRAN NOTEs Windows exceeding 5 seconds execution time. 
Note package homepage still runs Examples users want inspect output online.","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-030","dir":"Changelog","previous_headings":"","what":"etwfe 0.3.0","title":"etwfe 0.3.0","text":"CRAN release: 2023-02-08","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"new-features-and-enhancements-0-3-0","dir":"Changelog","previous_headings":"","what":"New features and enhancements","title":"etwfe 0.3.0","text":"Support estimating heterogeneous treatment effects via new etwfe(..., xvar = argument (#16, thanks @frederickluser). Automatically extends emfx() via latter’s by_xvar argument (#21). details provided dedicated “Heterogeneous treatment effects” section vignette help documentation new emfx(..., collapse = TRUE) argument can substantially reduce estimation times large datasets (#19, thanks @frederickluser). performance boost trade loss estimate accuracy. testing suggests difference relatively minor typical use cases (.e., results equivalent 1st 2nd significant decimal place, sometimes even better). Please let us know find edge cases true. details available dedicated “Performance tips” section vignette help documentation, including advice combining collapsing emfx(..., vcov = FALSE) (yields even dramatic speed boost cost reporting standard errors). Users can now use 1 fml RHS indicate control variables part etwfe call, e.g. etwfe(y ~ 1, ...). provides second way indicating controls, alongside existing 0 option, e.g. etwfe(y ~ 0, ...)","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"bug-fixes-0-3-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"etwfe 0.3.0","text":"Internal code tests updated account upstream breaking changes marginaleffects 0.9.0 (#20, thanks @vincentarelbundock). user side, notable changes longer call summary() emfx objects pretty printing, (former) “dydx” column resulting object now named “estimate”. 
changes reflected updated documentation.","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"other-0-3-0","dir":"Changelog","previous_headings":"","what":"Other","title":"etwfe 0.3.0","text":"Various documentation improvements. example, aforementioned sections Heterogeneous TEs Performance tips. also removed warnings use time-varying controls (#17). truth, can’t quite recall included warnings first place testing confirms appear pose problem ETWFE framework. Thanks Felix Pretis prompting revisit implicit restriction, including forwarding relevant correspondence Prof. Wooldridge. data.table added Imports thus becomes direct dependency. already indirect dependency marginaleffects. ’s now possible install development version package R-universe. Details provided README.","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-020","dir":"Changelog","previous_headings":"","what":"etwfe 0.2.0","title":"etwfe 0.2.0","text":"CRAN release: 2023-01-11","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"bug-fixes-and-breaking-changes-0-2-0","dir":"Changelog","previous_headings":"","what":"Bug fixes and breaking changes","title":"etwfe 0.2.0","text":".Dtreat indicator variable created etwfe call now logical instead integer (#14). fix yields slightly different effect sizes emfx output applied non-linear model families (e.g., etwfe(..., family = \"poisson\"). reason now implicitly calling marginaleffects::comparisons hood rather marginaleffects::marginaleffects. Note main etwfe coefficients (family) unaffected, also true emfx applied linear model (.e., default). (optional) ivar argument etwfe() moved argument order list second position fifth (.e., data argument). 
means four required arguments function now occupy top positions, enable shorter, unnamed notation like etwfe(y ~ x, year, cohort, dat).","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"new-features-and-enhancements-0-2-0","dir":"Changelog","previous_headings":"","what":"New features and enhancements","title":"etwfe 0.2.0","text":"emfx now allows (time-invariant) interacted control variables fml RHS. emfx now post_only logical argument, may useful plotting aesthetics (inference). See example introductory vignette. Various improvements documentation (restructuring, fixed typos, etc.)","code":""},{"path":"http://grantmcdermott.com/etwfe/news/index.html","id":"etwfe-010","dir":"Changelog","previous_headings":"","what":"etwfe 0.1.0","title":"etwfe 0.1.0","text":"CRAN release: 2022-12-14 Initial release.","code":""}]