---
title: "Using ivcheck with fixest"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using ivcheck with fixest}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "figures/with-fixest-"
)
set.seed(1)
```

## Why this vignette matters

`fixest` is the dominant R package for applied IV estimation. This vignette shows the drop-in integration: fit an IV model with `feols()`, pass it to `iv_check()`, and get every applicable IV-validity test in one call.

If you are already using `fixest` for your paper, nothing about your workflow changes. Add one line and your IV estimate now comes with a published falsification test.

## Setup

```{r, message = FALSE}
library(ivcheck)
library(fixest)
```

## The Card (1995) proximity-to-college IV

Card's (1995) classic IV for the return to schooling uses proximity to a four-year college as an instrument for completed schooling. The bundled `card1995` dataset is a cleaned extract from the National Longitudinal Survey of Young Men.

```{r}
data(card1995)
head(card1995[, c("lwage", "educ", "college", "near_college", "age", "black", "south")])
```

Two variants of the schooling variable are included: the continuous `educ` (years of schooling) and a binary `college` indicator (`educ >= 16`) for use with tests that require a binary treatment.

## Fit the IV regression

```{r}
m <- feols(
  lwage ~ age + black + south | college ~ near_college,
  data = card1995
)
summary(m)
```

The endogenous variable `college` is instrumented by `near_college`. The first-stage F statistic is strong, and the IV estimate of the return to college is in the neighbourhood of existing applied estimates.

## Run every applicable IV-validity test

```{r}
chk <- iv_check(m, n_boot = 500, parallel = FALSE)
print(chk)
```

`iv_check()` inspects the model, detects that `college` is binary and `near_college` is a discrete instrument, and runs Kitagawa (2015) and Mourifié-Wan (2017).
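To see what these tests are probing, here is a minimal, self-contained sketch of the testable implication behind Kitagawa (2015), run on simulated data. This is illustrative only, not `ivcheck`'s implementation; all variable names and the simulation design are made up for the example:

```r
# Kitagawa (2015) testable implication: with a valid binary instrument z
# that (weakly) encourages treatment, the sub-CDF of (Y, D = 1) under
# z = 1 must dominate the one under z = 0, and symmetrically for D = 0.
# A positive-part Kolmogorov-Smirnov statistic measures violations.
set.seed(42)
n <- 2000
z <- rbinom(n, 1, 0.5)
d <- rbinom(n, 1, 0.3 + 0.4 * z)   # first stage: z shifts d up
y <- 1 + 0.5 * d + rnorm(n)        # outcome with a treatment effect

# Empirical sub-CDFs P(Y <= t, D = dd | Z = zz) on a grid of t values
grid <- seq(min(y), max(y), length.out = 200)
sub_cdf <- function(dd, zz) {
  sapply(grid, function(t) mean(y[z == zz] <= t & d[z == zz] == dd))
}

# Positive-part violations of the two nesting inequalities
viol_treated   <- max(sub_cdf(1, 0) - sub_cdf(1, 1), 0)
viol_untreated <- max(sub_cdf(0, 1) - sub_cdf(0, 0), 0)
stat <- sqrt(n) * max(viol_treated, viol_untreated)
stat  # small here (sampling noise only): the instrument is valid by design
```

The real test adds a bootstrap to turn this statistic into a p-value; the sketch only shows the inequality being checked.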
Neither test rejects, so the data are consistent with instrument validity. This matches the applied literature's treatment of Card's design.

## Dispatching directly on the model

If you want to run a single test rather than the full suite, each test function also dispatches on `fixest` objects:

```{r}
iv_kitagawa(m, n_boot = 300, parallel = FALSE)
```

The function extracts `y`, `d`, and `z` from the fitted model (including the first stage) and runs the test. You never touch the raw vectors.

## Inspecting the bootstrap distribution

```{r, fig.width = 6, fig.height = 4}
k <- iv_kitagawa(m, n_boot = 500, parallel = FALSE)
hist(k$boot_stats, breaks = 40,
     main = "Kitagawa bootstrap distribution (Card 1995)",
     xlab = "sqrt(n) * positive-part KS")
abline(v = k$statistic, col = "red", lwd = 2)
```

The observed statistic (red line) sits well inside the bootstrap distribution, consistent with a non-rejection.

## Combining with `modelsummary`

If you have `modelsummary` installed, `iv_check` results are picked up automatically through a `broom::glance` method registered on package load. You can also put a validity p-value directly in a regression table note via `modelsummary`'s `notes` argument:

```{r, eval = FALSE}
library(modelsummary)
modelsummary(
  list("IV estimate" = m),
  notes = sprintf("Kitagawa (2015) p-value: %.3f", k$p_value)
)
```

## The full workflow

In your paper's replication code:

```{r, eval = FALSE}
library(fixest)
library(ivcheck)

# ... data loading ...

# IV estimate
m <- feols(y ~ controls | d ~ z, data = df)

# IV validity diagnostic
chk <- iv_check(m)

# Report both in the paper
knitr::kable(chk$table)
```

Three lines of code, a falsification test the referee is almost guaranteed to ask about, and a citation-ready result. That is the whole point of `ivcheck`.

## References

Card, D. (1995). Using Geographic Variation in College Proximity to Estimate the Return to Schooling. In *Aspects of Labour Market Behaviour: Essays in Honour of John Vanderkamp*. University of Toronto Press.

Kitagawa, T. (2015). A Test for Instrument Validity. *Econometrica* 83(5): 2043-2063.

Mourifié, I., and Wan, Y. (2017). Testing Local Average Treatment Effect Assumptions. *Review of Economics and Statistics* 99(2): 305-313.