24 Lecture 11: Regression Discontinuity

Slides

11 Difference-in-Difference (link)

24.1 Introduction

RDD is what you do when treatment assignment is driven by a threshold rule.

Running variable: entrance_exam
Cutoff: \(c = 70\)
Treatment (sharp): \(D = 1[\text{entrance\_exam} \ge 70]\)
Outcome: exit_exam

The key idea is local comparison:

Compare units just below 70 to units just above 70.

24.1.1 The math spine

In a sharp RDD, the estimand is the jump in the conditional mean at the cutoff:

\[ \tau_{RDD}= \lim_{x \downarrow 70} \mathbb{E}[Y \mid X = x]- \lim_{x \uparrow 70} \mathbb{E}[Y \mid X = x] \]

Interpretation: \(\tau_{RDD}\) is the treatment effect right at the cutoff (not necessarily far away from 70).

24.2 Vignette A: Tutoring program (sharp RDD at 70)

In this hypothetical example, students take an entrance exam at the beginning of a school year. Those who score 70 or below are automatically enrolled in a free tutoring program and receive assistance throughout the year. At the end of the school year, students take a final test, or exit exam (with a maximum of 100 points) to measure how much they learned overall. Remember, this is a hypothetical example and tests like this don’t really exist, but just go with it.

library(tidyverse)
library(modelsummary)
library(rdrobust)
library(sjPlot)
library(marginaleffects)

tut <- read_csv("Sample_data/tutoring_program.csv", show_col_types = FALSE) %>%
  mutate(
    cutoff = 70,
    x = entrance_exam,
    y = exit_exam,
    x_c = x - cutoff,
    treat = as.integer(x >= cutoff)
  )

tut %>% summarize(min_x = min(x, na.rm=TRUE), max_x = max(x, na.rm=TRUE))

## # A tibble: 1 × 2
##   min_x max_x
##   <dbl> <dbl>
## 1  28.8  99.8

tut %>% count(treat)

## # A tibble: 2 × 2
##   treat     n
##   <int> <int>
## 1     0   237
## 2     1   763

24.2.1 A.1 Picture first: binned means around the cutoff

Before we estimate anything, we want to see the discontinuity.

A simple way: bin the running variable and plot mean outcomes in each bin.

## Pick a bin width that gives a readable plot
bin_width <- 2

binned <- tut %>%
  mutate(bin = floor((x - min(x, na.rm=TRUE)) / bin_width)) %>%
  group_by(bin) %>%
  summarize(
    x_bin = mean(x, na.rm=TRUE),
    y_bin = mean(y, na.rm=TRUE),
    .groups = "drop"
  )

ggplot(binned, aes(x = x_bin, y = y_bin)) +
  geom_point(alpha = 0.6) +
  geom_line(alpha = 0.6) +
  geom_vline(xintercept = 70, linetype = "dashed") +
  labs(x = "Entrance exam (running variable)", y = "Mean exit exam (binned)",
       caption = "Dashed line = cutoff at 70") +
  theme_minimal()

That plot is suggestive, but it is not an estimator. Now we estimate the jump formally.

24.2.2 A.2 Estimate the discontinuity with `rdrobust()`

rdrobust() focuses on observations near the cutoff and uses local polynomial regression.

We’ll estimate two versions (same idea, slightly different choices):

Automatic bandwidth choice (a standard default workflow).
A fixed bandwidth and linear fit (so students see what changes when we change the “window”).

m1 <- rdrobust(y = tut$y, x = tut$x, c = 70,
               kernel = "triangular")

m2 <- rdrobust(y = tut$y, x = tut$x, c = 70,
               kernel = "triangular", h = 10, p = 1)

summary(m1)

## Sharp RD estimates using local polynomial regression.
## 
## Number of Obs.                 1000
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.                  237          763
## Eff. Number of Obs.             144          256
## Order est. (p)                    1            1
## Order bias  (q)                   2            2
## BW est. (h)                   9.969        9.969
## BW bias (b)                  14.661       14.661
## rho (h/b)                     0.680        0.680
## Unique Obs.                     155          262
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional    -8.578     1.601    -5.359     0.000   [-11.715 , -5.441]    
##         Robust         -         -    -4.352     0.000   [-12.101 , -4.587]    
## =============================================================================

summary(m2)

## Sharp RD estimates using local polynomial regression.
## 
## Number of Obs.                 1000
## BW type                      Manual
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.                  237          763
## Eff. Number of Obs.             144          256
## Order est. (p)                    1            1
## Order bias  (q)                   2            2
## BW est. (h)                  10.000       10.000
## BW bias (b)                  10.000       10.000
## rho (h/b)                     1.000        1.000
## Unique Obs.                     237          763
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional    -8.581     1.598    -5.369     0.000   [-11.714 , -5.448]    
##         Robust         -         -    -3.284     0.001   [-12.479 , -3.150]    
## =============================================================================

24.2.2.1 Interpretation

The key quantity is the estimated jump at the cutoff (the treatment effect at \(x=70\)).

A plug-in interpretation sounds like:

“Crossing the cutoff (and getting tutoring) changes exit exam scores by about X points for students right around an entrance score of 70.”

24.2.3 A.3 The built-in RDD plot (`rdplot()`)

This is a convenient diagnostic + presentation plot. It bins the data and overlays the fitted functions on each side.

rdplot(tut$y, tut$x,
       c = 70,
       x.label = "Entrance exam",
       y.label = "Exit exam",
       binselect = "qsmvpr")

## [1] "Mass points detected in the running variable."

rdplot(tut$y, tut$x,
       c = 70,
       p = 1,
       x.label = "Entrance exam",
       y.label = "Exit exam",
       binselect = "qsmvpr")

## [1] "Mass points detected in the running variable."

24.2.4 A.4 “Same plot, but prettier” (rebuild in ggplot)

Sometimes you want the rdplot bins, but in your own plotting style.

rdplot() returns the binned means it used. We can grab those and build a custom plot.

rd_obj <- rdplot(tut$y, tut$x,
                 c = 70,
                 p = 1,
                 x.label = "Entrance exam",
                 y.label = "Exit exam",
                 binselect = "qsmvpr")

## [1] "Mass points detected in the running variable."

rd_bins <- tibble(
  x_bin = rd_obj$vars_bins$rdplot_mean_bin,
  y_bin = rd_obj$vars_bins$rdplot_mean_y
) %>%
  mutate(side = if_else(x_bin >= 70, "Above cutoff", "Below cutoff"))

ggplot(rd_bins, aes(x = x_bin, y = y_bin)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE) +
  geom_vline(xintercept = 70, linetype = "dashed") +
  labs(x = "Entrance exam", y = "Exit exam",
       caption = "Points are rdplot binned means; lines are separate linear fits") +
  theme_minimal()

24.2.5 A.5 Sensitivity: bandwidth and polynomial order

RDD estimates are local—so choices about how local (bandwidth) matter.

We’ll estimate a small grid and put it in a compact table.

specs <- tribble(
  ~h, ~p,
  5,  1,
  10, 1,
  15, 1,
  10, 2
)

sens <- specs %>%
  mutate(
    fit = pmap(list(h, p), \(h, p) rdrobust(y=tut$y, x=tut$x, c=70, h=h, p=p, kernel="triangular")),
    tau = map_dbl(fit, \(m) m$coef[1]),
    se  = map_dbl(fit, \(m) m$se[1]),
    lo  = tau - 1.96*se,
    hi  = tau + 1.96*se
  )

sens %>%
  mutate(spec = paste0("h=", h, ", p=", p)) %>%
  select(spec, tau, se, lo, hi)

## # A tibble: 4 × 5
##   spec        tau    se    lo    hi
##   <chr>     <dbl> <dbl> <dbl> <dbl>
## 1 h=5, p=1  -8.20  2.34 -12.8 -3.61
## 2 h=10, p=1 -8.58  1.60 -11.7 -5.45
## 3 h=15, p=1 -8.84  1.30 -11.4 -6.29
## 4 h=10, p=2 -7.81  2.38 -12.5 -3.15

Interpretation note:

If the estimate changes wildly across reasonable bandwidths, that’s a warning sign.
Usually we want to see the estimate remain qualitatively stable (direction + rough magnitude).

24.2.6 A.6 Credibility checks (what we look for)

24.2.6.1 (1) Sorting/manipulation near the cutoff (visual check)

In a sharp RDD, a big red flag is bunching right above 70 (people gaming the entrance score).

A simple visual check:

tut %>%
  ggplot(aes(x = x)) +
  geom_histogram(bins = 40) +
  geom_vline(xintercept = 70, linetype = "dashed") +
  labs(x = "Entrance exam", y = "Count",
       caption = "Look for unusual bunching right above 70") +
  theme_minimal()

24.2.6.2 (2) Covariate balance near the cutoff (if we have covariates)

If the dataset contains pre-treatment covariates, we can run the same RD logic using each covariate as an outcome. We want covariates to be smooth at 70.

covars <- setdiff(names(tut), c("entrance_exam","exit_exam","cutoff","x","y","x_c","treat"))

covars

## [1] "id"       "tutoring"

If you see reasonable pre-treatment covariates here (e.g., demographics), pick one or two and run:

## Example (turn eval=TRUE once you choose real covariates):
## rdrobust(y = tut$some_covariate, x = tut$x, c = 70, kernel = "triangular") |> summary()

24.3 Vignette B: A second RDD example (simulated, so we know the truth)

The tutoring example is “real-data messy.” A simulated example is useful because: - we know whether there is a true jump, - we can see how bandwidth and functional form interact.

set.seed(123)

n <- 2000
cutoff <- 0

sim <- tibble(
  x = runif(n, -2, 2),
  treat = as.integer(x >= cutoff),
  ## Smooth relationship + a true jump at 0
  y = 50 + 6*x - 2*x^2 + 4*treat + rnorm(n, 0, 3)
)

## Binned plot
sim_bins <- sim %>%
  mutate(bin = cut(x, breaks = seq(-2, 2, by = 0.1), include.lowest = TRUE)) %>%
  group_by(bin) %>%
  summarize(x_bin = mean(x), y_bin = mean(y), .groups="drop")

ggplot(sim_bins, aes(x = x_bin, y = y_bin)) +
  geom_point(alpha = 0.6) +
  geom_line(alpha = 0.6) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(x = "Running variable (centered)", y = "Outcome (binned means)",
       caption = "True jump at 0 is +4") +
  theme_minimal()

## rdrobust estimate
sim_m <- rdrobust(y = sim$y, x = sim$x, c = 0, kernel = "triangular")
summary(sim_m)

## Sharp RD estimates using local polynomial regression.
## 
## Number of Obs.                 2000
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.                 1010          990
## Eff. Number of Obs.             361          355
## Order est. (p)                    1            1
## Order bias  (q)                   2            2
## BW est. (h)                   0.696        0.696
## BW bias (b)                   1.136        1.136
## rho (h/b)                     0.613        0.613
## Unique Obs.                    1010          990
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     3.515     0.476     7.383     0.000     [2.582 , 4.448]     
##         Robust         -         -     6.057     0.000     [2.290 , 4.481]     
## =============================================================================

Interpretation: - Because we built the data with a true jump of +4, a good RDD estimate (with enough data near the cutoff) should recover something close.

24.4 Exercises

In the tutoring example, identify \(X\), \(c\), \(D\), and \(Y\).
What does the RDD estimate mean in words? (Be explicit: “for whom” is this causal effect?)
Compare two bandwidths (e.g., \(h=5\) vs \(h=15\)). What changed in the estimate and the uncertainty?
What pattern in the histogram around 70 would worry you about manipulation?

23 Lecture 10: Difference-in-Difference