noalign <- function(x) {
    x <- tinytable::theme_tinytable(x)
    fn <- function(table) {
        if (table@output != "typst") {
            return(table)
        }
        tab <- unlist(strsplit(table@table_string, "\\n"))
        idx <- grepl("^\\s*#align\\(center, \\[\\s*$|^\\s*\\]\\) // end align\\s*$", tab)
        table@table_string <- paste(tab[!idx], collapse = "\n")
        return(table)
    }
    x <- tinytable::style_tt(x, finalize = fn)
    return(x)
}
options(tinytable_tt_theme = noalign)

Making Sense of Sensitivity: Extending Omitted Variable Bias

@CinHaz2020 ask: How strong do omitted confounders have to be in order to change our coefficient estimate by q%? To answer this question, we start with a familiar setup, where we want to estimate the effect of treatment 𝐷 on outcome π‘Œ in the presence of confounder 𝑍. The true model is:

π‘Œ=𝜏𝐷+𝑋𝛽+𝛾𝑍+πœ€

The omitted variable bias formula is well-known in the ordinary least squares context:

πœΜ‚π‘Ÿπ‘’π‘ =πœΜ‚+𝛾̂𝛿̂,

where πœΜ‚π‘Ÿπ‘’π‘  is the observed estimate; πœΜ‚ is the desired estimate; 𝛾̂ is a measure of the association between the omitted 𝑍 and π‘Œ (β€œimpact”); and 𝛿̂ is a measure of the association between the omitted 𝑍 and the treatment 𝐷 (β€œimbalance”).

The equivalence above can be seen in a simple simulation:

library(modelsummary)

N = 10000
Z = rbinom(N, 1, prob = .5)
D = rbinom(N, 1, prob = .8 - .6 * Z)
Y = 1 * D + 3 * Z + rnorm(N)

mod = list(
  "Correct" = lm(Y ~ D + Z),
  "Confounded" = lm(Y ~ D),
  "Auxiliary" = lm(Z ~ D))

coef(mod$Confounded)["D"] 

coef(mod$Correct)["D"] + coef(mod$Correct)["Z"] * coef(mod$Auxiliary)["D"]
         D 
-0.8058697
         D 
-0.8058697
modelsummary(mod, gof_omit = ".*")
CorrectConfoundedAuxiliary
(Intercept)0.0082.4040.801
(0.025)(0.022)(0.006)
D1.002βˆ’0.806βˆ’0.604
(0.025)(0.031)(0.008)
Z2.991
(0.025)

Knowing the signs of the bias components can be useful, but the magnitudes of 𝛾 and 𝛿 obviously matter as well. Moreover, the omitted variable bias formula with one confounder is of limited use when several confounders act together, possibly in non-linear fashion.

To assess how strong all omitted confounders need to be to overturn our conclusions, @CinHaz2020 recommend reparameterizing the bias in terms of partial 𝑅2. They show that our simple expression of the bias can be generalized and expressed as:

|π‘π‘–π‘Žπ‘ Μ‚|=(π‘…π‘ŒβˆΌπ‘|𝐷,𝑋2π‘…π·βˆΌπ‘|𝑋21βˆ’π‘…π·βˆΌπ‘|𝑋2)𝑠𝑑(π‘ŒβŸ‚π‘‹,𝐷)𝑠𝑑(π·βŸ‚π‘‹)

where π‘…π‘ŒβˆΌπ‘|𝐷,𝑋2 et al.Β represent partial 𝑅2 values.

The above equation would be sufficient to conduct a full sensitivity analysis, but it is often convenient to report a single β€œRobustness Value”. To simplify the interpretation, the authors consider a critical case where the strength of the impact (effect of Z on Y) and the strength of the imbalance (effect of Z on D) are equal. This allows them to define a simple Robustness Value:

π‘…π‘‰π‘ž=12{(π‘“π‘ž4+4π‘“π‘ž2)βˆ’π‘“π‘ž2}

In this equation, π‘“π‘žβ‰”π‘ž|π‘“π‘ŒβˆΌπ·|𝑋|, where π‘“π‘ŒβˆΌπ·|𝑋 represents the partial Cohen’s 𝑓 of the treatment with the outcome,1 and π‘ž is β€œthe proportion of reduction q on the treatment coefficient which would be deemed problematic.”

The interpretation is quite straightforward:

β€œConfounders that explain π‘…π‘‰π‘ž% both of the treatment and of the outcome are sufficiently strong to change the point estimate in problematic ways, whereas confounders with neither association greater than π‘…π‘‰π‘ž% are not.”

If π‘…π‘‰π‘ž is close to 1, the estimate can sustain confounding: the counfounders would need to explain nearly 100% of both the treatment and the outcome to be problematic. In contrast, if π‘…π‘‰π‘ž is close to 0, then our estimate cannot sustain confounding.

In R, we can calculate π‘…π‘‰π‘ž like this:

library(broom)

get_RV = function(mod, treatment = "D", q = 1) {
  ti = tidy(mod)
  gl = glance(mod)
  tvalue = ti$statistic[ti$term == treatment]
  df = gl$df.residual
  fq = abs(tvalue / sqrt(df)) * q
  RV = 1/2 * (sqrt(fq^4 + 4 * fq^2) - fq^2)
  RV
}
package β€˜broom’ was built under R version 4.5.2
RV100 = get_RV(mod$Confounded, treatment = "D", q = 1) * 100
RV25 = get_RV(mod$Confounded, treatment = "D", q = .25) * 100

For example, the 𝑅𝑉1 for the confounded model that we estimated at the top of this notebook is:

get_RV(mod$Confounded, treatment = "D", q = 1)
[1] 0.2275115

Confounders which explain r sprintf("%.0f%%", RV100) of both 𝑋 and 𝑍 are sufficiently strong to change the estimated treatment effect by 100%.

The 𝑅𝑉1/4 is:

get_RV(mod$Confounded, treatment = "D", q = .25)
[1] 0.06265379

Confounders which explain r sprintf("%.0f%%", RV25) of both 𝑋 and 𝑍 are sufficiently strong to change the estimated treatment effect by 25%.

The sensemakr package allows us to compute π‘…π‘‰π‘ž easily, without having to cook up our own functions:

library(sensemakr)

s = sensemakr(mod$Confounded, treatment = "D", q = .25)

s
Sensitivity Analysis to Unobserved Confounding

Model Formula: Y ~ D

Null hypothesis: q = 0.25 and reduce = TRUE 

Unadjusted Estimates of ' D ':
  Coef. estimate: -0.80587 
  Standard Error: 0.03114 
  t-value: -6.47074 

Sensitivity Statistics:
  Partial R2 of treatment with outcome: 0.0628 
  Robustness Value, q = 0.25 : 0.06265 
  Robustness Value, q = 0.25 alpha = 0.05 : 0.0441 

For more information, check summary.

As stated above, π‘…π‘‰π‘ž characterizes the special, critical case where the impact and imbalance are equal. To see what happens for different combinations of impact and imbalance, the sensemakr package allows us to draw nice contour plots. The red line shows us the combinations of confounding magnitude that would allow a change of 25% in the estimate (as determined by the q argument in the sensemakr call above).

plot(s)
  1. 1This can be computed by dividing coefficient t-value by 𝑑𝑓