Who am I?

  • Prof at Université de Montréal
  • 📦 on CRAN for 15 years
  • Recently: Associate Editor of
    the R Journal

Problem 1: R outputs are inconsistent

This simple command:

predict(mod, se.fit = TRUE)


Produces different objects:

  • glm()
    • List of two vectors and one scalar: fit, se.fit, residual.scale
  • MASS::polr()
    • A matrix of estimates with no standard errors.
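A minimal sketch of the inconsistency. The data frame d, the models m1/m2, and the three-level ordered outcome below are illustrative assumptions, not the example from this talk:

# simulated data: binary outcome y, ordered outcome z, numeric predictor x
set.seed(1)
d <- data.frame(
    x = rnorm(300),
    y = rbinom(300, 1, 0.5),
    z = factor(sample(c("low", "mid", "high"), 300, replace = TRUE),
               levels = c("low", "mid", "high"), ordered = TRUE))

# glm(): a named list with fit, se.fit, and residual.scale
m1 <- glm(y ~ x, family = binomial, data = d)
str(predict(m1, se.fit = TRUE))

# MASS::polr(): se.fit is silently ignored, so no standard errors are returned
m2 <- MASS::polr(z ~ x, data = d)
str(predict(m2, se.fit = TRUE))                   # predicted classes
str(predict(m2, type = "probs", se.fit = TRUE))   # matrix of probabilities, still no SEs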

Problem 2: Stats is hard

Logistic regression model:

library(marginaleffects)

mod <- glm(
    Bin ~ Num * Cat,
    family = binomial,
    data = dat)
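
The dataset is not shown in the slides. For a self-contained run, dat only needs a binary Bin, a numeric Num, and a three-level factor Cat; a hypothetical simulation (created before the glm() call above, and which will not reproduce the printed estimates exactly):

set.seed(2025)
n <- 200
dat <- data.frame(
    Num = rnorm(n),
    Cat = factor(sample(c("A", "B", "C"), n, replace = TRUE)))
dat$Bin <- rbinom(n, 1, plogis(-1.5 + 0.2 * dat$Num +
    0.8 * (dat$Cat == "B") + 3.1 * (dat$Cat == "C")))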

An ultra-simple model, yet surprisingly hard to interpret.

Coefficient estimates (standard errors in parentheses):

                   (1)
(Intercept)    -1.490 (0.327)
Num             0.189 (0.330)
CatB            0.840 (0.424)
CatC            3.142 (0.455)
Num × CatB     -0.339 (0.451)
Num × CatC     -0.402 (0.430)

Problem 2: Stats is hard

  1. People are terrible at interpreting probabilities.
  2. Interpreting a logit coefficient is much harder:

\[\ln \left[ \left (\frac{p_{1}}{1-p_1}\right ) / \left (\frac{p_0}{1-p_0} \right ) \right]\]

There are logit-specific tricks, but what about interactions, splines, multinomial logit, or XGBoost?

¯\_(ツ)_/¯
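
To see why, the same log-odds coefficient translates into very different probability changes depending on the baseline. A quick illustration with the Num coefficient from the table above (ignoring the interaction terms, i.e. holding Cat at its reference level):

# +0.189 on the log-odds scale multiplies the odds by exp(0.189) ≈ 1.21,
# but the implied change in probability depends on where you start:
plogis(qlogis(0.10) + 0.189) - 0.10   # about +0.018 at a 10% baseline
plogis(qlogis(0.50) + 0.189) - 0.50   # about +0.047 at a 50% baseline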


Post-hoc transformation

“A parameter is just a resting stone on the road to prediction.”

-Philip Dawid, Professor of Statistics, University of Cambridge (via Stephen Senn)

In other words:

“Parameter estimates are usually very difficult (or impossible) to interpret as-is. We must transform them into quantities that stakeholders will understand and care about.” -Vincent (citing myself)

marginaleffects 📦

  • Easy
  • Consistent
  • Flexible
  • 100+ model types
    • GLM, GAM, Bayesian, Mixed-effects, tidymodels, mlr3

Free online book with 30+ chapters and case studies:

https://marginaleffects.com
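
Getting started is the usual CRAN workflow:

install.packages("marginaleffects")
library(marginaleffects)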

marginaleffects 📦

  1. Predictions
  2. Counterfactual comparisons
  3. Slopes

Hypothesis tests to compare all those quantities.


Consistent function names
predictions()        comparisons()        slopes()
avg_predictions()    avg_comparisons()    avg_slopes()
plot_predictions()   plot_comparisons()   plot_slopes()

hypotheses(hypothesis = )
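
A sketch of that consistency, reusing the mod fitted earlier: the same arguments work across all three verbs (the by grouping is optional).

avg_predictions(mod, by = "Cat")   # average fitted values, by group
avg_comparisons(mod, by = "Cat")   # average contrasts, by group
avg_slopes(mod, by = "Cat")        # average slopes / contrasts, by group
# ...and hypotheses() or the hypothesis argument to test any of these quantities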

Coefficients:

coef(mod)
(Intercept)         Num        CatB        CatC    Num:CatB    Num:CatC 
 -1.4900862   0.1888537   0.8398436   3.1417172  -0.3387608  -0.4018618 

Null hypothesis: \(\beta_3=\beta_4\)

hypotheses(mod, hypothesis = "b3 = b4")

    Term Estimate Std. Error     z Pr(>|z|)    S 2.5 % 97.5 %
 b3 = b4     -2.3      0.417 -5.53   <0.001 24.9 -3.12  -1.49

Null hypothesis: Non-linear

hypotheses(mod, hypothesis = "b3^2 - log(b4) = 0")

               Term Estimate Std. Error      z Pr(>|z|)   S 2.5 % 97.5 %
 b3^2 - log(b4) = 0   -0.439      0.644 -0.683    0.495 1.0  -1.7  0.822
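
The b1, b2, … labels index the coefficients in the order printed by coef(mod), so both point estimates can be checked by hand (the standard errors come from the delta method):

b <- coef(mod)
unname(b[3] - b[4])          # -2.30, matches "b3 = b4"
unname(b[3]^2 - log(b[4]))   # -0.439, matches "b3^2 - log(b4) = 0"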

🔪🩸💀 feature: hypothesis argument

Hypothesis tests on:

  • Predictions
  • Counterfactual comparisons
  • Slopes

Scientific questions:

  • Is the predicted probability of survival equal for men and women?
  • Is the treatment effect size equal for old and young people?
  • Is the effect of treatment A bigger than the effect of treatment B?

predictions()


Base R:

df <- data.frame(Cat = "A", Num = 0)

predict(mod, newdata = df, type = "response")
[1] 0.1839088


Same syntax, but richer results:

predictions(mod, newdata = df)

 Estimate Pr(>|z|)    S 2.5 % 97.5 %
    0.184   <0.001 17.6 0.106  0.299

predictions()

Fitted values for every row in the original data:

predictions(mod)

 Estimate Pr(>|z|)    S 2.5 % 97.5 %
    0.197   <0.001 15.4 0.113  0.320
    0.316   0.0417  4.6 0.180  0.493
    0.185   <0.001 17.6 0.108  0.300
    0.213   0.0016  9.3 0.107  0.379
    0.852   <0.001 20.9 0.744  0.920
--- 190 rows omitted. See ?avg_predictions and ?print.marginaleffects --- 
    0.351   0.0258  5.3 0.240  0.481
    0.829   <0.001 19.6 0.719  0.902
    0.351   0.0248  5.3 0.240  0.481
    0.846   <0.001 22.3 0.743  0.913
    0.314   0.0460  4.4 0.176  0.497

avg_predictions()


Average of the fitted values:

avg_predictions(mod)

 Estimate Pr(>|z|)   S 2.5 % 97.5 %
     0.49     0.82 0.3 0.404  0.577


Average by subgroup:

avg_predictions(mod, by = "Cat")

 Cat Estimate Pr(>|z|)    S 2.5 % 97.5 %
   A    0.186   <0.001 17.6 0.108  0.301
   B    0.344   0.0166  5.9 0.236  0.471
   C    0.843   <0.001 22.6 0.741  0.909
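
These group averages are just means of the unit-level fitted probabilities, so the point estimates (though not the inference, which uses the delta method) can be checked in base R:

p <- predict(mod, type = "response")
tapply(p, dat$Cat, mean)   # should match the Estimate column above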

avg_predictions(hypothesis = )

avg_predictions(mod, 
    by = "Cat"
)

 Cat Estimate Pr(>|z|)    S 2.5 % 97.5 %
   A    0.186   <0.001 17.6 0.108  0.301
   B    0.344   0.0166  5.9 0.236  0.471
   C    0.843   <0.001 22.6 0.741  0.909


Are average predictions for Cat A and Cat B equal? 🔪🩸💀

avg_predictions(mod, 
    by = "Cat",
    hypothesis = "b1 = b2"
)

  Term Estimate Std. Error     z Pr(>|z|)   S  2.5 %   97.5 %
 b1=b2   -0.157     0.0778 -2.01   0.0439 4.5 -0.309 -0.00426

plot_predictions()

plot_predictions(mod, condition = c("Num", "Cat"))
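
The figure itself is omitted here: plot_predictions() draws the predicted probability across Num, one curve per level of Cat, with confidence bands. It returns a ggplot2 object, so it can be customized further (a sketch):

library(ggplot2)
plot_predictions(mod, condition = c("Num", "Cat")) +
    labs(y = "Pr(Bin = 1)") +
    theme_minimal()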

Causal inference

Counterfactual comparisons

Contrast, risk difference, risk ratio, odds, lift, etc.

Counterfactual comparisons()

Functions of two predictions. “All else equal” model-based comparisons:

\[\hat{Y}_{X=1} - \hat{Y}_{X=0}\]

\[\hat{Y}_{X=x+1} - \hat{Y}_{X=x}\]

\[\hat{Y}_{X=x+\sigma_X} - \hat{Y}_{X=x}\]

\[\frac{\hat{Y}_{X=1}}{\hat{Y}_{X=0}}\]

\[\frac{\hat{Y}_{X=1} - \hat{Y}_{X=0}}{\hat{Y}_{X=0}}\]
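
Several of these quantities map onto arguments of the comparisons() family. A sketch based on the package documentation (the comparison and variables values below are the documented options; output not shown here):

avg_comparisons(mod, variables = "Cat", comparison = "difference")   # risk difference
avg_comparisons(mod, variables = "Cat", comparison = "ratio")        # risk ratio
avg_comparisons(mod, variables = "Cat", comparison = "lift")         # lift: (hi - lo) / lo
avg_comparisons(mod, variables = list(Num = "sd"))                   # change of one SD in Num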

Counterfactual comparisons()

One estimate per row:

comparisons(mod)


Average risk difference for changes in each predictor:

avg_comparisons(mod)

 Term          Contrast Estimate Std. Error     z Pr(>|z|)    S    2.5 % 97.5 %
  Cat mean(B) - mean(A)   0.1600     0.0776  2.06   0.0392  4.7  0.00795 0.3120
  Cat mean(C) - mean(A)   0.6531     0.0645 10.13   <0.001 77.7  0.52667 0.7795
  Num mean(+1)           -0.0118     0.0309 -0.38   0.7039  0.5 -0.07235 0.0488

Hypothesis tests


Is the average effect of “B versus A” equal to the average effect of “C versus A”? 🔪🩸💀

avg_comparisons(mod, hypothesis = "b1 = b2")

  Term Estimate Std. Error     z Pr(>|z|)    S  2.5 % 97.5 %
 b1=b2   -0.493     0.0742 -6.64   <0.001 34.9 -0.639 -0.348

marginaleffects.com

  • 100+ models:
    • Bayes, GAM, Mixed effects…
  • Slopes: \(\partial y / \partial x\)
  • ML: tidymodels + mlr3
  • Causal inference
  • Robust standard errors & Bootstrap
  • Conformal prediction
  • Categorical outcomes
  • Equivalence tests
  • Interactions
  • Survey experiments
  • Inverse Probability Weighting
  • Matching
  • Multilevel Regression with Post-stratification
  • Multiple imputation

tinytable 📦: lightweight, customizable tables in R (same author)

modelsummary 📦: data summaries and regression tables in R (same author)