Assessing Residual Diagnostics with the Lineup Protocol
2024-08-07
Diagnostics: is anything importantly wrong with my model?
\[\underbrace{\boldsymbol{e}}_\textrm{Residuals} = \underbrace{\boldsymbol{y}}_\textrm{Observations} - \underbrace{f(\boldsymbol{x})}_\textrm{Fitted values}\]
Residuals: what the regression model does not capture.
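A minimal sketch of the definition above, using simulated data. The sample size, coefficients, and noise level are arbitrary assumptions made for illustration; they are not taken from the talk. The snippet checks that the residuals are the observations minus the fitted values.

```r
set.seed(2024)                       # arbitrary seed for reproducibility
n <- 100
x <- runif(n, -1, 1)                 # illustrative predictor
y <- 1 + x + rnorm(n, sd = 0.5)      # illustrative response from a linear model

fit <- lm(y ~ x)
e <- y - fitted(fit)                 # residuals by definition: e = y - f(x)

# The manually computed residuals agree with those reported by lm()
all.equal(unname(e), unname(resid(fit)))
```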
Checked by:
Residual plots are usually revealing when the assumptions are violated. - Draper and Smith (1998), Belsley, Kuh, and Welsch (1980)
Graphical methods are easier to use. - Cook and Weisberg (1982)
Residual plots are more informative in most practical situations than the corresponding conventional hypothesis tests. - Montgomery and Peck (1982)
What do you see?
We need an inferential framework to calibrate expectations when reading residual plots!
Suggested by Buja et al. (2009)
Typically, a lineup of residual plots consists of
To perform a visual test
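The lineup idea can be sketched in base R: hide the true residual plot at a random position among null plots and ask whether an observer can pick it out. This is only an illustration under stated assumptions: the lineup size `m = 20`, the simulated data, and the helper `rotate_resid_sketch()` are hypothetical and are not the autovi implementation. Null residuals are generated here by residual rotation, that is, by projecting pure noise onto the residual space of the fitted model and rescaling it to the observed residual sum of squares.

```r
set.seed(1)
n <- 200
x <- runif(n, -1, 1)
y <- 1 + x + x^2 + rnorm(n, sd = 0.3)    # hypothetical data with a missed quadratic term
fit <- lm(y ~ x)

# Hypothetical helper: one set of null residuals by rotation
rotate_resid_sketch <- function(fit) {
  X <- model.matrix(fit)
  noise <- rnorm(nrow(X))
  r <- qr.resid(qr(X), noise)               # part of the noise orthogonal to X
  r * sqrt(sum(resid(fit)^2) / sum(r^2))    # match the observed RSS
}

m <- 20                                   # assumed lineup size
pos <- sample(m, 1)                       # secret position of the true plot
lineup <- do.call(rbind, lapply(seq_len(m), function(k) {
  data.frame(
    panel   = k,
    .fitted = unname(fitted(fit)),
    .resid  = if (k == pos) unname(resid(fit)) else rotate_resid_sketch(fit)
  )
}))

# With a single observer, picking the true plot by chance has probability 1/m
1 / m
```

Each panel can then be drawn as a scatter plot of `.resid` against `.fitted`, for example with one facet per `panel`.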
Compare conventional hypothesis testing with visual testing when evaluating residual plots
Model | Structure |
---|---|
Null | \(\boldsymbol{y} = \beta_0 + \beta_1\boldsymbol{x} + \boldsymbol{\varepsilon}\) |
Non-linearity | \(\boldsymbol{y} = 1 + \boldsymbol{x} + \boldsymbol{z} + \boldsymbol{\varepsilon}\) |
Heteroskedasticity | \(\boldsymbol{y} = 1 + \boldsymbol{x} + \boldsymbol{\varepsilon}_h\) |
where
Non-linearity
Heteroskedasticity
parameters controlling signal strength (\(\sigma, b, n\))
\(4\times 4\times 3 \times 4 = 192\) non-linear parameter sets
\(3\times 5\times 3 \times 4 = 180\) heterosked. parameter sets
3 replicates per parameter set
576 (non linearity) + 540 (heteroskedasticity) lineups
(each with \(\geq 5\) evaluations)
36 Rorschach lineups to estimate \(\alpha\) for \(p\)-value calculations
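The lineup counts follow directly from the numbers on this slide. The short check below only redoes that arithmetic; the overall total is the sum of the three counts stated above.

```r
replicates <- 3

nonlinear_sets  <- 4 * 4 * 3 * 4     # 192 parameter sets
heterosked_sets <- 3 * 5 * 3 * 4     # 180 parameter sets

lineups <- c(
  nonlinear  = nonlinear_sets * replicates,    # 576
  heterosked = heterosked_sets * replicates,   # 540
  rorschach  = 36
)
lineups
sum(lineups)                         # 1152 lineups in total
```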
Violation | Test |
---|---|
nonlinearity | RESET |
heteroskedasticity | Breusch-Pagan |
non-normality | Shapiro-Wilk |
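As a rough sketch of how the conventional tests in the table might be run in R: `shapiro.test()` ships with base R, while `resettest()` and `bptest()` come from the lmtest package, which is assumed to be installed (the sketch skips them otherwise). The simulated data and its error structure are hypothetical and chosen only so that there is something to test.

```r
set.seed(42)
n <- 300
x <- runif(n, -1, 1)
y <- 1 + x + rnorm(n, sd = 0.2 + abs(x))   # hypothetical heteroskedastic errors
fit <- lm(y ~ x)

# Shapiro-Wilk test on the residuals (base R)
p_values <- c(shapiro_wilk = shapiro.test(resid(fit))$p.value)

# RESET and Breusch-Pagan tests, only if lmtest is available
if (requireNamespace("lmtest", quietly = TRUE)) {
  p_values <- c(
    p_values,
    reset         = unname(lmtest::resettest(fit)$p.value),
    breusch_pagan = unname(lmtest::bptest(fit)$p.value)
  )
}
round(p_values, 4)
```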
Make the computer do it for us with Computer Vision
Estimate "visual" distance \(D\) between
Compare \(\widehat D\) to a distribution of values
Calibrate against visual and conventional test results
How to measure "difference"/"distance" between plots?
\(\widehat{D} = f_{CV}(V_{h \times w}(\boldsymbol{e}), n, S(\hat y, \hat e))\), where
Non-linearity + Heteroskedasticity
Non-normality + Heteroskedasticity
Distribution of predictor
Estimate null distribution \(F(D | H_0)\) empirically:
Compute critical value \(Q_0(0.95)\) as the value \(Q\) s.t. \(P(D \leq Q | H_0) = 0.95\)
\(p\)-value: \(P(D \geq D^\ast | H_0)\) for observed \(D^\ast\)
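The empirical version of these steps can be sketched as follows. The null distances here are hypothetical stand-in draws and the observed value is made up; in practice both would come from the computer vision model applied to null and observed residual plots. The \(p\)-value is taken as the plain proportion of null values at least as large as the observed one, which is an assumption about the estimator.

```r
set.seed(7)

# Stand-in for distances estimated from null residual plots (hypothetical)
null_D <- rlnorm(1000, meanlog = 0, sdlog = 0.3)
observed_D <- 2.5                      # hypothetical observed distance D*

# Critical value Q_0(0.95): empirical 95% quantile of the null distances
critical_value <- quantile(null_D, 0.95, names = FALSE)

# p-value: share of null distances at least as large as the observed one
p_value <- mean(null_D >= observed_D)

# Reject H_0 at the 5% level when the observed distance exceeds Q_0(0.95)
reject <- observed_D > critical_value
c(critical_value = critical_value, p_value = p_value)
```

With a finite number of null draws this estimate can be exactly 0 when the observed value exceeds every null value.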
autovi Package
The autovi package provides automated visual inference with computer vision models. It is available on CRAN and GitHub.
rotate_resid()
vss()
check() and summary_plot()
Null residuals are simulated from the fitted model assuming it is correctly specified.
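library(autovi)

# fitted_model is a regression model (here an lm object) fitted beforehand
checker <- auto_vi(fitted_model = fitted_model,
                   keras_model = get_keras_model("vss_phn_32"))

# Draw one set of null residuals by rotation
checker$rotate_resid()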
# A tibble: 489 × 2
   .fitted   .resid
     <dbl>    <dbl>
 1 632372.   24372.
 2 525177.   13236.
 3 646753.   54824.
 4 624848.  -98465.
 5 611817.  188264.
 6 551051.  -67975.
 7 504757.  142250.
 8 445700. -175323.
 9 281912. -101298.
10 453398. -121730.
# ℹ 479 more rows
:::
vss()
check()
── <AUTO_VI object>
Status:
 - Fitted model: lm
 - Keras model: (None, 32, 32, 3) + (None, 5) -> (None, 1)
    - Output node index: 1
 - Result:
    - Observed visual signal strength: 6.484 (p-value = 0)
    - Null visual signal strength: [100 draws]
       - Mean: 1.169
       - Quantiles:
          ╔══════════════════════════════════════════╗
          ║  25%   50%   75%   80%   90%   95%   99% ║
          ║1.037 1.120 1.231 1.247 1.421 1.528 1.993 ║
          ╚══════════════════════════════════════════╝
    - Bootstrapped visual signal strength: [100 draws]
       - Mean: 6.28 (p-value = 0)
       - Quantiles:
          ╔══════════════════════════════════════════╗
          ║  25%   50%   75%   80%   90%   95%   99% ║
          ║5.960 6.267 6.614 6.693 6.891 7.112 7.217 ║
          ╚══════════════════════════════════════════╝
    - Likelihood ratio: 0.7064 (boot) / 0 (null) = Extremely large
RESET \(p\)-value = 0.742
B-P \(p\)-value = 0.36
S-W \(p\)-value = 9.21e-05
:::
Don't want to install TensorFlow?
Try our shiny web application: https://shorturl.at/DNWzt
For GLMs and other regression models:
You can use autovi to
Evaluate lineups of residual plots of linear regression models
Capture the magnitude of model violations through visual signal strength
Automatically detect model misspecification using a visual test
Advisors