Multimodal Graphical Testing

Susan Vanderplas

2023-12-14

How Do We Test Graphics?

Lineups

Question: Can participants identify different growth rates on a linear scale?

A “Visual Hypothesis Test”

  • Embed the question in an array of charts

  • Can people identify the different plot?

  • Null model can be tricky to create

  • Test statistic is the visual evaluation

Buja et al. (2009)
Loy and Hofmann (2013)
Majumder, Hofmann, and Cook (2013)
Vanderplas and Hofmann (2017)
VanderPlas et al. (2021)
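
One simple way to turn lineup responses into a p-value is a binomial tail probability. The sketch below assumes independent evaluators who, under the null, each pick the data panel with probability 1/m from an m-panel lineup. The function name and the default of 20 panels are illustrative assumptions, not taken from the cited papers, which cover more careful calculations for other evaluation scenarios.

```python
from math import comb


def lineup_p_value(n_correct, n_evaluators, n_panels=20):
    """Binomial tail P(X >= n_correct), X ~ Binomial(n_evaluators, 1 / n_panels).

    Assumes independent evaluators who guess uniformly among the panels
    when the data panel is indistinguishable from the null panels.
    """
    p = 1.0 / n_panels
    return sum(
        comb(n_evaluators, k) * p**k * (1 - p) ** (n_evaluators - k)
        for k in range(n_correct, n_evaluators + 1)
    )


# Hypothetical example: 4 of 10 evaluators pick the data panel.
print(lineup_p_value(n_correct=4, n_evaluators=10))
```

A single evaluator who picks the data panel out of 20 gives a p-value of 1/20, which is why 20-panel lineups are a convenient size.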

Numerical Estimation

Eells (1926)

von Huhn (1927)
  • Size of region?
    Eells (1926); Croxton and Stryker (1927); VanderPlas, Goluch, and Hofmann (2019)
  • With scales?
    von Huhn (1927)
  • Size of relationship compared to another region
    Croxton and Stein (1932)
  • Very sensitive to question phrasing

Forced Choice

Which bar is larger? Hughes (2001)

Hegarty, Smallman, and Stull (2012)
  • Force participants to answer a specific question

  • May be a size judgment (which is larger?)

    • common in psychophysics experiments
  • May be a more complex decision incorporating other information

Hughes (2001)
Xiong et al. (2020)
Lu et al. (2022)

Eye Tracking

  • Infer cognitive processes from directed (conscious) attention

  • May be accompanied by direct estimation or other protocols

Gegenfurtner, Lehtinen, and Säljö (2011)
J. Goldberg and Helfman (2011)
Zhao et al. (2013)
Netzel et al. (2017)
Liu, Liu, and Tan (2023)

J. H. Goldberg and Helfman (2010)

Woller-Carter et al. (2012)

Think Aloud and Free Response

  • Stream of consciousness narration (Guan et al. 2006; Cooke 2010)

  • Reasoning to justify a decision

Why did you choose this panel? Vanderplas and Hofmann (2017)

Direct Annotation

Mosteller et al. (1981)
  • Have participants visually fit statistics

    • Usually directly annotating the chart with e.g. a regression line
  • Compare visual statistics to numerical calculations

  • Differences tell us about our implicit perception of data
    e.g. visual regression is more robust to outliers

  • Also useful as a teaching tool

Bajgier, Atkinson, and Prybutok (1989)
Robinson, Howard, and Vanderplas (2022)
Robinson, Howard, and Vanderplas (2023b)
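
Comparing a visual fit to a numerical calculation can be sketched as follows. This is a minimal illustration, assuming the participant's drawn line has been sampled at the same x values as the data; the data, the drawn values, and the function names are all hypothetical.

```python
def ols_line(x, y):
    """Ordinary least squares slope and intercept for paired lists x, y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def drawn_minus_ols(x, y, y_drawn):
    """Vertical deviation of the drawn line from the OLS fit at each x."""
    slope, intercept = ols_line(x, y)
    return [yd - (intercept + slope * xi) for xi, yd in zip(x, y_drawn)]


# Hypothetical data with one outlier at x = 5, and a drawn line y = 2x
# that follows the bulk of the points rather than the outlier.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 30]
y_drawn = [2, 4, 6, 8, 10]
print(ols_line(x, y))
print(drawn_minus_ols(x, y, y_drawn))
```

In this made-up example the outlier pulls the least squares line upward, while the drawn line stays with the remaining points, so the deviations grow toward the right-hand side of the chart.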

How Do We Test Graphics?

  • Testing method needs to be matched to level of engagement

  • Need to examine graphical choices across levels of engagement

Multimodal Graphical Testing in Practice

Perception: Exp. Growth & Log Scales

3 different ways of engaging with the data

Can we

  • Q1 (Perceptual): perceive differences in
  • Q2 (Tactile): forecast trends from
  • Q3 (Numerical): estimate and use

graphs of exponential growth with log and linear scales?

300 participants completed all 3 experiments
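
To see why the choice of scale matters, assume (for illustration only) a noiseless exponential trend with hypothetical parameters \(\alpha > 0\) and \(\beta\):

\[
y = \alpha e^{\beta x}
\quad\Longrightarrow\quad
\log y = \log \alpha + \beta x .
\]

On a linear scale the trend is a curve whose slope \(\alpha \beta e^{\beta x}\) grows with \(x\). On a log scale the same trend is a straight line with slope \(\beta\), so two growth rates differ by curvature on one scale and by slope on the other.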

Q1: Perception of Differences

Log Scale

Linear Scale

Q1: Perception of Differences

Conclusion: It’s easier to spot a curve among lines than it is to spot a line among curves

Robinson, Howard, and Vanderplas (2023a) [Under review]

Q2: Inspiration

Q2: Forecasting (You-Draw-It) Goals

  1. Replicate “Eye Fitting Straight Lines” using the you-draw-it tool (4 charts); see Robinson, Howard, and Vanderplas (2022)

  2. Explore exponential growth predictions on log and linear scale (8 charts)

    • Points end 50% or 75% of the way across x-axis
    • Growth rate \(\beta\) = 0.1 or 0.23
    • Log or Linear scale

12 total graphs to complete
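
The chart count follows from crossing the three two-level factors. The sketch below enumerates that design and generates noiseless exponential points for one condition. The model y = exp(beta * x), the x-axis length, the number of points, and all names are assumptions for illustration; the study's actual data-generating model is not given here.

```python
from itertools import product
from math import exp

# Factor levels taken from the design described above.
points_end_levels = (0.50, 0.75)  # share of the x-axis covered by points
beta_levels = (0.10, 0.23)        # exponential growth rates
scale_levels = ("linear", "log")  # y-axis scale shown to the participant

conditions = [
    {"points_end": p, "beta": b, "scale": s}
    for p, b, s in product(points_end_levels, beta_levels, scale_levels)
]


def exponential_points(beta, points_end, x_max=20, n=30):
    """Noiseless (x, y) pairs with y = exp(beta * x).

    x is evenly spaced on [0, points_end * x_max]; x_max and n are
    hypothetical choices, not the study's settings.
    """
    x_stop = points_end * x_max
    xs = [x_stop * i / (n - 1) for i in range(n)]
    return [(x, exp(beta * x)) for x in xs]


n_linear_replication_charts = 4
print(len(conditions) + n_linear_replication_charts)  # 12
```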

Q2: Forecasting (You-Draw-It)

Data from Mosteller et al. (1981)

Exp data, linear scale, 50% complete

Q3: Numerical Estimation

  • Next level of engagement is estimating quantities from a graph

  • This is a much harder experiment to set up

    • Phrasing matters a lot!
    • Data matters a lot!

How to make it generalizable?

Q3: Numerical Estimation

  • Use Ewoks and Tribbles - creatures that might multiply exponentially
  • One set on the linear scale, one set on log scale
  • Underlying trend is the same (within transformed x axis)
  • Different variability around the line

Ewoks and Tribbles (with apologies to Allison Horst)

Q3: Numerical Estimation

Free response: Between \(t_1\) and \(t_2\), how does the population of \(X\) change?

Linear Scale

Log Scale

Q3: Numerical Estimation

Estimating Population given a year

Process Sketch

Linear scale

Log scale

Q3: Numerical Estimation

Estimating Population given a year

Deviation from Closest Point

Q3: Numerical Estimation

From Year1 to Year2, the population increases by ____ individuals

Process Sketch

Linear scale

Log scale

Q3: Numerical Estimation

From Year1 to Year2, the population increases by ____ individuals

Q3: Numerical Estimation

How many times more creatures are there in Year2 than Year1?

Process Sketch

Linear scale

Log scale

Q3: Numerical Estimation

How many times more creatures are there in Year2 than Year1?

Q3: Numerical Estimation

How long does it take for the population in Year 1 to double?

Process Sketch

Linear scale

Log scale

Q3: Numerical Estimation

How long does it take for the population in Year 1 to double?
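
The three closed-form quantities behind these estimation questions can be sketched under an assumed exponential model N(t) = a * exp(b * t). The parameters a and b and the function names are hypothetical and are not the values used in the study.

```python
from math import exp, log


def population(t, a, b):
    """Population at time t under the assumed model N(t) = a * exp(b * t)."""
    return a * exp(b * t)


def additive_increase(t1, t2, a, b):
    """'From Year1 to Year2, the population increases by ____ individuals.'"""
    return population(t2, a, b) - population(t1, a, b)


def multiplicative_change(t1, t2, b):
    """'How many times more creatures are there in Year2 than Year1?'"""
    return exp(b * (t2 - t1))


def doubling_time(b):
    """'How long does it take for the population to double?'"""
    return log(2) / b


# Hypothetical parameters for illustration.
a, b = 5.0, 0.1
print(additive_increase(10, 20, a, b))
print(multiplicative_change(10, 20, b))
print(doubling_time(b))
```

Under this model the multiplicative change depends only on the elapsed time and the doubling time is constant, while the additive increase also depends on the starting year.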

Challenges & Benefits of Multimodal Graphical Testing

Challenges & Benefits

  1. Conflicting results can be hard to reconcile

  2. Conducting multiple studies is multiple times the work
    (multiple times the payoff?)

  3. Greater insight into the tradeoffs of design decisions

Challenges & Benefits

  • Testing method needs to be matched to level of engagement

  • Need to examine graphical choices across levels of engagement

Packages

References

Bajgier, Steve M., Maryanne Atkinson, and Victor R. Prybutok. 1989. “Visual Fits in the Teaching of Regression Concepts.” The American Statistician 43 (4): 229–34. https://doi.org/10.1080/00031305.1989.10475664.
Buja, Andreas, Dianne Cook, Heike Hofmann, Michael Lawrence, Eun-Kyung Lee, Deborah F Swayne, and Hadley Wickham. 2009. “Statistical Inference for Exploratory Data Analysis and Model Diagnostics.” Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 367 (1906): 4361–83. https://doi.org/10.1098/rsta.2009.0120.
Cooke, L. 2010. “Assessing Concurrent Think-Aloud Protocol as a Usability Test Method: A Technical Communication Approach.” IEEE Transactions on Professional Communication 53 (3): 202–15. https://doi.org/10.1109/TPC.2010.2052859.
Croxton, F. E., and H. Stein. 1932. “Graphic Comparisons by Bars, Squares, Circles, and Cubes.” Journal of the American Statistical Association 27 (177): 54–60. https://doi.org/10.1080/01621459.1932.10503227.
Croxton, F. E., and R. E. Stryker. 1927. “Bar Charts Versus Circle Diagrams.” Journal of the American Statistical Association 22 (160): 473–82. https://doi.org/10.2307/2276829.
Eells, W. C. 1926. “The Relative Merits of Circles and Bars for Representing Component Parts.” Journal of the American Statistical Association 21 (154): 119–32. https://doi.org/10.1080/01621459.1926.10502165.
Gegenfurtner, Andreas, Erno Lehtinen, and Roger Säljö. 2011. “Expertise Differences in the Comprehension of Visualizations: A Meta-Analysis of Eye-Tracking Research in Professional Domains.” Educational Psychology Review 23 (4): 523–52. https://doi.org/10.1007/s10648-011-9174-7.
Goldberg, Joseph H., and Jonathan I. Helfman. 2010. “Comparing Information Graphics: A Critical Look at Eye Tracking.” In Proceedings of the 3rd BELIV’10 Workshop on BEyond Time and Errors: Novel evaLuation Methods for Information Visualization - BELIV ’10, 71–78. Atlanta, Georgia: ACM Press. https://doi.org/10.1145/2110192.2110203.
Goldberg, Joseph, and Jonathan Helfman. 2011. “Eye Tracking for Visualization Evaluation: Reading Values on Linear Versus Radial Graphs.” Information Visualization 10 (3): 182–95. https://doi.org/10.1177/1473871611406623.
Guan, Zhiwei, Shirley Lee, Elisabeth Cuddihy, and Judith Ramey. 2006. “The Validity of the Stimulated Retrospective Think-Aloud Method as Measured by Eye Tracking.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems - CHI ’06, 1253. Montréal, Québec, Canada: ACM Press. https://doi.org/10.1145/1124772.1124961.
Hegarty, Mary, Harvey S. Smallman, and Andrew T. Stull. 2012. “Choosing and Using Geospatial Displays: Effects of Design on Performance and Metacognition.” Journal of Experimental Psychology: Applied 18 (1): 1–17. https://doi.org/10.1037/a0026625.
Hughes, B. M. 2001. “Just Noticeable Differences in 2d and 3d Bar Charts: A Psychophysical Analysis of Chart Readability.” Perceptual and Motor Skills 92 (2): 495–503.
Liu, Chan, Hao Liu, and Zhanglu Tan. 2023. “Choosing Optimal Means of Knowledge Visualization Based on Eye Tracking for Online Education.” Education and Information Technologies, May. https://doi.org/10.1007/s10639-023-11815-4.
Loy, Adam, and Heike Hofmann. 2013. “Diagnostic Tools for Hierarchical Linear Models.” Wiley Interdisciplinary Reviews: Computational Statistics 5 (1): 48–61. https://doi.org/10.1002/wics.1238.
Lu, Min, Joel Lanir, Chufeng Wang, Yucong Yao, Wen Zhang, Oliver Deussen, and Hui Huang. 2022. “Modeling Just Noticeable Differences in Charts.” IEEE Transactions on Visualization and Computer Graphics 28 (1): 718–26. https://doi.org/10.1109/TVCG.2021.3114874.
Majumder, Mahbubul, Heike Hofmann, and Dianne Cook. 2013. “Validation of Visual Statistical Inference, Applied to Linear Models.” Journal of the American Statistical Association 108 (503): 942–56. https://doi.org/10.1080/01621459.2013.808157.
Mosteller, Frederick, Andrew F. Siegel, Edward Trapido, and Cleo Youtz. 1981. “Eye Fitting Straight Lines.” The American Statistician 35 (3): 150–52. https://doi.org/10.1080/00031305.1981.10479335.
Netzel, Rudolf, Jenny Vuong, Ulrich Engelke, Seán O’Donoghue, Daniel Weiskopf, and Julian Heinrich. 2017. “Comparative Eye-Tracking Evaluation of Scatterplots and Parallel Coordinates.” Visual Informatics 1 (2): 118–31. https://doi.org/10.1016/j.visinf.2017.11.001.
Robinson, Emily A., Reka Howard, and Susan Vanderplas. 2022. “Eye Fitting Straight Lines in the Modern Era.” Journal of Computational and Graphical Statistics 0 (0): 1–8. https://doi.org/10.1080/10618600.2022.2140668.
———. 2023a. “Perception and Cognitive Implications of Logarithmic Scales for Exponentially Increasing Data: Perceptual Sensitivity Tested with Statistical Lineups.” Journal of Computational and Graphical Statistics Under Review. https://earobinson95.github.io/logarithmic-lineups/logarithmic-lineups-revisions.pdf.
———. 2023b. “‘You Draw It’: Implementation of Visually Fitted Trends with R2d3.” Journal of Data Science 21 (2): 281–94. https://doi.org/10.6339/22-JDS1083.
VanderPlas, S, R C Goluch, and H Hofmann. 2019. “Framed! Reproducing and Revisiting 150-Year-Old Charts.” Journal of Computational and Graphical Statistics 28 (3): 620–34. https://doi.org/10.1080/10618600.2018.1562937.
Vanderplas, S, and H Hofmann. 2017. “Clusters Beat Trend⁉ Testing Feature Hierarchy in Statistical Graphics.” Journal of Computational and Graphical Statistics 26 (2): 231–42. https://doi.org/10.1080/10618600.2016.1209116.
VanderPlas, S, C Röttger, D Cook, and H Hofmann. 2021. “Statistical Significance Calculations for Scenarios in Visual Inference.” Stat 10 (1). https://doi.org/10.1002/sta4.337.
von Huhn, R. 1927. “Further Studies in the Graphic Use of Circles and Bars.” Journal of the American Statistical Association 22 (157): 31–36. https://doi.org/10.1080/01621459.1927.10502938.
Woller-Carter, Margo M., Yasmina Okan, Edward T. Cokely, and Rocio Garcia-Retamero. 2012. “Communicating and Distorting Risks with Graphs: An Eye-Tracking Study.” Proceedings of the Human Factors and Ergonomics Society Annual Meeting 56 (1): 1723–27. https://doi.org/10.1177/1071181312561345.
Xiong, Cindy, Cristina R. Ceja, Casimir J. H. Ludwig, and Steven Franconeri. 2020. “Biased Average Position Estimates in Line and Bar Graphs: Underestimation, Overestimation, and Perceptual Pull.” IEEE Transactions on Visualization and Computer Graphics 26 (1): 301–10. https://doi.org/10.1109/TVCG.2019.2934400.
Zhao, Yifan, Dianne Cook, Heike Hofmann, Mahbubul Majumder, and Niladri Roy Chowdhury. 2013. “Mind Reading: Using an Eye-Tracker to See How People Are Looking at Lineups.” International Journal of Intelligent Technologies & Applied Statistics 6 (4): 393–413. https://doi.org/10.6148/IJITAS.2013.0604.05.

Questions?