Croissant Charts: Modulating the Perform

Croissant Charts: Modulating the Performance of Normal Distribution Visualizations with Affordances
License: CC BY 4.0
arXiv:2604.04432v1 [cs.HC] 06 Apr 2026
\SpecialIssueSubmission
\CGFccby
\BibtexOrBiblatex
\electronicVersion
\PrintedOrElectronic
Croissant Charts: Modulating the Performance of Normal Distribution Visualizations with Affordances
Submission 1004
R. Fygenson
\orcid
0000-0002-0705-9000
E. Bertini
1,2
\orcid
0000-0002-9932-0551
and L. M. Padilla
\orcid
0000-0001-9251-5279
Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
College of Arts, Media and Design, Northeastern University, Boston, MA, USA
Abstract
Affordances, originating in psychology, describe how an object’s design influences the physical and cognitive actions users may take.
Past work applied affordance theory to visualization to explain how design decisions can impact the cognitive actions of visualization readers.
In this work, we demonstrate that affordances can complement effectiveness
rankings
by further explaining the root causes behind visualizations’ task performance. To do so, we conduct a case study on
static normal
probability density function plots, identifying their current affordances. Next, we
identify
the optimal affordances for a common probability-comparison task and develop a novel affordance-driven visualization
, the Croissant Chart, to support them.
We empirically validate the design’s effectiveness through a preregistered study (
= 808), demonstrating how affordances can inform predictable changes in task performance. Our findings underscore the potential for affordance-based approaches to enhance visualization effectiveness and inform future design decisions.
{CCSXML}

10003120.10003145.10011769
Human-centered computing Empirical studies in visualization
500

10003120.10003145.10011770
Human-centered computing Visualization design and evaluation methods
300

10003120.10003145.10011768
Human-centered computing Visualization theory, concepts and paradigms
300

\ccsdesc
[500]Human-centered computing Empirical studies in visualization
\ccsdesc
[300]Human-centered computing Visualization design and evaluation methods
\ccsdesc
[300]Human-centered computing Visualization theory, concepts and paradigms
\printccsdesc
Introduction
Affordance
, rooted in psychology, refers to the potential actions—both physical and mental—that users may take based on the design of an object
hartson-1999-user-actionframework
gibson-ecological-vis-perception
Kaptelinin_aff_encyclo
In the context of visualizations,
mental actions
are the cognitive operations readers perform when interpreting a visualization, such as comparing values, grouping elements, or estimating trends.
Cognitive
affordances
, or the relationship between the design of an object and the knowledge that is imparted upon the object’s user
Hartson-2003
, can improve understanding of
visualization
design implications by complementing task precision evaluations
fygenson-cog-affordance-framework
. Because cognitive affordance offers an alternate lens of understanding how visualization designs impact mental actions, it can also
suggest specific mental actions that
readers
are likely to use
. In this paper, we demonstrate cognitive affordances
’ utility by using them to
connect visualization designs to readers’ mental actions.
This paper illustrates how
affordance-based visualization design can
improve
task accuracy
and
reveal the root cause of differences in task performance.
Although, affordances are well-studied in psychology and human
computer interaction (HCI) (for a discussion, see
Kaptelinin_aff_encyclo
), they have only recently received attention in visualization research. Fygenson et al. translate affordance theory to visualization, characterizing cognitive affordances for visualization, summarizing related constructs, and proposing a framework to scaffold future research and discussion
fygenson-cog-affordance-framework
. We extend this work by applying cognitive affordances to visualization design and evaluation, laying the groundwork for future research.
To validate
cognitive
affordances’ utility, we demonstrate how they can support reasoning about likely causes of interpretation errors
, such as
mathematically
incorrect
mental actions
, and isolate specific elements of visualizations that are problematic.
We then use
affordances to reason about design decisions that are likely to cue
ideal
mental actions
for a task. We investigate
affordances’
utility through a case study of
static (i.e., non-interactive)
probability density function plots (PDFs). Prior research has identified systematic errors associated with
static
PDF interpretation
fygenson-padilla-pdf-scaling
, yet established methods for optimizing visualization design, such as the rankings of expressiveness and effectiveness
munzner2015visualization
, have not been systematically applied to assess or improve PDFs. Moreover, established methods have not been used to assess alternative visualizations that are known to improve accuracy in probability-related tasks, such as quantile dot plots (QDPs)
kay-qdps
fernandes2018-qdp-cdf
. QDPs’ superior performance indicates that alternative affordances may better support probabilistic reasoning than PDFs. Through an affordance-based analysis of
these non-interactive
visualizations, we systematically examine how
cognitive
affordances shape
probability distribution
interpretation, offering the following key contributions:
an analysis of PDF visualizations, demonstrating how an affordance
can be used to redesign PDFs for a specific task
hypotheses about the underlying causes of
a known
reasoning error with PDFs
fygenson-padilla-pdf-scaling
and predictions as to why these effects will not extend to QDPs
an affordance-based evaluation of cases where QDPs may
lead to
interpretative challenges
a preregistered, empirical evaluation of the effectiveness of our redesigned PDFs, traditional PDFs, and QDPs
that
link
performance differences to our affordance-driven hypotheses
These contributions advance the understanding of how
cognitive
affordances shape probabilistic reasoning.
Background
2.1
Cognitive affordances in visualization
Theoretical foundation
Ecological psychologist J.J. Gibson first introduced affordances in the late 1970s to define the relationship between an object and its user in terms of action possibilities (i.e., ways in which an individual could use an object)
gibson-ecological-vis-perception
. Over the following decades, researchers in HCI and psychology adopted affordances to describe how the shapes of physical items imply their possible and intended uses
norman-psych-of-everyday-things
norman-1999
, and later, how digital interfaces imply possible actions (e.g., clicking a button)
Kaptelinin_aff_encyclo
Visualization research has investigated the affordances of user interactions, studying visual cues that incite users to hover over chart elements to reveal more information
boy-interactive-vis-aff
, and examining how users interact with dynamic physical bar charts
taher2015physicalaffbarchart
. However, this work does not apply to non-interactive charts, which comprise much visualized information
boy-interactive-vis-aff
A recent subset of work investigates the non-physical affordances of visualizations, exploring how design impacts the cognitive actions of visualization readers, and thus the information that they are most likely to glean. Fygenson et al. provide an overview of this work, and name these non-physical affordances
cognitive affordances
fygenson-cog-affordance-framework
They define cognitive affordances in visualization similarly to Norman’s affordance definition (i.e., all possible actions that a visualization’s design enables a reader to take)
norman-psych-of-everyday-things
, and specify that cognitive affordances are: (1) neither binary nor mutually exclusive, and (2) a factor of both a visualization’s design and its reader.
Examples of cognitive actions that can be afforded include taking away messages about, and interpreting characteristics of, visualized data.
They
also
hypothesize that affordance-based evaluations can provide stronger logical reasoning for uncovering the root cause of misleading designs and suggesting successful alternatives.
In this paper, we present an
initial validation of
their
hypothesis.
Investigative methods.
Within visualization, the investigation of affordances, or affordance-related concepts, has been supported via several methods (for a review, see
fygenson-cog-affordance-framework
). The most popular method
is a free-response question in which participants are shown a visualization and asked
open-ended questions about what they see (e.g., “describe in a sentence
what is shown”)
quadri-doyouseewhatisee
carswell-spontaneous-interpretation
shah-graph-comprehension
zacks-tversky-bars-lines
. This method can be modified by scoping such questions to specific tasks
xiong-afford-comparison
, or by restricting codes of interest when thematically coding collected responses
shah-1995-comprehending-line-graphs
. In our pilot studies (Sec.
3.1
) and affordance evaluation (Sec.
3.5
and
3.8
), we employ the free-response technique because of its exploratory nature and conventional coding.
Effectiveness rankings.
Visualization design decisions and pedagogy are often supported by
frameworks
that rank marks and channels by their ability to precisely communicate encoded data
munzner2015visualization
ware-info-vis-book
heer-bostock-repr
cleveland_mcgill_2012
cleveland-mcgills-shape-param-graphs
isenberg-sys-review-vis-techniques
. Extensions of precision-focused work investigate effectiveness from similar performance-based lenses, including accurate recall
borkin-beyond-memorability
, and minimal response time
livingston-eval-multivar-vis
isenberg-sys-review-vis-techniques
. Effectiveness
rankings
are useful in evaluating common charts (e.g., bar and pie charts) and have been translated into automatic chart recommendation systems
mackinlay-auto-design
besher-feiner-autovisual
. Effectiveness
rankings describe
the encodings that most precisely support a task, but offer little guidance as to the tasks that unprompted readers are most likely to complete (i.e., cognitive affordances). Past work has found that task effectiveness can correlate with visualizations’ cognitive fit (i.e., how well a visualization corresponds to readers’ mental representations of information)
vessey-cog-fit
vessey-cog-fit-empirical-study
, suggesting a relationship between readers’
mental actions
, visualizations’ designs, and effectiveness.
The logical scaffolding of affordance-based investigations sets them apart from traditional performance evaluations by providing alternative diagnostic power. Response time and accuracy evaluations can illuminate a visualization’s poor performance and support its redesign through frameworks such as the effectiveness and expressiveness rankings
munzner2015visualization
. Although these frameworks are useful, their guidance does not always increase visualizations’ interpretability
bertini-all-chart-not-scatterplot
, or consistently change the information that readers are likely to glean. Cognitive affordances, on the other hand, can provide logical conclusions about the underlying cause of correct or incorrect interpretations and, as we show in this paper, motivate redesigns that correlate with improved performance.
Affording
mental actions
Low-level tasks, such as comparing the lengths of two shapes or reading an axis label, contribute to higher-level takeaways and decision-making
bongshin-lee-task-taxonomy
padilla2018decision
By describing the probable
lower-level mental actions of
visualization readers, cognitive affordances can shed light on the driving factors behind readers’ interpretations.
For example, past research has investigated how bar chart affordances can inform the likelihood of comparative takeaways
fygenson2023affordances
xiong-afford-comparison
xiong-grouping-cues
, finding
bar
placement can significantly change the comparisons and higher-level takeaways that readers report. Afforded cognitive actions can also drive readers’
mental actions
when they encounter an unfamiliar graph. For example, if a reader sees a PDF for the first time, its continuous edge and aesthetic similarity to a line chart
(see Fig.
, top)
may afford extracting height information
as one would do when reading a line chart
Fig.
, row a and b
). In turn, the reader might adopt a height-focused strategy and characterize the PDF by its maximum height.
2.2
Probability Density Visualizations
Communicating probabilities is a common requirement when disseminating experimental results
padilla2022know
, and probability distributions can be visualized in numerous ways
padilla_review2022
. Box plots, and confidence, standard deviation, and histogram intervals display key statistical properties of distributions through visual representations of summary statistics. Despite their long-standing popularity in scientific communication for both experts and the general public
van2019communicating
Ross-intro-stats
, these visualization methods do not display the entire shape of a distribution, leading to a loss of statistical detail
correll2014error
. Additionally, prior research finds that interval visualizations and box plots induce errors in which readers incorrectly assume that values inside visualized boundaries are more probable than those just outside the boundaries, thereby misinterpreting encoded
probability
correll2014error
joslyn2021
. Visualizations that fully encode distributions’ shape instead of only showing key statistical metrics can lead to more correct conclusions
greis-uncertainty
Many visualizations that display distribution shape, such as PDFs, violin, raincloud, and ridgeline plots, encode probability as the area under a curve. However, unlike other Cartesian-coordinate plots, these visualizations’ y-axis values do not readily communicate useful information
Ross-intro-stats
. Instead, extracting cumulative probabilities (i.e., the probability of a random variable being above or below a threshold or between two bounding thresholds) from these charts requires finding the relevant slice of the area under the curve and calculating the percentage of total area under the curve that the slice occupies. This calculation can be challenging, especially when a distribution is not uniform and the areas in question are nonpolygonal
fygenson-padilla-pdf-scaling
Modern visualization designs, such as quantile dot plots (QDPs)
kay-qdps
and hypothetical outcome plots (HOPs)
hullman-hops
, seek to maintain the same communication of distribution shape as area-encoded visualizations, while also increasing audience understanding through the use of visual frequency framing
(i.e.,
expressing probabilities as natural frequencies). This method of communicating odds, such as “6 out of 100 times” rather than “6%”, can improve comprehension of
odds
in textual contexts
cosmides-tooby-frequency-framing
. Frequency framing has inspired the discrete communication of probability as counts of marks instead of continuous areas
in QDPs and HOPs
fernandes2018-qdp-cdf
hullman-hops
Members of the general public
are more likely to correctly interpret frequency-framed visualizations than PDFs in specific contexts
kay-qdps
fernandes2018-qdp-cdf
hullman-hops
kale2020-qdp
Although they communicate more information than interval visualizations
padilla_review2022
, these discretized visualizations still sacrifice detail in exchange for facilitating human reasoning. HOPs, for example, rely on animation, making them unsuitable for nondigital formats and requiring more time to extract precise information
kay-qdps
. QDPs trade some statistical resolution by using discrete dots to encode continuous distributions. This limitation is particularly relevant in low (20)-quantile QDPs, which prior studies suggest are more effective than higher (100)-quantile versions
kay-qdps
Equal-height scaling vs equal-area scaling.
Statistical plotting software typically defaults to generating equal-area PDFs, such that the area under each PDF curve occupies the same number of pixels. Because each of these areas equals 1 (i.e., 100% probability), equal-area scaling ensures all PDFs have the same probability-to-pixel ratio, allowing simple area judgments to facilitate cumulative probability comparisons (see Fig.
, row c). However, for area-encoded distribution visualizations (e.g., PDFs, violin plots, ridgelines), equal-area scaling can cause occlusion when some distributions are much narrower (i.e., have lower standard deviation) than others
(see Fig.
, a)
. Spacing PDFs to accommodate tall distributions can use substantial vertical space
(Fig.
, b) which is often restrictive to authors working within page limits, so visualization authors may choose to scale PDFs to the same height to save space (Fig.
, c).
This equal-height option is not mathematically incorrect
and is popular enough to be supported by
statistical plotting software (e.g., STAN bayesplot
bayesplot-r
, ggdist
ggdist
). Still, past research establishes that equal-height PDFs lead to more incorrect judgments than equal-area PDFs, for
members of the general public
making simple cumulative probability comparisons
fygenson-padilla-pdf-scaling
Figure 1:
Top: The task we test. Bottom: PDFs’ afforded
mental actions
from our pilot study. Some actions yield correct results for equal-area PDFs (left), but not for equal-height PDFs (right).
See Section
3.1
for more detail.
Notably, the impact of equal-height scaling on other probability distribution visualizations remains unexplored. However, we hypothesize that visualizations relying on frequency framing may be more robust to scaling as they encourage
mental actions
independent of mark height or area. Additionally, visualizations that inherently maintain constant height, such as interval plots and box plots, fall outside this discussion, as they naturally have equal heights.
Figure 2:
Why would someone scale PDFs to equal-heights? (a) different variances can cause overlap and occlusion (b) separating them requires more space (c) equal-height scaling yields a compact, statistically correct figure.
Study Materials and Methods
Scaling pairs of PDF plots to equal heights decreases
members of the general public
’s ability to compare the distributions’ cumulative probability below a provided threshold (see Fig.
fygenson-padilla-pdf-scaling
. Motivated by these results, we examine
the utility of affordance theory
, as discussed in Section
2.1
, in
evaluating equal-height PDFs and designing alternate visualizations
To do so, we proceeded in four stages. First, we analyzed pilot free-response data and relevant literature to identify possible cognitive affordances of PDFs and related distribution visualizations (Section
3.1
). We assessed these cognitive affordances’ alignment with the cumulative probability comparison task we investigate in our experiment. Second, we defined ideal affordances for the task and hypothesized which design features may better support them (Section
3.2
). Third, we developed a novel, affordance-informed visualization to more strongly cue correct mental actions (Section
3.2
). Finally, we preregistered and conducted an empirical study to test whether our predicted affordances correspond to measurable differences in task performance (Sections
3.3
through
3.8
).
3.1
Affordances of probability density visualizations
Pilot Data.
We collected pilot data to hypothesize about the affordances of different probability density visualizations by asking participants who completed simple tasks with normal probability visualizations
, such as comparing the odds of two distributions,
to “please describe in as much detail as possible how [they] interpreted the graph and what strategies [they] used to answer the questions.” We conducted these pilot studies via Prolific.com with participants
who had Prolific approval rate over 90%, resided in the U.S., were fluent in English, and over the age of 18 (291 female, 292 male, 14 non-binary/third gender, 4 did not report gender). These responses were between 5 and 278 words, with a median word length of 27 words.
These pilot studies allowed us to refine the measures for our main study and gain insights into afforded strategies for comparing distributions.
We recruited 601 participants for the pilot study. Participants were randomly assigned to view either
normal PDFs (
= 300) or normal QDPs (
= 301).
We analyzed responses by thematically coding them to identify recurring concepts associated with each visualization. To do so, (1) an author reviewed all responses, (2) responses were
manually
grouped based on
conceptual
similar
in the reported
sentiments, and (3) overarching themes were identified within these groupings.
Affordances of PDFs.
To illustrate how affordance evaluations can support visualization redesigns, we selected a visualization-task use case that has been shown to lead to errors.
Members of the general public
are consistently worse at comparing cumulative probabilities when using equal-height PDFs (see Fig.
fygenson-padilla-pdf-scaling
. We investigated this visualization-task combination because it presents a clear judgment error, involves a sufficiently complex visualization to warrant redesign, and has alternative visualizations (e.g., QDPs) that have demonstrated high performance in other tasks. By focusing on a specific scaling issue, we provide a case study that exhibits the process of using affordances and their generative potential.
Our thematic coding of pilot data identified several
mental actions
that likely led to incorrect calculations.
Some participants
reported comparing the heights of the normal PDFs, either at their mean (see Fig.
row a) or at the threshold of a cumulative probability (row b).
Height-comparing
could be driven by PDFs’ visual similarity to line charts
(see Fig.
top).
Although this
action
reflects an incorrect understanding of PDFs, it is worth noting that when PDFs are scaled to equal
areas
, height comparison can still result in correct comparisons (Fig.
row a, left). When PDFs are scaled to equal
heights
(Fig.
row a, right), however, height comparison no longer ensures accurate cumulative probability comparisons.
Participants’ also compared partial areas that correlated with the threshold
(Fig.
row c).
Area-comparing
can be afforded by participants’ interpretation of the PDF as an area chart, and although closer to a “correct” mathematical understanding of PDFs, also only yields correct judgments when the charts are scaled to equal areas.
Lastly, participants compared overall PDF spread (Fig.
row d) and slope (row e). Both
mental actions
commonly led to incorrect comparisons. Participants who used PDF spread reported wider graphs having higher probabilities for our task, even though the opposite is true. Participants who used slope comparisons also fell susceptible to incorrect interpretations, such as “… whichever curve was more steep was normally the one with less probability,” which is the opposite of the case. These
shape-comparing
actions
are robust to equal scaling; y-axis scaling does not change spread or significantly alter comparisons of PDF slope. Still, they led to incorrect conclusions. In summary, the first phase of our affordance evaluation identified three primary
mental actions
that contribute to reasoning errors with equal-height PDFs: height-comparing, area-comparing, and shape-comparing.
Figure 3:
QDPs’ afforded mental actions from our pilot study.
Affordances of QDPs.
A review of QDP pilot data revealed one predominant afforded strategy. Participants often reported
counting
and then comparing the number of dots on the right of the cumulative threshold for each QDP
(Fig.
, top)
This strategy is robust to equal-height scaling because height manipulations do not affect dot counts (see Fig.
for equal-height QDPs
). QDPs’ strong counting affordance is indicative of their frequency-framed motivation, as participants interpreted each dot as part of a greater whole (e.g., 5 dots out of 10 total).
Additionally, some participants mentioned confusion when quantile dots did not align with an x-axis value of interest, indicating that they perceived each dot as a single probability, as opposed to an approximated continuous distribution
(Fig.
. This discrete interpretation can lead to erroneous conclusions, especially when distributions have wide spreads that lead to large spaces between dots. This affordance aligns with past research on the nature of individual areas (e.g., bar charts) affording
readers to interpret data as discrete
and connected dots (e.g., line charts) affording
readers to interpret data as continuous
zacks-tversky-bars-lines
shah-graph-comprehension
Ideal affordances for equal-height PDFs.
After reviewing our pilot analysis, we identified ideal and undesired affordances from both PDFs and QDPs. Most significantly, QDPs afford counting, which makes them robust to equal-height scaling. PDFs lack this affordance, which we hypothesize contributes to incorrect judgments when scaled to equal heights (see Fig.
). However, QDPs also afford distributional discreteness much more than PDFs, which can lead to incorrect comparisons when two distributions share similar standard deviations (see Fig.
). Thus, we identified that, for our cumulative probability task, an ideal visualization would afford both counting marks as parts of a whole and continuity.
Next, we
we constrained our design exploration to established static visualization techniques that have been empirically shown to support part-to-whole reasoning among members of the general public. Contemporary literature on part-to-whole and probability communication has strong empirical support for two approaches: icon arrays and treemaps. Icon arrays (i.e., frequency-framed displays) have been widely shown to improve probability comprehension by visually representing both the reference class and affected subset, thereby supporting accurate part–to-whole reasoning and reducing denominator neglect, especially among less numerate individuals (for reviews, see
ancker2006design
garcia2017designing
). Additionally, empirical work on rectangular treemaps suggests that spatial subdivision techniques can support accurate part–to-whole judgments. Although treemaps encode values using area, controlled experiments demonstrate judgment accuracy comparable to hierarchical bar charts at certain data densities and, in some cases, faster comparisons
kong2010perceptual
Based on these prior works,
we identified two design choices that we hypothesized afforded counting parts of a whole:
1.
an “icon array” technique in which colored glyphs are used to signal some items out of a larger set, as is done with QDPs, and
2.
a “spatial subdivision” technique in which a shape is divided by thin lines into smaller shapes, as in tree maps.
Of these two techniques, only the latter has the potential to afford continuity. Currently, this spatial subdivision technique communicates probability distributions through standard deviation or histogram intervals by encoding quantiles or standard deviations as abutting, equal-height bars
yang2023swaying
fernandes2018-qdp-cdf
. Although these charts are robust to vertical scaling and communicate a continuous distribution across a range of values, interval visualizations are susceptible to reasoning errors (for reviews see
joslyn2021
padilla_review2022
) and are less “expressive” than alternative visualizations, which more readily communicate details of outliers and distribution shape
padilla_review2022
. Thus, we concluded that spatial subdivision is currently employed in charts that do not show the distributional information we desire.
3.2
An affordance-motivated redesign
Above, we established our goal of affording 1) counting dots as parts of a whole, 2) continuity, and 3) a general distributional shape to maintain expressiveness. Below, we present an affordance-informed redesign of PDFs:
Croissant charts
(see Fig.
).
Figure 4:
Croissant charts’ intended affordances. PDF shape for expressiveness. Minimal slices with gaps afford continuity and parts of a whole. Dots afford counting and equality across slices.
Affording counting parts of a whole.
To afford counting strategies, we used two approaches: spatial subdivision and marks that encourage counting. Subdivision cuts single shapes into multiple parts to communicate that the parts originate from one entity. Inspired by QDPs, we slice the PDF into quantiles, such that each slice represents an equal probability
kay-qdps
. However, we only display vertical slices to prevent the misinterpretation that horizontal slices of area under a PDF’s curve lead to logical conclusions.
Additionally, we place a single dot in each quantile slice, in case QDPs’ dots afford counting due to their shape. We also hypothesize that dots help mitigate the strong area, height, or width strategies we see in our pilot data on PDFs. We hypothesize that these dots correctly afford equal probabilities across each slice
as a visual extrapolation of the equiprobability bias (i.e., people assume equal probabilities across items)
gauvrit-eqprob
. We also use the dots to indicate equal probability in the charts’ legend for added clarity. We darken the border of each section and each dot to increase their saliency at smaller scales.
Affording a continuous distribution.
To maintain perceptual continuity, we minimize area removal when subdividing the PDF. This decision draws on Gestalt principles of closure
wagemans-gestalt-perception
, and preserves as much of the PDF shape as possible. However, we include slight padding around each slice to ensure visual separation.
Affording general distributional shape.
To maintain the expressiveness with which PDFs display distributional shapes, we retain their density function curve rather than making each quantile the same height as in interval plots, or adopting a discretized distribution shape as in QDPs. We observed in pilot data that dots placed in QDPs’ tails can skew readers’ perception of a distribution’s continuity and sometimes afford multimodality. Additionally, past research shows that bin width and dot size can influence the shape of QDPs
correll-prob-density2019
. By preserving the continuous shape of PDFs, we aimed for greater resilience against these distortions.
3.3
Investigative Questions and Hypotheses
We preregistered three main hypotheses about differences in reader performance when using PDFs, QDPs, and croissant charts on OSF. This preregistration
and a visualization of our hypotheses for easier reference are available in Supplemental Materials
The first hypothesis is that
(H1) equal-height PDFs will result in a lower likelihood of correct cumulative comparisons than equal-area PDFs
. This hypothesis is based on results from previous work
fygenson-padilla-pdf-scaling
, and its confirmation would replicate prior findings while checking for methodological validity.
Our second
set of
preregistered hypotheses examines whether visualizations that afford counting can mitigate the performance decline observed when PDFs are scaled to equal heights.
We preregistered four hypotheses concerning counting-affording designs.
Performance hypotheses:
We hypothesized that
(H2a)
equal-height Croissant-10s,
(H2b)
equal-height Croissant-20s, and
(H2c)
equal-height QDPs (20 quantiles)
would each result in a higher probability of correct cumulative comparisons than equal-height PDFs.
Robustness-to-scaling hypothesis:
We further hypothesized that
(H2d)
QDPs, Croissant-10s, and Croissant-20s would exhibit a smaller decrease in correctness when moving from equal-area to equal-height scaling than PDFs.
Our last experimental hypothesis is motivated by our theorizing that QDPs only weakly afford a continuous distribution. We preregistered that
(H3) QDPs’ relationship between standard deviation (SD) pair comparisons and correctness will differ from other visualizations’ relationship between SD and correctness, specifically in the 4.5 vs. 5 SD condition
. In other words, QDPs’ performance would drastically change in the 4.5 v 5 SD condition in a way that other visualizations’ performance would not. We rationalized that when two distributions’ SDs are similar, their QDPs’ dots will be stacked similarly and moved only slightly along the x-axis in ways that may be too subtle for readers to recognize.
3.4
Stimuli
When comparing of cumulative probabilities of equal-area PDFs, readers can rely on heuristics, such as PDF height, to make accurate judgments. For equal-height PDFs, however, height comparison can lead to incorrect conclusions; other strategies, such as counting, offer a more robust approach. Following our affordance analysis of PDFs, we tested three types of visualizations: PDFs, which our pilot data suggest strongly afford continuity but only weakly afford counting, QDPs; which our pilot data suggest strongly afford counting but weakly afford distribution continuity; and croissant charts, which we designed to afford both continuity and counting.
We test QDPs with 20 quantiles because they have been shown to result in better performance than their 100-quantile counterparts
kay-qdps
. We test croissant charts with 20 quantiles (Croissant-20s), as mathematical equivalents to our QDP stimuli, and croissant charts with 10 quantiles (Croissant-10s). Our pilot data showed that Croissant-10s outperform croissants with five quantiles, so we wanted to study how Croissant-10s compare to Croissant-20s.
For all four graphs (PDFs, QDPs, Croissant-20s, and Croissant-10s), we created stimuli with two normal probability distributions stacked vertically (as shown in Fig.
, top). These stimuli were then scaled to either “equal area” or “equal height” (see Fig.
, left panel). For PDFs and both croissants “equal area” meant that the area below the two curves had the same number of pixels, and “equal height” meant that the distance from the highest point on the curve to the x-axis was held constant. For QDPs, “equal area” ensured that both graphs’ dots were the same pixel size, while “equal height” maintained a constant distance from the top of the highest dot to the x-axis. In narrower distributions, this adjustment reduced dots’ diameter to preserve their circular shape.
For each of our
eight
stimuli types, we created four sets of distributions with varying SDs (see Fig.
, right panel). The first set presented a distribution with an SD of 2 and a distribution with an SD of 5. The rest compared distributions with SDs of 3 and 5, 4 and 5, and finally 4.5 and 5. We chose the first three of these SD comparisons because they were shown to have varying performance in previous work
fygenson-padilla-pdf-scaling
. We added the 4.5 vs. 5 SD comparison, because we believed it could highlight QDPs’ weaker affordance of a continuous distribution, as explained in H3 in Section
3.3
. For each of the SD combinations we created two sets of stimuli, such that the vertical position of the narrower (higher SD) distribution was counterbalanced (see Fig.
, bottom right). This resulted in eight stimuli of each scaled visualization, for a total of 64 tested stimuli. We created all stimuli via custom D3.js functions, and edited (e.g., scaled, relabeled, etc.) them in Adobe Illustrator. Stimuli and generating scripts are available in Supplemental Materials.
Figure 5:
Stimuli overview. Left: Between-subjects; different visualization techniques (PDF, QDP, Croissant-10, Croissant-20) under two scaling conditions (Equal Area, Equal Height). Right: within-subjects; standard deviation pairs and position variations.
3.5
Experimental Design
We designed a 4 (
Visualization
) × 2 (
Scaling
) × 4 (
SD Pairs
) × 2 (
Position
) mixed-subjects design
(see Figure
).
The between-subjects variables were
Visualization
and
Scaling
. The within-subject variable of interest was
SD Pairs
. We included
Position
to counterbalance our stimuli and increase statistical power. We did not hypothesize about
Position
’s impact, and included it as a covariate in our analysis.
We investigated task correctness and collected participants’ qualitative responses to observe afforded strategies.
3.6
Participants
We recruited U.S.-based participants over 18 through Prolific
prolific
, which anonymizes, qualifies, enrolls, and compensates participants, enabling a double-blind study.
As per our preregistered power analysis, we recruited 808 participants (
= 101 per between-subject group). Because affordances depend on readers’ individual characteristics
fygenson-cog-affordance-framework
, we crowdsourced responses from people with varying backgrounds. To account for variance in visualization education, we collected participants’ graph literacy scores using the Short Graph Literacy test
okan-sgl-2019
3.7
Procedure
Participants were informed of the estimated duration and compensation (12 USD/hour) and consented to an IRB-approved form, after which they completed the study on Qualtrics
qualtrics
. First, participants read instructions and were informed of a scenario in which scientists are measuring the concentration of two different solutes in seawater. This scenario was identical to that used in previous PDF comparison work
fygenson-padilla-pdf-scaling
, and adapted from a study on HOPs
hullman-hops
. Then, participants
were asked to answer
an initial attention check
question to confirm they were reading the instructions. Next,
they were instructed to make the size of their browser window as large as possible and shown an annotated illustration of an example plot. Next, participants answered eight questions, one for each pair of charts. Following previously published procedures
fygenson-padilla-pdf-scaling
, all the charts were presented in randomized order, and the questions read “Which solute, if either, has a
higher probability
of being present at
X or less ppm
in the sampled seawater?” where X was in the same position across all stimuli (at threshold
in Figure
), but varied depending on the numbering of each stimulus’s x-axis. Participants chose from three options (“solute A has a higher probability”, “solute B has a higher probability,” and “neither solute A nor solute B has a higher probability”). The correct option was always the solute with a lower SD. This task does not represent the full range of probability distribution use cases, but allows us to investigate the extent to which affordances can impact strategies such that specific task performance is modulated.
After completing the chart comparison questions, participants used an open-ended text response to report their thought processes while reading the charts. Finally, participants completed Okan et al.’s Short Graph Literacy test
okan-sgl-2019
. The survey and its stimuli are available in Supplemental Materials.
3.8
Analysis
We preregistered the binomial Bayesian model in
Equation 1
with uninformative priors centered at 0 and a standard deviation of 2.5. We used R packages
tidyr
v. 1.3.1 for data processing
r-tidyr
brms
v. 2.20.4 for Bayesian modeling
r-brms
, and
tidybayes
v. 3.0.6 for data processing and visualization
r-tidybayes
As denoted in Eq.
, our model assesses the variance in participants’ binary correctness as explained by the interaction between
Visualization
and
Scaling
, and the interaction between
Visualization
and
SD Pair
, as well as their lower-order terms.
In this model,
p_{i}=P(Y_{i}=1)
represents the posterior probability that participant
makes a correct comparison. Because we use a logit link,
logit
log
\text{logit}(p_{i})=\log\!\left(\frac{p_{i}}{1-p_{i}}\right)
, all fixed-effect coefficients are estimated on the log-odds scale.
Variable levels are described in Sec.
3.5
. We also account for any variance explained by
Position
and
Graph Literacy
\displaystyle Y_{i}
Bernoulli
\displaystyle\sim\text{Bernoulli}(p_{i})
(1)
logit
\displaystyle\text{logit}(p_{i})
Visualization
Scaling
\displaystyle=\beta_{0}+\beta_{1}\text{Visualization}_{i}+\beta_{2}\text{Scaling}_{i}
Visualization
Scaling
SDPair
\displaystyle\quad+\beta_{3}(\text{Visualization}_{i}\times\text{Scaling}_{i})+\beta_{4}\text{SDPair}_{i}
Visualization
SDPair
Position
\displaystyle\quad+\beta_{5}(\text{Visualization}_{i}\times\text{SDPair}_{i})+\beta_{6}\text{Position}_{i}
GraphLiteracy
\displaystyle\quad+\beta_{7}\text{GraphLiteracy}_{i}+u_{ID[i]}
We also conducted an exploratory analysis of each visualization’s affordances. Two raters qualitatively coded participants’ descriptions of their thought processes when completing the cumulative probability task. We then calculated the intraclass correlation (ICC) for each code, and the raters reconciled any codes that had an ICC 95% confidence interval with a lower bound less than 0.6.
Figure 6:
Posteriors of Eq.
show that QDPs are robust to equal-height scaling and perform best across most conditions except the 4.5 v 5 SD comparison, where they perform the worst. Equal-height croissant charts outperform equal-height PDFs. Distributions show posterior predicted probabilities of correct responses, transformed from log-odds for interpretability. Black dots and lines show mean predicted probabilities and 95% credible intervals.
Experimental Results
Below, we describe the results from our analysis of binary task correctness, the results from our exploratory analysis of affordances, and the relationship between observed affordance and task correctness. As in our preregistered justification, we recruited 808 participants (
= 101 for each between-subject group). 438 participants self-reported to not be men, with a mean age of 44.0 years (
SD
= 15.4). The mean short graph literacy score was 2.3 of 4 (
SD
= 1.1).
Participants took an average of 11 minutes and 5 seconds (
SD
= 7 min, 2 sec) to complete the survey.
As preregistered, we computed our model with and without participants who failed the attention check and performed a sensitivity analysis. There were no meaningful differences in the results between the two samples. For a large sample size and more conservative results, we report posteriors of the model that uses
responses from participants who passed and those who failed the attention check
4.1
Task Performance: Investigating Hypotheses
We investigated the amount of variance in correct comparisons based on visualization, scaling, and SD pairs with the binomial model in Eq.
. As preregistered, we interpret interaction effects with credible intervals that exclude zero as accounting for meaningful variance in task correctness. We use the term “meaningful” instead of “statistically significant” to avoid implying a fixed decision threshold (e.g.,
< .05), as advised by Bayesian best practices
kruschke2021bayesianreporting
. We ran our model with each combination of possible referents.
All analyses are available in Supplemental Materials.
To examine H1, if equal-height
PDFs
lead to meaningfully fewer correct comparisons than equal-area
PDFs
, we can look to
Scaling
’s main effect when
Visualization
’s referent is the
PDF
condition (row 1 in Table
).
Scaling
’s credible interval does not include zero, indicating that scaling has a meaningful effect on
PDF
task performance. By visually examining the posteriors from this model (Fig.
), the results show that equal-height
PDFs
(colored yellow in rows with gray backgrounds) result in fewer correct answers than equal-area
PDFs
(yellow in rows with white backgrounds) across all SD Pairs. Thus, we accept H1, and present evidence for the reproducibility of results from Fygenson and Padilla
fygenson-padilla-pdf-scaling
To examine H2a-d, how equal-height visualizations that afford counting impact task correctness in comparison to equal-height
PDFs
, we can refer to rows 5-16 in Table
. These rows show the main effects of
Visualization
when the model is run with
PDF
as the referent. Equal-height
Croissant-20s
resulted in meaningfully improved task correctness in comparison to equal-height
PDFs
across all SD pairs (rows 9-12), leading us to accept H2b. Equal-height
QDPs
meaningfully improved task correctness compared to equal-height
PDFs
across all SD pairs (rows 5–7), except for the 4.5 vs. 5 pair, where
QDP
performance was meaningfully worse, as indicated by the sign change in the last three columns of row 8. These findings support the acceptance of H2c, contingent on the considerations outlined in H3. Equal-height
Croissant-10s
resulted in meaningfully improved task correctness only over equal-height
PDFs
for SD pairs 2 vs. 5 and 4.5 vs. 5 (rows 13 and 16), leading us to partially accept hypothesis H2a. These results are also evident in Figure
’s gray-background rows, which show
PDFs
consistently trailing behind both
Croissant-20s
and
QDPs
To directly investigate H2d–whether
QDPs
Croissant-20s
, and
Croissant-10s
reduce performance differences between equal-area and equal-height scaling more effectively than
PDFs
–we can examine the interaction effects of
Visualization
and
Scaling
in rows 17-19 of Table
. The results indicate that
QDPs
meaningfully decrease the expected drop in performance associated with scaling from equal-area to equal-height in comparison to
PDFs
(row 17). However,
Croissant-10s
and
Croissant-20s
show no meaningful difference in their interactions with
PDFs
(rows 18-19). We do not find evidence to support the hypothesis that
Croissant-10s
and
-20s
can mitigate the performance drop associated with equal-height
PDFs
, and thus only partially accept H2d.
Figure 7:
Equal-height Croissant-10s and QDPs with an SD of 4.5 (tops) and an SD of 5 (bottoms).
Results show that Croissant-10s afford inter-edge interpolation much more than QDPs.
Lastly, to assess whether
QDPs
exhibit different task performance when comparing the 4.5 vs. 5 SD pair relative to other SD pairs, and if this performance differs from other visualizations, we analyze the interaction between
SD Pair
and
Visualization
. Specifically, we examine meaningful differences when running the model with a
Visualization
referent of
QDP
and
SD Pair
referent of 4.5 vs. 5. Rows 20-28 in Table
show relevant model outputs and are all meaningfully different. A meaningful difference in QDP task performance across SD pairs is further supported by the last two rows of Figure
, where
QDPs
perform markedly worse than other charts. This drop in performance is notable because
QDPs
perform the best in all other conditions. These findings lead us to accept (H3) that
QDPs
’ relationship between task performance for 4.5 v 5 SD and any other SD pair is different from
PDFs
’,
Croissant-20s
’, and
Croissant-10s
’. To further analyze QDPs’ performance in the 4.5 v 5
SD Pair
condition, we examine the main effect of
Visualization
when the model’s referents are set to
QDP
, 4.5 vs. 5 SD, and first Equal Height, then Equal Area. Rows 32-37 in Table
show that
QDPs
perform meaningfully worse than
PDFs
Croissant-20s
, and
Croissant-10s
when comparing a distribution with SD of 4.5 to a distribution with SD of 5, regardless of scaling.
In summary, our analysis reveals that equal-height
PDFs
lead to a lower likelihood of task correctness than equal-area
PDFs
, equal-height
Croissant-20s
improved chances of correctness over equal-height
PDFs
, although not as well as equal-height
QDPs
QDPs
were especially robust to scaling, and frequently outperformed other visualizations. However,
QDPs
led to meaningfully lower chances of correctly comparing distributions with SDs of 4.5 and 5, than all other visualizations. These findings highlight the benefits of
Croissant-20s
and
QDPs
over
PDFs
, while also demonstrating a specific weakness of low-quantile
QDPs
Table 1:
Rows report posterior log-odds from Eq.
contrasted between experimental conditions with 95% Bayesian credible intervals. Darker cell backgrounds indicate stronger effects. l-CI and u-CI denote lower and upper bounds of credible intervals, respectively. Cr-20 = Croissant-20; Cr-10 = Croissant-10. Odds ratios are obtained by exponentiating the reported log-odds estimates.
Referent Conditions
Comparison
Est
imate
l-CI
u-CI
Row
Vis
SD Pair
Scaling
PDF
N/A
Eq. Height
Eq. Area
1.26
0.78
1.75
QDP
N/A
Eq. Height
Eq. Area
0.11
-0.37
0.61
Cr-20
N/A
Eq. Height
Eq. Area
0.86
0.38
1.34
Cr-10
N/A
Eq. Height
Eq. Area
0.82
0.35
1.30
PDF
2 v 5
Eq. Height
QDP
1.80
1.24
2.37
PDF
3 v 5
Eq. Height
QDP
2.29
1.71
2.88
PDF
4 v 5
Eq. Height
QDP
1.45
0.88
2.02
PDF
4.5 v 5
Eq. Height
QDP
-0.99
-1.65
-0.34
PDF
2 v 5
Eq. Height
Cr-20
0.88
0.32
1.45
10
PDF
3 v 5
Eq. Height
Cr-20
0.71
0.15
1.28
11
PDF
4 v 5
Eq. Height
Cr-20
0.77
0.21
1.33
12
PDF
4.5 v 5
Eq. Height
Cr-20
0.90
0.32
1.49
13
PDF
2 v 5
Eq. Height
Cr-10
0.69
0.13
1.25
14
PDF
3 v 5
Eq. Height
Cr-10
0.49
-0.06
1.05
15
PDF
4 v 5
Eq. Height
Cr-10
0.34
-0.22
0.89
16
PDF
4.5 v 5
Eq. Height
Cr-10
0.60
0.02
1.19
17
PDF
N/A
Eq. Height
QDP x Eq. Area
-1.14
-1.83
-0.46
18
PDF
N/A
Eq. Height
Cr-20 x Eq. Area
-0.40
-1.09
0.27
19
PDF
N/A
Eq. Height
Cr-10 x Eq. Area
-0.40
-1.08
0.27
20
QDP
4.5 v 5
N/A
PDF x 2 v 5
-2.61
-3.17
-2.05
21
QDP
4.5 v 5
N/A
PDF x 3 v 5
-3.08
-3.66
-2.49
22
QDP
4.5 v 5
N/A
PDF x 4 v 5
-2.24
-2.80
-1.69
23
QDP
4.5 v 5
N/A
Cr-20 x 2 v 5
-2.58
-3.14
-2.04
24
QDP
4.5 v 5
N/A
Cr-20 x 3 v 5
-3.24
-3.82
-2.66
25
QDP
4.5 v 5
N/A
Cr-20 x 4 v 5
-2.35
-2.89
-1.79
26
QDP
4.5 v 5
N/A
Cr-10 x 2 v 5
-2.47
-3.03
-1.92
27
QDP
4.5 v 5
N/A
Cr-10 x 3 v 5
-3.15
-3.74
-2.58
28
QDP
4.5 v 5
N/A
Cr-10 x 4 v 5
-2.48
-3.02
-1.94
29
QDP
4.5 v 5
N/A
2 v 5
3.86
3.43
4.31
30
QDP
4.5 v 5
N/A
3 v 5
4.73
4.27
5.21
31
QDP
4.5 v 5
N/A
4 v 5
3.43
3.02
3.87
32
QDP
4.5 v 5
Eq. Area
PDF
1.92
1.32
2.53
33
QDP
4.5 v 5
Eq. Area
Cr-20
2.39
1.79
3.00
34
QDP
4.5 v 5
Eq. Area
Cr-10
2.07
1.47
2.68
35
QDP
4.5 v 5
Eq. Height
PDF
0.78
0.15
1.40
36
QDP
4.5 v 5
Eq. Height
Cr-20
1.69
1.09
2.31
37
QDP
4.5 v 5
Eq. Height
Cr-10
1.39
0.78
2.00
4.2
Affordances Described in Open-Ended Responses
Two raters qualitatively coded strategies from participants’ open-ended responses. The raters also identified responses likely AI-generated when they referenced elements not present in the stimuli (e.g., labels, y-axes). Of the 808 responses,
21 were identified by both and excluded from coding. We include the codes of all other responses for a more conservative analysis. All qualitative coding data and analyses are available in the Supplemental Materials.
The
review of open-ended responses
identified six strategies (mean ICC = 0.84, SD = 0.08): counting, height comparison, area comparison, spread comparison, judgment of how far into a visual mark a threshold occurred (i.e., “inter-edge interpolation”, Fig.
), and comparison of only marks directly above the threshold (i.e., “directly over tick”). We show the percentage of participants who mentioned each strategy by
Visualization
condition in Figure
The results indicate that
QDPs
afforded counting (68% of participants) more than any other visualization. This
affordance aligns with initial pilot data and
QDPs
’ frequency-framing motivation
kay-qdps
Croissant-10s
(23%) and
-20s
(33%) resulted in roughly
one-third to one-fourth
the number of counting strategies as
QDPs
, indicating a present, but weaker affordance.
’ counting affordance was exhibited in responses, such as “… I looked at the number of 5% ‘blocks’ [that] were on one side on both charts.” We did not observe any evidence that
PDFs
afforded counting.
Figure
also shows that there was very little evidence that
QDPs
afforded multiple strategies, in contrast to the other three visualizations.
This pattern indicates that
and
PDFs
result in greater flexibility in strategy use than
QDPs
Figure 8:
Percent of each visualization that led to each strategy (columns) across tested visualizations (rows).
Strategies are not mutually exclusive so rows and columns may not equal 100%.
4.3
Task Performance and Observed Affordances
The results in Figure
show that
QDPs
strongly afforded counting strategies,
Croissant-20s
and
-10s
weakly afforded counting, and
PDFs
did not afford counting at all. When looking at the posteriors of Equation
in Figure
, we can see that the difference in the strength of visualizations’ counting affordances corresponds with their likelihood of leading to correct cumulative probability comparisons between three of the four equal-height conditions we tested. These conditions are 2 vs. 5, 3 vs. 5, and 4 vs. 5 SD pairs, and their relative performance is remarkably consistent across equal-height conditions, which indicates that affording counting strategies could improve cumulative comparisons.
Additionally, our quantitative results show that
QDPs
lead to the lowest likelihood of correct comparisons of distributions with SDs of 4.5 and 5. At the same time, our results exhibit that
QDPs
lack diversity in their afforded
mental actions
PDFs
Croissant-20s
and
-10s
, on the other hand, afford a variety of strategies. For the 4.5 vs. 5 SD Pair condition, our experimental findings illustrate that visualizations that afford noncounting strategies lead to a higher chance of correct comparisons.
Lastly, equal-height
PDFs
perform meaningfully worse than equal-area
PDFs
across all SD Pair conditions, and
PDFs
’ most popular strategy is height comparison. To reason about this decrease, we can refer to the difference in height comparisons’ efficacy across equal-area and equal-height
PDFs
Although comparing the peak height of two equal-area
PDFs
leads to consistently correct judgments of our tested task, peak height comparison between equal-height
PDFs
does not (Fig.
, row a).
Discussion and Limitations
We present an affordance-motivated analysis of PDFs, followed by empirical evidence of the performance and affordances of area-encoded probability distribution visualizations. We do so through a case study of PDFs, which restricts the scope of our findings in the interest of examining the relationship between performance and affordance. Below, we detail the concrete contributions of this work, posit potential extrapolations of our findings, and discuss the limitations of our scoping and methodology, before describing future avenues of affordance research in visualization.
5.1
Contributions to Affordance Research in Visualizations
We analyze the relationship between performance and affordance evidence from our study to offer insights about the drivers behind previously documented comparison errors in PDFs
fygenson-padilla-pdf-scaling
and performance improvements from QDPs
kay-qdps
fernandes2018-qdp-cdf
kale2020-qdp
. We found a strong correlation between counting and equal-height visualizations’ performance, indicating that counting supports cumulative probability comparisons.
We believe that
QDPs
’ and
’ counting affordance provides indirect evidence of frequency-framing at work; however, we caution that this conclusion is circumstantial, as participants often struggle to fully articulate their cognitive processes
Nisbett1977-NISTMT
Another example of affordance-performance alignment is seen in
Croissant-10s
outperforming
QDPs
in the 4.5 vs. 5 SD condition, and
Croissant-10s
’ stronger affordance of “inter-edge interpolation.” This
term
refers to participants counting both the whole units prior to a threshold, and the partial amount of a visual mark that is bisected by a threshold (see Fig.
). Inter-edge interpolation suggests that participants recognize that probability does not change in discrete steps along the boundaries of visual marks, but instead forms a continuous distribution. One of our goals
in designing
croissant charts was to afford inter-edge interpolation, because we hypothesized that
QDPs
weakly afford this interpolation, which can lead to errors when comparing similar SD pairs.
We also see that
Croissant-20s
outperform
QDPs
in the 4.5 vs. 5 SD condition. This may be because
Croissant-20s
show higher resolution along the x-axis than the
QDPs
we tested. Although both charts encode 20 quantiles (i.e., each visual mark equals 5% probability),
QDPs
stack quantiles such that moving from one column to the next can jump the cumulative probability by more than one quantile, affording discrete changes that may not exist. When designing
, we sliced them into vertical quantiles, as opposed to
QDPs
’ vertical and horizontal binning technique. A benefit of this design is afforded continuity and increased detail along the x-axis such that close SD comparisons are more supported. On the other hand,
QDPs
’ binning technique affords counting much more strongly than
’, leading to
QDPs
’ superior performance across less similar SD Pairs. Additionally, increasing
QDPs
’ quantiles would increase their resolution, possibly improving their performance in the 4.5 vs. 5 SD condition.
5.2
Limitations and Future Work
We present a case study of probability visualizations that express distribution shape, which is inherently limited in scope.
Additionally, we only investigate normal
static
distributions because our case study is built on previous findings with normal distributions
fygenson-padilla-pdf-scaling
, and because non-normal distributions are significantly more difficult to interpret
peterson1964mode
. Future work should examine skewed and multimodal visualizations to increase the usefulness of this research. We also restrict our experiment to investigate a single, common task. Investigating other tasks, such as readers’ accurate estimation of cumulative probabilities, and decision-making, would expand understanding of how a single affordance may correlate to a range of tasks.
Additionally, we used open-ended responses to uncover afforded strategies, which restricts our affordance findings to strategies that participants are conscious of and able to report. Research in psychology cautions against soley using participant-reported evidence, because of their lack of correlation to participant decision-making
Nisbett1977-NISTMT
. Still, this open-ended reporting is useful for generating hypotheses and corroborating the confirmatory behavioral findings.
Lastly, expanding the study of the relationship between affordance and performance to other types of visualization and other kinds of information is warranted. This expansion would not only broaden our understanding of other visualizations and how their designs drive reader conclusions, but can also be used to hypothesize about visual techniques that can be adopted for probability communication. For example, spatial subdivision in pie charts and tree maps was instrumental to our affordance-motivated PDF redesign.
Conclusion
We present a case study of cognitive affordances and performance in probability distribution visualizations and demonstrate a relationship between the two metrics. In turn, our findings suggest that cognitive affordances not only exist in visualizations, but also can inform hypotheses about their task performance. This work contributes to the growing body of affordance work in visualization, and establishes empirical evidence for the utility of affordances in reasoning about visualization performance and
mental actions
Acknowledgments
This work was funded in part by NSF Grant #2428149.
References
BETA