time | topic |
---|---|
3:00-3:20 | What is your plot testing? |
3:20-3:35 | Creating null samples |
3:35-4:00 | Conducting a lineup test |
4:00-4:30 | Testing for best plot design |
What do you see?
✗ non-linearity
✓ heteroskedasticity
✗ outliers/anomalies
✓ non-normality
✗ fitted value distribution is uniform
Are you sure?
What do you see?
There is a difference between the groups
✓ location
✗ shape
✓ outliers/anomalies
Are you sure?
What is the null hypothesis?
There is no relationship between residuals and fitted values. This is \(H_o\).
Alternative hypothesis, \(H_a\):
There is some relationship, which might be
What is being tested in each of these plot descriptions?
Distribution of VAR1 is ?
There is no relationship between VAR1 and VAR2. More specifically, the proportion of VAR2 in each level of VAR1 is the same.
There is no relationship between VAR1 and VAR2. Particularly, VAR2 is not dependent on VAR1 and there is no trend.
Sampling distribution for a t-statistic. Values expected assuming \(H_o\) is true. Shaded areas indicate extreme values.
To make comparisons when plotting, draw a number of null samples and plot each of them with the same script as in the plot description.
\(H_o\): There is no relationship between residuals and fitted values.
How would you generate null samples?
Break any association by
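library(ggplot2)
library(nullabor)        # lineup(), null_permute()
library(palmerpenguins)  # penguins data
library(colorspace)      # scale_color_discrete_divergingx()

# Embed the data plot among 14 null plots with permuted species labels
set.seed(241)
ggplot(lineup(null_permute("species"), penguins, n=15),
       aes(x=flipper_length_mm,
           y=bill_length_mm,
           color=species)) +
  geom_point(alpha=0.8) +
  facet_wrap(~.sample, ncol=5) +
  scale_color_discrete_divergingx(palette="Zissou 1") +
  theme(legend.position = "none",
        axis.title = element_blank(),
        axis.text = element_blank(),
        panel.grid.major = element_blank())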
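The permutation idea behind a generator such as null_permute can be sketched in base R. This is a minimal illustration on a small made-up data frame (the names toy, group, value, and permute_group are hypothetical, not part of the workshop data or of nullabor): shuffling the group labels keeps both marginal distributions intact while breaking any association between them.

```r
# Hypothetical toy data: two groups with different locations.
set.seed(1)
toy <- data.frame(
  group = rep(c("a", "b"), each = 20),
  value = c(rnorm(20, mean = 0), rnorm(20, mean = 2))
)

# One null sample: permute the group labels, leave the values untouched.
permute_group <- function(d) {
  d$group <- sample(d$group)
  d
}

null_sample <- permute_group(toy)

# Group sizes and the measured values are unchanged; only the pairing differs.
table(toy$group)
table(null_sample$group)

# Difference in group means: observed data versus the null sample.
obs_diff  <- diff(tapply(toy$value, toy$group, mean))
null_diff <- diff(tapply(null_sample$value, null_sample$group, mean))
c(observed = unname(obs_diff), null = unname(null_diff))
```

Repeating permute_group several times gives the set of null plots that the data plot is hidden among.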
If 10 people are shown this lineup and all 10 pick plot 2, which is the data plot, the \(p\)-value will be effectively 0.
Generally, we can compute the probability that the data plot is chosen by \(x\) out of \(K\) observers, shown a lineup of \(m\) plots, using a simulation approach that extends from a binomial distribution, with \(p=1/m\).
This means we would reject \(H_o\) and conclude that there is a difference in the distribution of bill length and flipper length between the species of penguins.
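As a rough check on the size of this \(p\)-value, the plain binomial calculation with \(p=1/m\) can be written in base R. This sketch ignores any dependence between observers, so it is only the binomial baseline that the simulation approach extends; the function name lineup_pvalue_binom is hypothetical.

```r
# Binomial baseline for the lineup p-value: P(X >= x), X ~ Binomial(K, 1/m).
lineup_pvalue_binom <- function(x, K, m) {
  pbinom(x - 1, size = K, prob = 1 / m, lower.tail = FALSE)
}

# All 10 of 10 observers pick the data plot in a lineup of 15 plots.
p_all <- lineup_pvalue_binom(x = 10, K = 10, m = 15)
p_all
```

With every observer choosing the data plot, the value equals \((1/15)^{10}\), which is why it is reported as 0 for practical purposes.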
library(dplyr)       # %>%, mutate(), filter()
library(ggplot2)
library(nullabor)    # lineup(), null_permute(), wasps data
library(colorspace)  # scale_color_discrete_divergingx()

data(wasps)
set.seed(258)
wasps_l <- lineup(null_permute("Group"), wasps[,-1], n=15)
# Placeholders for the discriminant scores of each panel
wasps_l <- wasps_l %>%
  mutate(LD1 = NA, LD2 = NA)
# Fit an LDA separately to each panel and keep the first two discriminants
for (i in unique(wasps_l$.sample)) {
  x <- filter(wasps_l, .sample == i)
  xlda <- MASS::lda(Group~., data=x[,1:42])
  xp <- predict(xlda, x, dimen=2)$x
  wasps_l$LD1[wasps_l$.sample == i] <- xp[,1]
  wasps_l$LD2[wasps_l$.sample == i] <- xp[,2]
}
ggplot(wasps_l,
       aes(x=LD1,
           y=LD2,
           color=Group)) +
  geom_point(alpha=0.8) +
  facet_wrap(~.sample, ncol=5) +
  scale_color_discrete_divergingx(palette="Zissou 1") +
  theme(legend.position = "none",
        axis.title = element_blank(),
        axis.text = element_blank(),
        panel.grid.major = element_blank())
If 10 people are shown this lineup and only 1 picks the data plot (position 6), the \(p\)-value will be large.
This means we would NOT reject \(H_o\): there is no evidence of a difference in the distribution of the groups.
Experiment 1 (n=10)
Suppose \(x\) out of \(n\) people detect the data plot in a lineup of \(m\) plots. The visual inference \(p\)-value is then \(P(X \geq x)\), where \(X \sim B(n, 1/m)\).
However, the independence assumption is not strictly satisfied if people are shown the same lineup, so the \(p\)-value is computed by simulation with
and the power of a lineup is estimated as \(x/n\). We’ll use this to compare the signal strengths for different plot designs. Stay tuned!
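One way to picture why the same lineup shown to every observer matters is a small base R simulation. This is a sketch under an assumption that is not stated in these notes: under \(H_o\) the chance of picking each of the \(m\) plots is a random weight vector (a Dirichlet draw with parameter alpha) shared by everyone who sees that lineup, rather than exactly \(1/m\). The function name simulate_lineup_pvalue and the choice alpha = 1 are hypothetical, and this is not claimed to be the procedure any package uses.

```r
# Simulated lineup p-value when all K observers see the same lineup.
# Assumption: pick probabilities for the m plots are Dirichlet(alpha)
# weights shared by all observers, instead of exactly 1/m each.
simulate_lineup_pvalue <- function(x, K, m, alpha = 1, n_sim = 10000) {
  hits <- replicate(n_sim, {
    w <- rgamma(m, shape = alpha)       # Dirichlet draw via normalised gammas
    p_data <- w[1] / sum(w)             # weight of the data plot's position
    rbinom(1, size = K, prob = p_data)  # observers choosing that position
  })
  mean(hits >= x)
}

set.seed(2024)
x <- 1; n <- 10; m <- 15
p_sim   <- simulate_lineup_pvalue(x, K = n, m = m)
p_binom <- pbinom(x - 1, size = n, prob = 1 / m, lower.tail = FALSE)
power   <- x / n  # estimated power (signal strength) of the lineup

c(simulated = p_sim, binomial = p_binom, power = power)
```

For 1 detection out of 10, both versions give a large \(p\)-value, and the estimated power is 1/10.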
Which plot is the most different?
Plot description was:
In particular, the researcher is interested to know whether star temperature has a skewed distribution.
\(H_o: X\sim \text{Exp}(\widehat{\lambda})\)
\(H_a:\) it has a different distribution.
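Null samples for a distributional hypothesis like this one can be simulated directly from the fitted distribution. The sketch below uses base R only and a made-up stand-in for the observed variable (x_obs is hypothetical, not the star temperature data); it estimates \(\widehat{\lambda}\), draws null samples from the exponential, and hides the observed data at a random position.

```r
# Hypothetical stand-in for the observed variable (not the star data).
set.seed(42)
x_obs <- rgamma(100, shape = 3, rate = 0.5)

# Estimate the rate under H_o: the MLE for an exponential is 1 / mean.
lambda_hat <- 1 / mean(x_obs)

# Draw m - 1 null samples of the same size from Exp(lambda_hat).
m <- 20
null_samples <- replicate(m - 1, rexp(length(x_obs), rate = lambda_hat),
                          simplify = FALSE)

# Place the observed data at a random position among the nulls.
pos <- sample(m, 1)
panels <- append(null_samples, list(x_obs), after = pos - 1)
length(panels)
```

Each element of panels would then be drawn with the same plot script, one panel per element.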
Which row of plots is the most different?
Plot description was:
\(H_o:\) Proportion of males and females is the same for each year, conditional on age group
\(H_a:\) it’s not
nullabor package
Take a moment to look at the lineup function documentation. Run the sample code to make a lineup, eg:
or your own.
And then the different null sample generating functions: null_permute, null_lm, null_ts, null_dist.
No peeking!
Which plot is the most different?
Which plot is the most different?
This is the pair of plot designs we are evaluating.
Compute signal strength:
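A minimal sketch of the calculation, using the power estimate \(x/n\) for each design. The counts below are hypothetical, for illustration only; they are not results from the workshop, and the design labels A and B are placeholders.

```r
# Hypothetical counts, for illustration only (not workshop results):
# observers who picked the data plot under each plot design.
results <- data.frame(
  design   = c("A", "B"),
  detected = c(4, 9),
  shown    = c(12, 12)
)

# Signal strength (estimated power) is the proportion of detections.
results$power <- results$detected / results$shown
results

# The design with the larger proportion carries the stronger signal.
best <- results$design[which.max(results$power)]
best
```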
No peeking!
Which plot is the most different?
Which plot is the most different?
This is the pair of plot designs we are evaluating. Comparing the positions at which passes were made by both teams.
Compute signal strength:
For the star temperature data, where we used this plot design
create a lineup with a different design that you think might reveal the data distribution as different from the null samples better than the density plot. Possibilities include geom_histogram, ggbeeswarm::geom_quasirandom, lvplot::geom_lv.
Do you see any clusters here?
My colleague does, but I don’t.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.