Surprisal!

S-values are a neat idea for helping to think about – maybe even feel – the meaning of p-values. They are described by Rafi and Greenland (2020). This post explains the basic idea, beginning with flips of a coin.

  • Suppose you flip an apparently fair coin and the outcome is heads. How surprised would you feel?
  • Now flip it twice: both outcomes are heads. How surprised would you feel now?
  • Flip the coin 40 times. All of the outcomes are heads. How surprised are you now?

I suspect your level of surprise has gone from something like “meh” through to “surely this isn’t a fair coin?!”

S-values (the S is for surprisal) provide a way to think about p-values in terms of how likely it would be to get a sequence of all heads from a number of flips of a fair coin. That number of flips is the s-value and can be calculated from the p-value.

Here is an example of a coin flipped three times. There are \(2^3 = 8\) possible outcomes, listed in the table below:

| First coin flip | Second coin flip | Third coin flip |
|---|---|---|
| H | H | H |
| H | H | T |
| H | T | H |
| H | T | T |
| T | H | H |
| T | H | T |
| T | T | H |
| T | T | T |

If the coin is fair, then the probability of each of these outcomes is \(\frac{1}{8}\). In particular, the probability of all heads is also \(\frac{1}{8}\), or 0.125.
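
As a rough check on this counting argument, here is a minimal Python sketch that enumerates the outcomes and counts how many are all heads. The variable names are only illustrative, and it assumes the coin is fair so that every sequence is equally likely.

```python
from fractions import Fraction
from itertools import product

# All possible sequences of three flips, in the same order as the table above.
outcomes = list(product("HT", repeat=3))

# Count the sequences in which every flip is heads (there is exactly one).
n_all_heads = sum(1 for flips in outcomes if all(f == "H" for f in flips))

# Assumption: a fair coin, so each of the 8 sequences has the same probability.
prob_all_heads = Fraction(n_all_heads, len(outcomes))

print(len(outcomes))          # 8
print(prob_all_heads)         # 1/8
print(float(prob_all_heads))  # 0.125
```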

More generally, the probability of getting all heads from \(n\) fair coin flips is \(\frac{1}{2^n}\). Here is a table showing some examples:

| Flips | Probability of all heads |
|---|---|
| 1 | 0.5 |
| 2 | 0.25 |
| 3 | 0.125 |
| 4 | 0.0625 |
| 5 | 0.03125 |
| 6 | 0.01562 |
| 7 | 0.00781 |
| 8 | 0.00391 |
| 9 | 0.00195 |
| 10 | 0.00098 |
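
The table can be reproduced with a few lines of Python. This is only a sketch: the function name `prob_all_heads` is made up for illustration, and the last line goes back to the 40-flip example from the start of the post.

```python
def prob_all_heads(n_flips: int) -> float:
    """Probability that n_flips flips of a fair coin all come up heads."""
    return 0.5 ** n_flips


# Print the probabilities for 1 to 10 flips, to five decimal places.
for n in range(1, 11):
    print(n, f"{prob_all_heads(n):.5f}")

# Forty heads in a row is a very small probability indeed.
print(f"{prob_all_heads(40):.2e}")  # 9.09e-13
```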

Now here is the connection with p-values. Suppose you run a statistical test and get \(p = 0.03125\). That is the same as the probability of obtaining five heads in a row from five tosses of a fair coin, so the s-value is 5.

Or suppose you merely got \(p = 0.5\). That’s the probability of obtaining heads from a single flip of a fair coin. The s-value is 1.

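The larger the s-value, the more surprised you would be by the result if the hypothesis being tested were true, just as a longer run of heads would be more surprising if the coin were fair.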

To convert p-values to s-values, we want to find an \(s\) such that

\(\displaystyle \frac{1}{2^s} = p\).

The log function (base 2) does this for us:

\(\displaystyle s = -\log_2(p)\).
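
In code, the conversion is a one-liner. Here is a minimal Python sketch using `math.log2` from the standard library. The function name `s_value` and the range check are my own additions for illustration, not something taken from Rafi and Greenland.

```python
import math


def s_value(p: float) -> float:
    """Convert a p-value to an s-value: s = -log2(p)."""
    # Assumption for this sketch: p must be a valid, non-zero p-value.
    if not 0 < p <= 1:
        raise ValueError("p must be greater than 0 and at most 1")
    return -math.log2(p)


print(s_value(0.5))             # 1.0
print(s_value(0.03125))         # 5.0
print(round(s_value(0.05), 2))  # 4.32
```

The first two lines match the coin-flip examples above: one head from one flip, and five heads from five flips.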

What about the traditional (or notorious?) 0.05 level?

\(-\log_2(0.05) = 4.32\),

to two decimal places. So, that’s the same as getting all heads when you flip a coin 4.32 times – which isn’t entirely intuitive when expressed as coin flips. But you could think of it as being a little more surprising than getting four heads in a row from four flips of a fair coin.
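
One way to see where 0.05 sits is to bracket it between the probabilities for four and five heads in a row, using the values from the table of probabilities:

\(\displaystyle \frac{1}{2^5} = 0.03125 < 0.05 < 0.0625 = \frac{1}{2^4}\),

and so

\(\displaystyle 4 < -\log_2(0.05) \approx 4.32 < 5\).

In other words, \(p = 0.05\) is more surprising than four heads in a row but less surprising than five.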

References

Rafi, Z., Greenland, S. Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise. BMC Med Res Methodol 20, 244 (2020).