Actual causes: two examples using the updated Halpern-Pearl definition

Halpern (2015) provides three variants of the Halpern-Pearl definitions of actual causation. I’m trying to get my head around the formalism, which is elegant, concise, and precise, but tedious to use in practice, so I wrote an R script to do the sums. This blog post is not self-contained – you will need to read the original paper for an introduction to the model. However, it works through two examples, which may help if you’re also struggling with the paper.

The second (“updated”) definition of an actual cause asserts that \(\vec{X} = \vec{x}\) is a cause of \(\varphi\) in \((M,\vec{u})\) iff the following conditions hold:

AC1 \((M,\vec{u}) \models (\vec{X} = \vec{x}) \land \varphi\).

This says that if \(\vec{X} = \vec{x}\) is an actual cause of \(\varphi\) then both hold in the actual world, \((M,\vec{u})\). Note that for this condition we are just looking at the model, not doing anything to it.

AC2 There is a partition of the endogenous variables in \(M\) into \(\vec{Z} \supseteq \vec{X}\) and \(\vec{W}\), and there are settings \(\vec{x}'\) and \(\vec{w}\) such that

(a) \((M,\vec{u}) \models [ \vec{X} \leftarrow \vec{x}', \vec{W} \leftarrow \vec{w}] \neg \varphi\).

So, we’re trying to show that undoing the cause, i.e., setting \(\vec{X}\) to \(\vec{x}' \ne \vec{x}\), prevents the effect. We are allowed to modify \(\vec{W}\) however we want to show this, whilst leaving \(\vec{Z}-\vec{X}\) free to do whatever the model tells these variables to do.

(b) If \((M,\vec{u}) \models \vec{Z} = \vec{z^{\star}}\), for some \(\vec{z^{\star}}\), then for all \(\vec{W}' \subseteq \vec{W}\) and \(\vec{Z}' \subseteq \vec{Z}-\vec{X}\),
\((M,\vec{u}) \models [ \vec{X} \leftarrow \vec{x}, \vec{W}' \leftarrow \vec{w}', \vec{Z}' \leftarrow \vec{z^{\star}}] \varphi\).

This says: trigger the cause (unlike in AC1, we aren’t just looking to see whether it holds) and check that it still leads to the effect under all subsets of \(\vec{Z}-\vec{X}\) held at their actual-world values and all subsets of the modified \(\vec{W}\) that we found for AC2(a). Note how we are setting those subsets of \(\vec{Z}\), rather than just observing them.

AC3 There is no \(\vec{X}' \subset \vec{X}\) such that \(\vec{X}' = \vec{x}'\) satisfies AC1 and AC2.

This says there’s no superfluous stuff in \(\vec{X}\). You taking a painkiller and waving a magic wand doesn’t cause your headache to disappear, under AC3, if the painkiller works without the wand.

Example 1: an (actual) actual cause

Let’s give it a go with an overdetermined scenario (lightly edited from Halpern) in which Alice and Bob both lob bricks at a glasshouse and smash the glass. Define

\(\mathit{AliceThrow} = 1\)
\(\mathit{BobThrow} = 1\)
\(\mathit{GlassBreaks} = \mathit{max}(\mathit{AliceThrow},\mathit{BobThrow})\)

So, if either Alice or Bob (or both) hit the glasshouse, then the glass breaks. Strictly speaking, I should have set up one or more exogenous variables, \(\vec{u}\), that define the context and then defined \(\mathit{AliceThrow}\) and \(\mathit{BobThrow}\) in terms of \(\vec{u}\), but it works fine to skip that step as I have here, since I’m holding \(\vec{u}\) constant anyway.
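To make the checks below concrete, here is a minimal sketch of how the model could be encoded in R. This is just an illustration (not the script mentioned above): each endogenous variable gets an equation, and a small evaluate() helper works through them in order, overriding any variable that is intervened on.

model <- list(
  AliceThrow  = function(vals) 1,
  BobThrow    = function(vals) 1,
  GlassBreaks = function(vals) max(vals$AliceThrow, vals$BobThrow)
)

# Evaluate the endogenous variables in order (parents before children),
# overriding any variable named in the intervention list `do`
evaluate <- function(model, do = list()) {
  vals <- list()
  for (v in names(model)) {
    vals[[v]] <- if (v %in% names(do)) do[[v]] else model[[v]](vals)
  }
  vals
}

evaluate(model)$GlassBreaks  # the actual world: the glass breaks
#> [1] 1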

Is \(\mathit{AliceThrow} = 1\) an actual cause of \(\mathit{GlassBreaks} = 1\)?

AC1 holds since \((M,\vec{u}) \models \mathit{AliceThrow} = 1 \land \mathit{GlassBreaks} = 1\). The first conjunct comes directly from one of the model equations and none of the functions change it. Spelling out the second conjunct,

\(\mathit{GlassBreaks} = \mathit{max}(\mathit{AliceThrow},\mathit{BobThrow})\)
\(= \mathit{max}(1, 1)\)
\(= 1\)

For AC2, we need to find a partition of the endogenous variables such that AC2(a) and AC2(b) hold. Try \(\vec{Z} = \{ \mathit{AliceThrow}, \mathit{GlassBreaks} \}\) and \(\vec{W}= \{ \mathit{BobThrow} \}\).

AC2(a) holds since \((M,\vec{u}) \models [ \mathit{AliceThrow} \leftarrow 0, \mathit{BobThrow} \leftarrow 0] \mathit{GlassBreaks} = 0\).
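Using the sketch above, this is the check that the modified world prevents the effect:

evaluate(model, do = list(AliceThrow = 0, BobThrow = 0))$GlassBreaks
#> [1] 0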

For AC2(b), we begin with \(\vec{Z} = \{ \mathit{AliceThrow}, \mathit{GlassBreaks} \}\) and the settings as per the unchanged model, so

\((M,\vec{u}) \models \mathit{AliceThrow} = 1 \land \mathit{GlassBreaks} = 1\).

We need to check that for all \(\vec{W}' \subseteq \vec{W}\) and \(\vec{Z}' \subseteq \vec{Z}-\vec{X}\),
\((M,\vec{u}) \models [ \vec{X} \leftarrow \vec{x}, \vec{W}' \leftarrow \vec{w}', \vec{Z}' \leftarrow \vec{z^{\star}}] \varphi\).

Here are the combinations; \(\varphi \equiv \mathit{GlassBreaks} = 1\) holds for all of them:

\((M,\vec{u}) \models [ \mathit{AliceThrow} \leftarrow 1, \mathit{GlassBreaks} \leftarrow 1, \mathit{BobThrow} \leftarrow 0 ] \varphi\)
\((M,\vec{u}) \models [ \mathit{AliceThrow} \leftarrow 1, \mathit{BobThrow} \leftarrow 0 ] \varphi\)
\((M,\vec{u}) \models [ \mathit{AliceThrow} \leftarrow 1, \mathit{GlassBreaks} \leftarrow 1 ] \varphi\)
\((M,\vec{u}) \models [ \mathit{AliceThrow} \leftarrow 1 ] \varphi \)

(The third is rather trivially true; however, as far as I understand, it still has to be checked given the definition.)
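With the sketch above, the four combinations can be checked in one go:

interventions <- list(
  list(AliceThrow = 1, GlassBreaks = 1, BobThrow = 0),
  list(AliceThrow = 1, BobThrow = 0),
  list(AliceThrow = 1, GlassBreaks = 1),
  list(AliceThrow = 1)
)
sapply(interventions, function(do) evaluate(model, do = do)$GlassBreaks)
#> [1] 1 1 1 1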

AC3 is easy since the cause only has one variable, so there’s nothing superfluous.

Example 2: not an actual cause

Now let’s try an example that isn’t an actual cause: that the glass breaking causes Alice to throw the brick. It’s obviously false; however, it wasn’t clear to me exactly where the definition would fail until I worked through this…

AC1 holds since in the actual world, \(\mathit{GlassBreaks} = 1\) and \(\mathit{AliceThrow} = 1\) hold.

Examining the function definitions, they provide no way for a change in \(\mathit{GlassBreaks}\) to affect \(\mathit{AliceThrow}\), so the only apparent way to link the two is through \(\vec{W}\). Therefore, use the partition \(\vec{W} = \{\mathit{AliceThrow}\}\) and \(\vec{Z} = \{\mathit{GlassBreaks}, \mathit{BobThrow}\}\).

Now for AC2(a), we can easily get \(\mathit{AliceThrow} = 0\) as required, since we can do what we like with \(\vec{W}\). But that doesn’t help when we move on to AC2(b), since there we have to hold \(\mathit{AliceThrow} = 0\), which is the negation of what we want. The same happens for the other partition that puts \(\mathit{AliceThrow}\) in \(\vec{W}\), i.e., \(\vec{W} = \{ \mathit{AliceThrow}, \mathit{BobThrow} \}\).
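The sketch from Example 1 shows the tension directly: the setting that gets us through AC2(a) forces \(\mathit{AliceThrow}\) to 0, and AC2(b) must keep that setting for the subset \(\vec{W}' = \{\mathit{AliceThrow}\}\), so \(\varphi \equiv \mathit{AliceThrow} = 1\) cannot hold.

# AC2(a): works, but only because AliceThrow is forced to 0 via W
evaluate(model, do = list(GlassBreaks = 0, AliceThrow = 0))$AliceThrow
#> [1] 0

# AC2(b): triggering the candidate cause while W' = {AliceThrow} stays at 0
evaluate(model, do = list(GlassBreaks = 1, AliceThrow = 0))$AliceThrow
#> [1] 0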

So, the broken glass does not cause Alice to throw a brick. The setup we needed to get through AC2(a) set us up to fail AC2(b).

References

Halpern, J. Y. (2015). A Modification of the Halpern-Pearl Definition of Causality. Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015), 3022–3033.

See also this companion blog post.

Neat guide to {tidyverse} updates in R, by Mine Çetinkaya-Rundel

This guide is intended for people teaching tidyverse, but it’s relevant to anyone (e.g., me!) trying to keep up with developments.

Every time I load {tidyverse}, it advises me to use {conflicted}. I didn’t realise it would be this useful:

library(tidyverse)
library(palmerpenguins)  # assumed source of the penguins data
library(conflicted)

penguins |>
  filter(species == "Adelie")
#> Error:
#> ! [conflicted] filter found in 2 packages.
#> Either pick the one you want with `::`:
#> • dplyr::filter
#> • stats::filter
#> Or declare a preference with `conflicts_prefer()`:
#> • `conflicts_prefer(dplyr::filter)`
#> • `conflicts_prefer(stats::filter)`

Another neat update: case_when now doesn’t need type-specific NAs like NA_character_ for default cases.

# now, optionally (the conditions below are just placeholders)
df |>
  mutate(
    x = case_when(
      a == 1 ~ "value 1",
      a == 2 ~ "value 2",
      a == 3 ~ "value 3",
      .default = NA  # no longer needs to be NA_character_
    )
  )

Lots of other tips.

Distinguishing between = and <-

‘Perhaps because of the use of = for assignment in FORTRAN […] assignment is often read as “x equals E“. This causes great confusion. The first author learned to distinguish between = and := while giving a lecture in Marktoberdorf, Germany, in 1975. At one point, he wrote “:=” on the board but pronounced it “equals”. Immediately, the voice of Edsger W. Dijkstra boomed from the back of the room: “becomes!”. After a disconcerted pause, the first author said, “Thank you; if I make the same mistake again, please let me know.”, and went on. Once more during the lecture the mistake was made, followed by a booming “becomes” and a “Thank you”. The first author has never made that mistake again! The second author, having received his undergraduate education at Cornell, has never experienced this difficulty.’

– David Gries and Fred B. Schneider (1993, p. 17, footnote 6) [A logical approach to discrete math. Springer.]

R lets you use both = and <- for assignment, FYI.

Little Miss and Mr Men name binariness

Pownall and Heflick (2023) investigated gender stereotypes in all 47 Mr Men and 34 Little Miss books. One of the studies asked (adult) participants to rate the masculinity/femininity of the character names on a scale from 1 (“entirely feminine”) to 5 (“entirely masculine”). The “Mr” and “Little Miss” were stripped off, so, e.g., participants were asked about the words “Quick”, “Princess”, and “Greedy”.

I created a name binariness index for each mean rating, \(r\): the distance between the mean rating and 3 (the scale mid-point), as a proportion of the maximum possible distance (2):

\(\displaystyle \frac{|r-3|}{2}\)

Here’s a plot of the names, sorted by name binariness. So good candidates for nonbinary characters would be Mx Quick or Mx Lucky. Alternatively, you could rightly reject the premise that any of them are gendered and go for Mx Princess.
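In R, the index is a one-liner; for instance, with some made-up mean ratings (not the paper’s data):

r <- c(1.2, 2.9, 4.8)  # hypothetical mean ratings on the 1 to 5 scale
abs(r - 3) / 2         # 0 = rated at the midpoint, 1 = entirely gendered
#> [1] 0.90 0.05 0.90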

References

Pownall, M., & Heflick, N. (2023). Mr. Active and Little Miss Passive? The Transmission and Existence of Gender Stereotypes in Children’s Books. Sex Roles.

Mediation analysis effects

Are you wondering how to extract estimates of the following estimands from linear models underlying mediation analyses…?

  • total average causal effect (ACE) / average treatment effect (ATE)
  • average direct effect (ADE)
  • average causal mediation effect (ACME)
  • proportion mediated effect

You might be interested in this example I put together, using {mediation} in R, showing the arithmetic!
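If it helps, here is a minimal, self-contained sketch of the kind of setup involved; the simulated data and variable names are made up for illustration, and the linked example shows the actual arithmetic.

library(mediation)

# Simulated data: treatment -> mediator -> outcome
set.seed(42)
n     <- 200
treat <- rbinom(n, 1, 0.5)
med   <- 0.5 * treat + rnorm(n)
out   <- 0.4 * med + 0.3 * treat + rnorm(n)
dat   <- data.frame(treat, med, out)

model_m <- lm(med ~ treat, data = dat)        # mediator model
model_y <- lm(out ~ treat + med, data = dat)  # outcome model

fit <- mediate(model_m, model_y, treat = "treat", mediator = "med")
summary(fit)  # reports ACME, ADE, total effect, and proportion mediated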

Research Design in the Social Sciences

“This book introduces a new way of thinking about research designs in the social sciences. Our hope is that this approach will make it easier to develop and to share strong research designs.

“At the heart of our approach is the MIDA framework, in which a research design is characterized by four elements: a model, an inquiry, a data strategy, and an answer strategy. We have to understand each of the four on their own and also how they interrelate.”

Uses {DeclareDesign} in R.
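For a flavour, here is a minimal two-arm design along the lines of the package documentation, with the four MIDA elements marked (the effect size and sample size are arbitrary):

library(DeclareDesign)

design <-
  declare_model(N = 100,
                U = rnorm(N),
                potential_outcomes(Y ~ 0.2 * Z + U)) +   # Model
  declare_inquiry(ATE = mean(Y_Z_1 - Y_Z_0)) +           # Inquiry
  declare_assignment(Z = complete_ra(N, prob = 0.5)) +   # Data strategy
  declare_measurement(Y = reveal_outcomes(Y ~ Z)) +
  declare_estimator(Y ~ Z, inquiry = "ATE")              # Answer strategy

diagnose_design(design)  # simulate the design; reports bias, power, coverage, etc.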

Blair, G., Coppock, A., & Humphreys, M. (2023). Research Design in the Social Sciences: Declaration, Diagnosis, and Redesign. Princeton University Press.

Mermin’s (1981) variant of Bell’s theorem – in R

Entanglement is the weirdest feature of quantum mechanics. David Mermin (1981) provides an accessible introduction to experiments showing that local determinism doesn’t hold in the quantum world, simplifying Bell’s theorem and tests thereof. This knitted Markdown file shows the sums in R. It’s probably only going to make sense if you have been here before but haven’t yet got around to doing the sums yourself (that was me, before writing this today!).
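As a taster of the argument (my own sketch, not the linked file): a local hidden-variable “instruction set” assigns a colour to each of the three switch settings, and both detectors carry the same set so that matching settings always agree. Enumerating the sets shows they must agree at least 5/9 of the time when the settings are chosen at random, whereas quantum mechanics predicts agreement exactly half the time.

sets  <- expand.grid(s1 = c("R", "G"), s2 = c("R", "G"), s3 = c("R", "G"),
                     stringsAsFactors = FALSE)
pairs <- expand.grid(a = 1:3, b = 1:3)  # the nine equally likely switch pairs

p_same <- apply(sets, 1, function(colours)
  mean(colours[pairs$a] == colours[pairs$b]))

min(p_same)  # no instruction set can agree less than 5/9 of the time
#> [1] 0.5555556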

Hypothesis testing for categorical predictors

Interesting update to {ggeffects}, by Daniel Lüdecke:

A reason to compute adjusted predictions (or estimated marginal means) is to help understanding the relationship between predictors and outcome of a regression model. In particular for more complex models, for example, complex interaction terms, it is often easier to understand the associations when looking at adjusted predictions instead of the raw table of regression coefficients.

The next step, which often follows this, is to see if there are statistically significant differences. These could be, for example, differences between groups, i.e. between the levels of categorical predictors or whether trends differ significantly from each other.

The ggeffects package provides a function, hypothesis_test(), which does exactly this: testing differences of adjusted predictions for statistical significance. This is usually called contrasts or (pairwise) comparisons. This vignette shows some examples how to use the hypothesis_test() function and how to test whether differences in predictions are statistically significant.
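For instance (my own toy example, not one from the vignette), with a categorical predictor:

library(ggeffects)

m <- lm(Sepal.Length ~ Species + Sepal.Width, data = iris)

ggpredict(m, terms = "Species")        # adjusted predictions per species
hypothesis_test(m, terms = "Species")  # pairwise contrasts between the levels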

Read more.