Suppose we find that the probability of a successful programme outcome (out) depends on treatment (treat) and a mediator (med), as per the Bayes network depicted in part 1 of the figure below. Suppose also that there are no unmeasured variables. This model defines \(P(\mathit{out} | \mathit{treat}, \mathit{med})\), \(P(\mathit{med} | \mathit{treat})\), and \(P(\mathit{treat})\). The arrows denote these probabilistic relationships.
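Written out, these three terms multiply to give the joint distribution over all three variables. This is just the chain rule of probability applied in the order treat, med, out:

\[
P(\mathit{treat}, \mathit{med}, \mathit{out}) = P(\mathit{out} | \mathit{treat}, \mathit{med}) \, P(\mathit{med} | \mathit{treat}) \, P(\mathit{treat})
\]

Since every variable is conditioned on all the variables before it, this factorisation places no restriction on the joint distribution. The same chain rule can be applied in any other order of the three variables.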
If we interpret the arrows as causal relations, then all six models in the figure are consistent with the same conditional probabilities. Model 2 says that treatment and outcome are associated with each other because the mediator is a common cause. Model 3 says that the outcome causes treatment assignment. Model 4 says that the treatment causes both mediator and outcome, but the outcome causes the mediator. And so on. These six models are all members of the same Markov equivalence class (see Verma & Pearl, 1990).
We need something beyond the data and statistical associations to distinguish between them: theory. Some of the theory might be trivial, e.g., that the outcome followed treatment and so can’t have caused it, because we have ruled out time travel.
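To make the equivalence concrete, here is a minimal Python sketch. It assumes all three variables are binary and uses a made-up joint distribution (the numbers are random, not from any real programme). The function names are hypothetical. For each of the six possible orderings of treat, med, and out, it computes the conditional probabilities that ordering implies and checks that multiplying them back together recovers the same joint distribution.

```python
from itertools import permutations, product
import random

VARS = ("treat", "med", "out")  # position of each variable in a state tuple


def random_joint(seed=0):
    """Made-up, strictly positive joint distribution over three binary variables."""
    rng = random.Random(seed)
    weights = {s: rng.random() + 0.01 for s in product((0, 1), repeat=3)}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def marginal(joint, idxs, values):
    """Probability that the variables at positions `idxs` take `values`."""
    return sum(p for s, p in joint.items()
               if all(s[i] == v for i, v in zip(idxs, values)))


def rebuild_from_order(joint, order):
    """Rebuild the joint as P(x1) * P(x2 | x1) * P(x3 | x1, x2) for one ordering.

    Each conditional is computed from `joint` itself, as it would be
    estimated from data under a fully connected model with that ordering.
    """
    rebuilt = {}
    for state in joint:
        prob = 1.0
        for k in range(len(order)):
            upto = order[:k + 1]   # the variable at step k plus its parents
            parents = order[:k]    # everything earlier in the ordering
            numerator = marginal(joint, upto, [state[i] for i in upto])
            denominator = marginal(joint, parents, [state[i] for i in parents])
            prob *= numerator / denominator  # P(x_k | parents)
        rebuilt[state] = prob
    return rebuilt


joint = random_joint()
max_errors = {}
for order in permutations(range(3)):
    rebuilt = rebuild_from_order(joint, order)
    # The label gives the causal ordering: earlier variables point to later ones.
    label = " -> ".join(VARS[i] for i in order)
    max_errors[label] = max(abs(rebuilt[s] - joint[s]) for s in joint)

for label, err in max_errors.items():
    print(f"{label}: max abs difference = {err:.2e}")
```

Every ordering reproduces the joint distribution up to floating-point rounding, so no amount of data on these three variables alone can favour one ordering over another.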
References
Verma, T., & Pearl, J. (1990). Equivalence and synthesis of causal models. Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence, 255–270.