Public Service Reminder: Correlation is not Causation

Smart people who confuse correlation with causation rarely do so because they are careless with evidence. They fall into the trap for a more interesting reason: the human mind is exquisitely tuned to detect patterns and to explain them. When two variables move together in a stable way, the brain does not experience this as a neutral observation. It experiences it as a problem demanding resolution. Something must be connecting these things. Once that question arises, the mind does what it always does—it supplies an answer.

Causal explanations are particularly seductive because they take the form of stories. A correlation merely states that two things vary together; a causal account explains why. The latter feels complete in a way the former does not. Humans are not comfortable leaving relationships unexplained, and “they just co-occur” rarely feels like a satisfying endpoint. As a result, the presence of a correlation creates a vacuum that narrative quickly fills, often long before alternative explanations have been seriously considered.

One reason this happens so reliably is that confounding variables are usually invisible. When people see two associated variables, they instinctively reason as if those variables exist in isolation. The possibility that both are being driven by a third factor—season, population size, illness severity, socioeconomic context—does not announce itself. It has to be actively sought. Without deliberate effort, the mind defaults to a simple two-variable world, even when reality is plainly more complicated.
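
A minimal simulation makes the confounding point concrete. The sketch below (Python with NumPy; the variable names and numbers are hypothetical, not drawn from any real dataset) builds two series that are causally unrelated but share a common driver, then shows the association collapsing once that driver is regressed out:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000

    # Hypothetical confounder, e.g. "season", driving both variables.
    z = rng.normal(size=n)

    # Neither x nor y causes the other; both are downstream of z.
    x = 2.0 * z + rng.normal(size=n)
    y = -1.5 * z + rng.normal(size=n)

    # The raw correlation looks impressive (about -0.74 here).
    print(np.corrcoef(x, y)[0, 1])

    # Regress z out of both series and correlate the residuals:
    # the association falls to roughly zero.
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    print(np.corrcoef(rx, ry)[0, 1])

Nothing about x and y changes between the two measurements; the only difference is that the invisible third variable has been made visible.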

Reverse causation adds another layer of difficulty. The idea that A causes B fits comfortably with everyday intuition. The idea that B might be causing A, or that both might be downstream effects of something else entirely, is cognitively awkward. It requires slowing down and suspending the initial narrative impulse. In practice, many causal claims rest not on evidence that the proposed direction is correct, but on the fact that it feels natural.
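
That directional blind spot is easy to verify, because the statistic itself is symmetric. In the toy sketch below (Python with NumPy; the setup is invented for illustration), a drives b by construction, yet the correlation coefficient is identical when the roles are swapped:

    import numpy as np

    rng = np.random.default_rng(1)
    a = rng.normal(size=1_000)
    b = 0.8 * a + rng.normal(scale=0.5, size=1_000)  # a drives b by construction

    # The correlation coefficient is perfectly symmetric, so it carries
    # no information about which direction (if either) is causal.
    print(np.corrcoef(a, b)[0, 1])
    print(np.corrcoef(b, a)[0, 1])  # identical value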

Large datasets and clean statistical results can amplify the problem. A strong correlation, a smooth graph, or a strikingly small p-value creates an aura of authority. The rigor of the mathematics is quietly misattributed to the interpretation. Statistical strength begins to stand in for causal proof, even though the two answer different questions: a p-value guards against sampling flukes, not against confounding or reverse causation. The result is an overconfidence that is not warranted by the data.
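
A toy example shows how a large sample manufactures exactly this false authority. In the sketch below (Python with NumPy and SciPy; the scenario and numbers are invented), the two variables are causally independent and linked only by an unobserved confounder, yet the p-value is vanishingly small:

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(2)
    n = 100_000

    z = rng.normal(size=n)      # unobserved confounder
    x = z + rng.normal(size=n)
    y = z + rng.normal(size=n)  # causally independent of x

    r, p = pearsonr(x, y)
    print(f"r = {r:.2f}, p = {p:.1e}")
    # Prints r around 0.5 with a p-value that is effectively zero.
    # The p-value rules out a sampling fluke; it says nothing about
    # confounding or direction.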

Ironically, expertise does not reliably protect against this error and can sometimes worsen it. Experts are better at inventing mechanisms, and once a plausible mechanism can be imagined, skepticism often relaxes. The story sounds right, fits existing knowledge, and aligns with professional intuitions. At that point, the correlation no longer feels like a hypothesis-generating observation; it feels like confirmation, even if the proposed mechanism has never been directly tested.

This is why causal claims built on correlation should trigger disciplined discomfort rather than immediate assent. A genuine causal relationship requires more than co-movement. It requires a defensible mechanism, serious attention to confounders, careful consideration of directionality, and evidence that the relationship persists when baseline risk or severity is accounted for. It also requires remembering that group-level associations often fail when projected onto individuals.
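
Stratification is the simplest version of that last check. In the hypothetical sketch below (Python with NumPy; the clinical framing and numbers are invented), pooled data show a solid correlation between treatment intensity and bad outcomes, but the association vanishes inside every severity level, because severity was driving both:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 30_000

    # Hypothetical severity score: 0 = mild, 1 = moderate, 2 = severe.
    severity = rng.integers(0, 3, size=n)

    # Treatment intensity and bad outcomes both track severity,
    # but within any single severity level they are independent.
    treatment = severity + rng.normal(size=n)
    outcome = severity + rng.normal(size=n)

    print("pooled:", round(np.corrcoef(treatment, outcome)[0, 1], 2))
    for s in range(3):
        m = severity == s
        print(f"severity {s}:", round(np.corrcoef(treatment[m], outcome[m])[0, 1], 2))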

Correlation is not meaningless. It is often the first sign that something interesting is happening. But it answers only a narrow question: do these variables change together? The harder question—what, if anything, is causing what—lies downstream. Confusing the two is not a rookie mistake. It is a deeply human one.

Thank you for commenting.