Monday, February 18, 2008

Cum hoc ergo propter hoc

There's a common error that's bugged me for a while. It's related to "post hoc ergo propter hoc" (after this, therefore because of this), and apparently it's called a spurious relationship in statistics.

Anyway, the argument that bugs me goes something like this:

A happens,
Later B happens;
they seem to be related, thus A leads to B.

Now, I see this all over the place.

I recently read an article suggesting that if people (particularly women) take AP calculus in high school, they're more likely to be engineers. This is true, but the author was using this fact to encourage women to take AP Calc in order to increase their chances of becoming engineers.

The problem with this argument is that she ignores the (10 or) 11 years of schooling prior to taking AP calc. The reason women who take AP calc go on to be engineers more often is that women who take calc in high school are better at math than at other subjects; this was the conclusion of her own study. People who are better at math than at other subjects go on to engineering-related careers more often than those who aren't.

Event A: Woman takes calc in high school.
Event B: Woman becomes an engineer.
Cause C: Woman is better at math than English.

Somehow, the very cause she argues for is forgotten. If she encourages more women who are "well-rounded" students to take calculus, the relationship between C and A weakens; hence, the relationship between A and B weakens.
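To convince myself of that last point, here is a rough simulation sketch. Everything in it is made up for illustration: the function name `engineer_rate_gap`, the 30% share of math-leaning students, and the 40% and 5% chances of becoming an engineer are my assumptions, not numbers from the article. The one thing that matters is that B depends only on C, never on A.

```python
import random

def engineer_rate_gap(p_calc_if_mathy, p_calc_if_not, n=100_000, seed=0):
    """Return P(B | A) - P(B | not A) in a toy population where only C causes B.

    All probabilities here are invented for illustration.
    """
    rng = random.Random(seed)
    # took_calc -> [number of engineers, number of students]
    counts = {True: [0, 0], False: [0, 0]}
    for _ in range(n):
        # Cause C: better at math than English (assumed 30% of students).
        mathy = rng.random() < 0.30
        # Event A: takes calc; how likely depends on C.
        took_calc = rng.random() < (p_calc_if_mathy if mathy else p_calc_if_not)
        # Event B: becomes an engineer; depends on C only, not on A.
        engineer = rng.random() < (0.40 if mathy else 0.05)
        counts[took_calc][0] += engineer
        counts[took_calc][1] += 1

    def rate(took_calc):
        engineers, total = counts[took_calc]
        return engineers / total

    return rate(True) - rate(False)

# Today: math-leaning students are far more likely to take calc.
gap_before = engineer_rate_gap(p_calc_if_mathy=0.8, p_calc_if_not=0.2)
# After the encouragement campaign: everyone takes calc at the same rate.
gap_after = engineer_rate_gap(p_calc_if_mathy=0.8, p_calc_if_not=0.8)

print(f"gap in engineer rate, before: {gap_before:.3f}")
print(f"gap in engineer rate, after:  {gap_after:.3f}")
```

In this toy world the first gap is large even though calc does nothing on its own, and the second gap is close to zero: once taking calc no longer tells you anything about C, it no longer tells you anything about B either.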

The general idea has always baffled me, especially since spurious relationships almost always come up when someone is trying to fix a problem that stems from the cause. How can you argue a causal relationship and then forget about the implications of the cause? That's what a causal relationship is!