## October 18, 2012

### Correlation does not prove causation, but correlation does prove correlation

It took a surprisingly long time for the modern statistical concept of correlation to emerge. It was implicit for a long time, but Francis Galton worked out the basics in 1888 (when, by the way, he was 66 years old).

So, it's not surprising that people aren't really good yet at thinking about correlation.

In recent years, the cliche "Correlation does not prove causation" has emerged as a staple of Internet discussions, which I guess is a good thing, although it often appears in the more questionable form "Correlation does not imply causation."

In truth, correlation suggests causation. If A correlates with B, then perhaps A causes B. Or maybe B causes A. Quite possibly, some C causes both A and B, or various combinations.

Now, it could be that a finding that A correlates with B to some extent might be just a coincidence due to a limited sample size. Fortunately, we have good statistical techniques for measuring that possibility, and we have the test of replication. Similarly, attempts at replication can help weed out apparent correlations caused by incompetence, fraud, unconscious bias, and the like, although they can never be ruled out completely.

But, keeping those caveats in mind, we can say (with only a reasonable degree of overstatement):

Correlation does prove correlation.

For example, illegal immigration is correlated with many important social measures. This doesn't prove that illegal immigration "causes" high or low school achievement. As we've all heard, "Correlation does not prove causation!"

But, few have heard, "Correlation does prove correlation."

For example, it does prove (to the extent that anything can be proved)  that illegal immigration is correlated with low school achievement. Moreover, the children and grandchildren and great-grandchildren of illegal immigrants tend to have below average school performance.

Furthermore, these correlations have been around for as long as they've been measured.

Now, it is conceivable that these correlations will vanish tomorrow.  Thus, the insistence is widespread that the burden of proof must be on those pointing out the correlations to prove causality beyond any doubt.

But, shouldn't the burden of proof be on the people asserting that the correlations will vanish to come up with at least a prima facie theory of why that will happen?

hbd chick said...

"Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."

from the hover box to this xkcd cartoon. (^_^)

Eric said...

But, shouldn't the burden of proof be on the people asserting that the correlations will vanish to come up with at least a prima facie theory of why that will happen?

Of course they will respond these correlations will vanish when society discards racist policies that systematically oppress and imprison brown people.

And there's no way you will convince them otherwise. Even if you could prove causation mathematically, you wouldn't move the debate one inch.

Anonymous said...

"correlation does prove correlation"

The area of math known as statistics is partially an attempt to answer whether "correlation does prove correlation," or whether what you are observing is a matter of randomness. So correlation does not necessarily prove correlation.

For example, the correlation coefficient is an attempt to measure the amount of correlation, and a confidence interval is an attempt to measure the amount of confidence in that correlation.

http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient

http://en.wikipedia.org/wiki/Confidence_interval
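
To make those two links concrete, here is a minimal sketch of the Pearson coefficient together with an approximate confidence interval. The data are made-up numbers for illustration, the function names `pearson_r` and `fisher_ci` are hypothetical, and the interval uses the Fisher z-transformation, which is one common approximation and assumes roughly bivariate-normal data.

```python
import math
from statistics import NormalDist, mean


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation r from n pairs.

    Uses the Fisher z-transformation (an approximation, not an exact result).
    """
    if n <= 3:
        raise ValueError("need more than 3 pairs")
    if abs(r) >= 1:
        return (r, r)  # atanh is undefined at +/-1; the interval degenerates
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return (math.tanh(z - z_crit * se), math.tanh(z + z_crit * se))


# Made-up illustrative data, not from any real study.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 1, 4, 3, 6, 5, 8, 7]

r = pearson_r(xs, ys)
lo, hi = fisher_ci(r, len(xs))
print(f"r = {r:.3f}, approximate 95% interval = ({lo:.3f}, {hi:.3f})")
```

The interval narrows as the number of pairs grows, which is the sense in which a small sample "proves" a correlation less firmly than a large one.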

Anonymous said...

Correlation does prove correlation and certainly implies causation.

My husband is an actuary and he says this all the time. He agrees that correlation doesn't prove causation. It just shows you where to look and where not. Hey, wait a minute. Isn't that the premise of the GSS, to show researchers good places to look? Yeah, well people do know that correlation implies causation but for proof you will have to investigate to find the mechanism of action.

Anonymous said...

Of course they will respond these correlations will vanish when society discards racist policies that systematically oppress and imprison brown people.

Remind us of these policies.

bulbasaur said...

Ilkka Kokkarinen summarized it well: correlation demands an explanation.

Anonymous said...

"So, it's not surprising that people aren't really good yet at thinking about correlation."

I think they are pretty good, it's just that they inject their wishes into their reasoning.

For example, if presented with a correlation between smoking and lung cancer, most liberals are capable of reaching the reasonable conclusion that smoking probably causes lung cancer.

Anonymous said...

Proofs are for math and logic. There are no proofs in science. Science is nothing but correlations and looking at the weight of evidence and likelihoods. The people who aggressively assert that "correlation does not prove causation" just don't want certain correlations attracting attention.

Assistant Village Idiot said...

This seems a matter of falling out one side of the boat to avoid the other. When "Correlation does not imply causation" is used it is often a shorthand for "...and that's not a strong correlation, and your sample size is too small, and your followup conclusions are a reach."

Anonymous said...

The other argument you hear with regard to correlations is something along the lines of: "my age in years correlates with the distance between galaxies - what does that prove?" So, yeah, two things that are both increasing over time might both just be independently increasing over time - but if you're looking at data that is varying over some dimension other than time, it's hard to find such trivial explanations.
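
A rough sketch of that point, using entirely made-up series: two quantities that each rise steadily over time correlate strongly in their levels, yet their year-to-year changes barely correlate at all. The names `age_years` and `galaxy_distance` are hypothetical, and the sine and cosine wiggles are only deterministic stand-ins for unrelated fluctuations. Comparing first differences is one simple way to strip out a shared trend, not the only one.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def first_differences(series):
    """Change from each observation to the next, which removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical data: both series trend upward, with unrelated wiggles on top.
years = range(40)
age_years = [t + math.sin(1.7 * t) for t in years]
galaxy_distance = [2 * t + math.cos(2.9 * t) for t in years]

r_levels = pearson_r(age_years, galaxy_distance)
r_changes = pearson_r(
    first_differences(age_years), first_differences(galaxy_distance)
)
print(f"correlation of levels:  {r_levels:.3f}")
print(f"correlation of changes: {r_changes:.3f}")
```

The strong correlation of levels here comes almost entirely from the shared upward trend; once the trend is differenced away, little is left.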

Anonymous said...

Capitalists and statists figured it's better to collude than collide.

Anonymous said...

"But, shouldn't the burden of proof be on the people asserting that the correlations will vanish to come up with at least a prima facie theory of why that will happen?"

There is no 'should'. There is only 'will' or 'will not'. And the powers-that-be will not.

Anonymous said...

Corruption prevents correction.

Ex Submarine Officer said...

Science is nothing but correlations and looking at the weight of evidence and likelihoods.

So, you're not a scientist, I'm guessing.

Science is examining empirical evidence, some of which may be correlations, may be other phenomena. Then there is this thing called a "theory" (or a hypothesis in earlier stages), this is what scientists use to explain things.

They make a theory that fits the evidence/data/phenomena that they've observed. Then they see what the theory predicts in a situation they haven't observed and is perhaps somewhat different than the data they have.

They then set up this situation, this is called an "experiment", and they try to predict the results of it using their "theory". It isn't like the experiments we did as kids with drugs, girls, or IEDs.

If the "experiment" goes as the "theory" predicts, this is generally regarded as proving or at least supporting the theory.

It is more or less impossible to completely prove a theory is completely true. What is possible is to prove that it is wrong in some regard and needs to be discarded or modified.

All of the above is "the scientific method", in case you are interested.

x said...

the line 'correlation does not prove causation' as an argument almost invariably precedes 'but instead here is a correlation to explain the phenomenon under discussion which is more agreeable to my sensibilities that i will state is the real cause'

take for example race and crime, race and intelligence, IQ and social class, etc.

Eric said...

Remind us of these policies.

As if I had any idea what they're talking about. Since I first applied to college decades ago all I've seen is women and minorities getting extra points and do-overs at every opportunity.

I'll grant you, that seems racist and sexist to me, but I don't think that's what they mean.

Anonymous said...

This reminds me of the Stoics' view on how prophecies should be worded. Typically it followed the form, "If you were born under the Dogstar, you will not die at sea." Chrysippus didn't like that since it implied causation. His reformulation was "Not both: you were born under the Dogstar and you will die at sea."

Christopher Briggs said...

All of the above is "the scientific method", in case you are interested.

Your description can be summed up as finding correlations between observed phenomena and theory-predicted outcomes. That then being the scientific method, as you say, "science is nothing but correlations" is well-supported.

Cail Corishev said...

I'd say the normal mode of human thought is to assume that correlation implies causation. Take diet, for instance: that's how we got "cholesterol causes heart disease." There was never any real evidence for that; only some correlation between some societies eating more cholesterol and having more heart attacks (and only if you cherry-pick the right societies as Ancel Keys did). No one bothered to consider that heart failure might cause the body to produce more cholesterol (since cholesterol is used to rebuild damaged heart tissue), or that some third factor might increase both independently.

People have to be trained to ignore correlation, or to only see it in accepted directions. There's a correlation between poverty and crime, but 99% of people assume poverty causes crime without ever considering the possibility that crime causes poverty or that something else causes both.

Billions of dollars are spent on education in the schools and media to make sure we only observe the right correlations and draw the right conclusions from them.

Anonymous said...

Do donations correlate with the quality or quantity of blogging on iSteve?

Chicago said...

The burden of proof should be on those making assertions. But they will never try to prove anything as their stock-in-trade is simply to assert things whilst holding the megaphone. That is because they are fundamentally dishonest to begin with. It's Propaganda 101: loudly assert things over and over again in the form of easy-to-remember slogans. Do not engage in genuine dialog but simply sloganeer until it becomes a jingle that people can't get out of their heads.

Dutch Boy said...

"Correlation does not prove causation" is the favorite argument of those who approve of present circumstances and intend to keep them that way (e.g., by failing to investigate the implications of correlations).

pat said...

Realtors have seized upon the correlation that used to obtain between home ownership status and public school test scores, to assert that the way to get your kids to have better grades is to buy a house. I'm not making this up. Indeed how could anyone make up such a thing?

Wait a minute, that was just what Barney Frank et al asserted that brought down the whole world's economy.

Dangerous stuff, correlations.

Albertosaurus

Anonymous said...

Like the man said, correlation is the formal 'back-up' that is needed to appear to be a respectable non-charlatan when one has a strong 'hunch' that an observable phenomenon is linked to a particular variable.
The first thing however is to develop your 'hunch' or notion beforehand; this can only come from acute observation and experience. You need to know what you are looking for and where it can be found, it's all Sherlock Holmes.
A famous case concerned England in the 19th century. There was a cholera epidemic in London, which killed thousands. Until that point cholera was unknown in England, the disease was confined to and endemic to India (as an aside Indians have a natural immunity to cholera, genetically programmed through millennia of exposure to faecal contaminated water).
Various theories concerning cholera were expounded by medical men, the favorite was a 'miasma' of noxious gas.
Dr John Snow suspected it was waterborne. In a classic example of its type, he plotted on a map cholera fatalities against the location of water wells. The correlation was indisputable.
As a matter of fact, the cholera bacillus was not identified and recognised until many decades after Dr. Snow's correlation.
The upshot is that if the link between the hunch, belief, prejudice, call it what you will, is shown by unbiased and indisputable means to actually exist, then it does exist. A hidden third factor is unlikely if the variables are so tight and specific.

Anonymous said...

I would only agree with the even weaker statement that correlation suggests correlation.

That is, past correlation suggests future correlation.

Anonymous said...

Ilkka Kokkarinen summarized it well: correlation demands an explanation.

That is just standard actuarial science.

Anonymous said...

Correlation implies causation. A lot of feminists blame single motherhood/out-of-wedlock births among working class and middle-class girls on abstinence programs, but in reality contraception and abortion are widely available and used.

A huge majority of working class women are like Snooki. They party and they have sex. Without discrimination. The problem is that they cannot handle this freedom like an upper-class feminist white liberal woman can.

How? They're not intelligent. They have low IQ. They're not future time oriented. They're this and that...

Poor low IQ chicks are not really known for their impulse control. They have access to contraception from Walmart alright. They just can't use contraception properly as a high-status white feminist woman can.

They need authority. They need force. They need some supervision. Some push.

The problem is that the twin denials of female responsibility and consequences (It's MY choice! MY BODY! ME! ME! ME!) and denial of racial differences, plus intelligence, is catastrophic.

Lower class to middle class women in turn adopt the feminism of the upper-classes, but because they can't deal with it as well as others, they fail.

They don't possess privilege. They're not smart. Etc.

Feminists need to shut it when it comes to lower-class women.

Lower to average IQ women need that pesky evil reviled theocracy.

Anonymous said...

"But, shouldn't the burden of proof be on the people asserting that the correlations will vanish to come up with at least a prima facie theory of why that will happen?" - If someone is blindly asserting that correlation is not causation then they really can't take the argument any further than that.

Cyprian Korzeniowski said...

Anon @ 2:18 PM - The lower classes are always going to have a messier time of things, but poison's poison. And be she the prolest of proles, or the SWPLiest of SWPLs, she's still a woman.

panjoomby said...

i saw miss north dakota (who's black or half black) say she was "passionate about diversity." i assume she means she's passionate about "variability" - which you want a lot of in order to get decent correlations.

Anonymous said...

My Boss's Daughter loves my Boss; And my Boss Loves [different love] me; So my Boss's Daughter loves me.

Anonymous said...

Correlation does not imply causation.

Construct a countermodel:

Drinking a lot and vomiting are positively correlated. Correlation is symmetric: drinking a lot correlates to vomiting and vomiting correlates to drinking a lot. So, if correlation implies causation, we could say "vomiting causes drinking a lot." Premises true, conclusion false.

QED.

Anonymous said...

Correlation does not imply causation.

Construct a countermodel:

Drinking a lot and vomiting are positively correlated. Correlation is symmetric: drinking a lot correlates to vomiting and vomiting correlates to drinking a lot. So, if correlation implies causation, we could say "vomiting causes drinking a lot." Premises true, conclusion false.

First, "imply" typically carries the unstated condition "for the most part". So, scratch "QED".

Second, asserting a causal link between vomiting and heavy drinking does not assert "vomiting causes heavy drinking". It asserts "vomiting causes heavy drinking OR heavy drinking causes vomiting" -- which is, of course, quite true. The "and" is inherently bidirectional.

It's true of course that correlation doesn't always imply causation, but that's step 1 of the, and Steve is at step 3 or 4.

Cennbeorc

Anonymous said...

So you don't know what "implication" is. It's the material conditional.

Oh.

Dear.

rob said...

Anonymous said...
So you don't know what "implication" is. It's the material conditional.

Oh.

Dear.

You're playing the same technical vs colloquial language game that creationists do when they say 'evolution is just a theory.'

When people say 'imply', they usually mean 'suggest', or 'makes something seem more likely'.

David said...

Correlation means something might be going on. It is a justification for investigation. Where there's smoke, there might be fire. A correlation means little by itself, unless it comports with many other things that are proved.

But correlation shouldn't be disvalued. What would proof of causation in complex cases look like? "Correlations" appears to be a word meaning "observations that are provisionally organized." And how else would one prove causation, except by piling up observations ("anecdotes") and thinking up a model that explains most of them at a given level of knowledge? Unless I'm far wrong, that appears to be what science is.

By the way, to my knowledge no one becomes a "race realist" because of interesting correlations he reads about in the studies. He directly experiences or observes a broad range of adverse "diversity" behavior, then turns to the studies (which are full of correlations) to try to get some kind of understanding of it.