Tuesday, September 11, 2007

Correlation vs. Causation

` In science, there are two basic types of studies: correlational and experimental. The first is to determine that two or more factors are related; the second is to figure out how.

` In a correlational study, scientists are basically trying to find patterns, often by comparing two or more very similar things - ideally with only one very important difference.
` An example would be looking at the medical histories of two groups of people in which the only important difference is that one smokes cigarettes and the other doesn't. Doing this has shown that smoking is associated with a higher risk of developing certain health problems, including poor circulation, emphysema, heart disease and cancer.
` You have to be careful here, however, because even though it seems like a sensible inference, correlation in itself does not imply causation! It is true that, for instance, drowning rates tend to increase whenever ice cream sales go up. But does buying ice cream cause drowning - or conversely, does drowning cause people to buy ice cream?
` The cause you're looking for, of course, is called 'summer' - one factor causes both results!
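` To see how a common cause can produce a correlation all by itself, here is a minimal Python sketch. Every number in it is made up for illustration, and the names (simulate_days, pearson) are my own: it assumes ice cream sales and drownings each depend only on whether it is summer, and never on each other.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def simulate_days(n_days=2000, seed=0):
    """Simulate days where 'summer' drives both quantities (assumed model)."""
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        summer = rng.random() < 0.5
        # Made-up figures: each value depends on the season only,
        # never on the other value.
        ice_cream = rng.gauss(200 if summer else 50, 20)
        drownings = rng.gauss(10 if summer else 2, 2)
        days.append((summer, ice_cream, drownings))
    return days

days = simulate_days()

# Correlation across the whole year
overall = pearson([d[1] for d in days], [d[2] for d in days])

# Correlation when the season is held fixed (summer days only)
summer_days = [d for d in days if d[0]]
within_summer = pearson([d[1] for d in summer_days],
                        [d[2] for d in summer_days])

print(f"all days:         r = {overall:.2f}")
print(f"summer days only: r = {within_summer:.2f}")
```

` Across the whole year the two numbers rise and fall together, but once the season is held fixed the correlation should all but vanish, which is what you would expect if 'summer' were doing all the work.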

` So, how do we know that cigarettes are bad for us? That's what an experimental study is for!

` Experiments not only study correlation but actually cause the difference between the study groups. So, if we were to do an experiment to see if smoking can cause these health problems in human beings, we would have to round up a bunch of similar individuals and randomly divide them into two groups.
` One group would be forced to take up smoking, while the other would be kept from smoking at all. But, since that's unethical and practically impossible, scientists keep a most wretched tool around the lab: the domesticated rat.
` Let's say the scientists have a group of rats from one genetic strain, all living in the same type of cage and eating the same food. Basically, what the scientists need to do is make some of the rats breathe in cigarette smoke while making sure the rest don't.
` When the rats forced to breathe in smoke start developing health problems similar to the ones found in human smokers, it now becomes clearer that cigarette smoke is a major cause of the same kinds of disorders in both humans and rats!
` Sure, it's not one hundred percent certain that cigarettes are also a significant cause of these types of lung problems, etc. in humans - because we have not performed that experiment on our own species - but we can be reasonably sure the conclusion carries over.
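
` The logic of the rat experiment can be sketched as a toy simulation. This is only an illustration under invented assumptions: the hidden 'frailty' value, the risk numbers and the function name run_experiment are all hypothetical, not figures from any real study. The point is that random assignment spreads whatever we can't see evenly across both groups, so the remaining difference can be pinned on the smoke.

```python
import random

def run_experiment(n_rats=1000, seed=1):
    """Toy randomized experiment; returns (sick rate exposed, sick rate control)."""
    rng = random.Random(seed)

    # Each rat has a hidden 'frailty' the scientists cannot observe (assumed).
    rats = [rng.random() for _ in range(n_rats)]

    # Random assignment: shuffle, then split down the middle.
    rng.shuffle(rats)
    half = n_rats // 2
    smoke_group, control_group = rats[:half], rats[half:]

    def gets_sick(frailty, exposed):
        # Made-up risk model: a baseline that grows with frailty,
        # plus a fixed extra risk for rats breathing smoke.
        risk = 0.1 + 0.2 * frailty + (0.3 if exposed else 0.0)
        return rng.random() < risk

    sick_smoke = sum(gets_sick(f, True) for f in smoke_group)
    sick_control = sum(gets_sick(f, False) for f in control_group)
    return sick_smoke / len(smoke_group), sick_control / len(control_group)

rate_smoke, rate_control = run_experiment()
print(f"sick among smoke-exposed rats: {rate_smoke:.0%}")
print(f"sick among control rats:       {rate_control:.0%}")
```

` Because the experimenter decided which rats got the smoke (rather than the rats 'choosing'), frailty can't explain the gap between the groups the way 'summer' explained the ice cream example.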

` The great thing about science is that a) there are few things that are anywhere near 100% certain (like the rising of the sun, as Mercury mentioned, not to mention the law of gravity or the existence of molecules) and b) there are plenty of scientists with opposing viewpoints - and evidence to back them up - so there is much discussion and speculation about various possible causes of events.
` At any one time, the general consensus among scientists for a given phenomenon is the best they know. New information is being added all the time, of course, and if someone has figured out something that really works better, then that becomes the new 'best'.
` This doesn't mean that knowledge derived from science is unreliable in itself; just that there is often more than meets the eye that is waiting to be discovered. While this usually puts a new spin on previous knowledge, it sometimes can completely invalidate a hypothesis.

` Take the case of a University of Pennsylvania Medical Center study that positively linked the presence of night-lights in young children's bedrooms with their development of nearsightedness. (A tentative idea to begin with.) However, a later study by the Ohio State University found that parents who were nearsighted were most likely to have nearsighted children - and that they were more likely to put night-lights in their children's bedrooms!
` In this case, the University of Pennsylvania's conclusion - as well as the massive amounts of media attention it received - seems a bit... well, myopic, as it now would seem that the vision problems of nearsighted parents are the common cause of both correlates.

` This is one thing that, indeed, scientists know well: Though one thing may appear to cause another thing, without clear enough evidence one must always suspect alternatives. You never know what new realizations the next batch of data will bring!

4 comments:

Mercury said...

S. E. E. Quine:

That may appear to be sound reasoning in your examples, but there is one flaw...there will be now and then the individual who uses tobacco that develops zero health problems. Thus the correlation and subsequent cause cannot be a categorical statement but mere statistical probabilities. Now you are in a playing field similar to quantification of the subatomic realm where quantifiable statements do not exist and true theories of phenomena cease to exist. Simply, it is a matter of "Mort the lab rat" being a healthy, normal rodent successfully living in a toxic atmosphere. From a scientific and logical perspective it must be remembered in relating "cause and effect" that all parameters must be identified and allow for the chance opportunity for a unique situation to occur. [This, incidentally, cannot happen in the realm of physics. Unique events cannot live next to generally accepted laws. If a unique event appeared, the law would have to be revised for it indicates a flaw in the original law.] The realm of human/biological activities is not that quantifiable.

I would question your statement that there is nothing absolute in science. Well, maybe. But the dialectic of observing and testing a million times certainly refines the original proposition and becomes closer to being absolute. According to the laws of physics and the way the solar system is established--the sun does rise in the East every morning. That's fairly absolute. There has been no variance so far. As long as the submechanics are functioning properly we can rely on the predictability of ole Sol.

Kingcover said...

I believe that experiments have their good and bad points. For example, a good point would be that we now know that smoking is bad for our health. People did not know that back in the 30s/40s/50s - you would always see movie stars smoking cigarettes in their films and people were even smoking on talk shows until fairly recently. Now, thankfully that has mostly stopped.
A bad point might be that how do you know for certain that it is cigarettes that are doing the harm since most people who smoke also consume alcohol (another known cancer starter). Also people nowadays have a bad diet - they eat lots of red meat, fast foods and are very lazy when it comes to exercise so how do you definitely know that smoking causes the majority of that person's health related problems??? The only way you could figure that out (in my humble non-scientific opinion) would be to carry out investigations on people right from birth. Give them a healthy diet the whole time and only introduce one vice that is bad for them and see if they live longer or shorter than the average person who has been exposed to different things throughout their lifetime.
Hopefully that all makes some sense.

S. E. E. Quine said...

` Yes, it does make sense, Gareth. There are plenty of variables which affect health, and in such correlational studies they are taken into account and the statistics are thus 'corrected for'.
` And how do you nudge statistics to negate those other variables? Other correlational studies.
` It's far from perfect, though it does work well enough.
` As you said, doing an experiment in which the person is kept healthy from birth and only given one 'vice' of some kind would seem to be the more accurate kind of study.
` Of course that is unethical, and medical studies that follow groups of similar people over many years (while they make their own choices) cost more millions of dollars than most are willing to spend.

` Mercury: Thanks for bringing all that up! Really appreciate it! (Kicking myself, muttering 'haste does make waste!') When I have some more free time I'll have to make corrections/go into more detail about what I meant.

Mercury said...

Kingcover:

"A bad point might be that how do you know for certain that it is cigarettes that are doing the harm...."

That is what science is supposed to determine using "bona fide" models, the scientific method, logic, control groups, and a full understanding of the parameters that too may be causal.

"The only way you could figure that out...would be to carry out investigations on people right from birth."

That may not thoroughly work for it has been suggested that genes independent of environment are causal factors. Environmental factors may just be trigger mechanisms, but not in all cases. Again, the best that can be said are propositions based on statistics and they can be valuable tools but categorical statements must be avoided for in human activity there are those that don't fit any broad statement. Political and social policies are based on statistical analysis.

S. E. E. Quine:

"Of course that is unethical...."

Indeed, as well as what I mentioned above. "A night in the arms of Venus leads to a lifetime on Mercury". The Tuskegee study of untreated syphilis in the Negro male was a shameful, unethical experiment.