series of guest features for students. Warren has just released a Psychology Study Guide, which covers information on statistics, research methods and study skills for psychology students.
Today I'm delighted to discuss an absolutely fascinating topic in psychology - statistical significance. I know you're as excited about this as I am!
Why is psychology a science? Why bother with complicated research methods and statistical analyses? The answer is that we want to be as sure as possible that our theories about the mind and behaviour are correct. These theories are important - many decisions in areas like psychotherapy, business and social policy depend on what psychologists say.
Despite the myriad rules and procedures of science, some research findings are pure flukes. Perhaps you're testing a new drug, and by chance alone, a large number of people spontaneously get better. The better your study is conducted, the lower the chance that your result was a fluke - but still, there is always a certain probability that it was.
In science we're always testing hypotheses. We never conduct a study just to 'see what happens', because there's always at least one way to make any useless set of data look important. We take a risk; we put our idea on the line and expose it to potential refutation. That is why significance tests in psychology calculate the probability of obtaining your given set of results (and all those that are even more extreme) if the hypothesis were incorrect - i.e. if the null hypothesis were true.
Say I create a loaded die that I believe will always roll a six. I’ve invited you round to my house tonight for a nice cup of tea and a spot of gambling. I plan to hustle you out of lots of money (don’t worry, we’re good friends and always playing tricks like this on each other). Before you arrive I want to test my hypothesis that the die is loaded against my null hypothesis that it isn't.
I roll the die. A six. Success! But wait... there’s actually a 1 in 6 chance that I would have gotten this result even if the null hypothesis were correct. Not good enough. Better roll again. Another six! That’s more like it; there’s only a 1 in 36 chance of getting two sixes in a row, assuming the null hypothesis is correct.
The more sixes I roll in a row, the lower the probability that a fair die would have produced such a run, and therefore the more confident I can be in rejecting the null hypothesis.
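The arithmetic behind the die example can be sketched in a few lines of Python. This is only an illustration: the function name prob_all_sixes is made up for this post, and it assumes a fair six-sided die with independent rolls (in other words, it assumes the null hypothesis is true).

```python
from fractions import Fraction


def prob_all_sixes(n_rolls: int) -> Fraction:
    """Chance of rolling n_rolls sixes in a row with a fair die.

    Assumes independent rolls, so the 1-in-6 chances simply multiply.
    """
    return Fraction(1, 6) ** n_rolls


# Print the exact fraction and a decimal approximation for short runs.
for n in range(1, 5):
    p = prob_all_sixes(n)
    print(f"{n} six(es) in a row: {p} (about {float(p):.4f})")
```

Two sixes in a row come out at 1/36, or roughly 0.028, and each further six divides the probability by six again.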
This is what statistical significance testing tells you - the probability that the result (and all those that are even more extreme) would have come about if the null hypothesis were true (in this case, if the die were fair and not loaded). It's given as a value between 0 and 1, and labelled p. So p = .01 means a 1% chance of getting the results if the null hypothesis were true; p = .5 means a 50% chance, p = .99 means a 99% chance, and so on.
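The phrase 'and all those that are even more extreme' matters when not every roll is a six. As a rough sketch, suppose (hypothetically) that I rolled the die 10 times and got 5 sixes. The p value is then the chance of 5 or more sixes from a fair die, which is a sum over the binomial distribution. The function name p_value_at_least and the numbers used are assumptions for illustration only.

```python
from math import comb


def p_value_at_least(k_sixes: int, n_rolls: int, p_six: float = 1 / 6) -> float:
    """One-sided p value: chance of k_sixes OR MORE sixes in n_rolls rolls,
    assuming the null hypothesis (a fair die, independent rolls)."""
    return sum(
        comb(n_rolls, k) * p_six**k * (1 - p_six) ** (n_rolls - k)
        for k in range(k_sixes, n_rolls + 1)
    )


# Hypothetical data: 5 sixes observed in 10 rolls.
p = p_value_at_least(5, 10)
print(f"p = {p:.4f}")
```

For these made-up numbers p comes out at about .015: a fair die would give a result this extreme, or more so, in roughly 1.5% of such experiments.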
In psychology we usually look for p values lower than .05, or 5%. That's what you should look out for when reading journal papers. If there's less than a 5% chance of getting the result if the null hypothesis were true, a psychologist will be happy with that, and the result is more likely to get published.
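To see how the .05 cut-off relates to flukes, here is a small simulation sketch. It runs many imaginary experiments with a genuinely fair die and counts how often the result would still pass as 'significant'. Everything in it is an assumption chosen for illustration: 10 rolls per experiment, 10,000 experiments, a one-sided test, a fixed random seed, and the hypothetical function names.

```python
import random
from math import comb


def p_value_at_least(k_sixes: int, n_rolls: int, p_six: float = 1 / 6) -> float:
    """Chance of k_sixes or more sixes in n_rolls rolls of a fair die."""
    return sum(
        comb(n_rolls, k) * p_six**k * (1 - p_six) ** (n_rolls - k)
        for k in range(k_sixes, n_rolls + 1)
    )


def false_positive_rate(n_experiments: int = 10_000, n_rolls: int = 10,
                        alpha: float = 0.05, seed: int = 0) -> float:
    """Share of fair-die experiments that still reach p < alpha by chance."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    flukes = 0
    for _ in range(n_experiments):
        # Roll a fair die n_rolls times and count the sixes.
        sixes = sum(rng.randint(1, 6) == 6 for _ in range(n_rolls))
        if p_value_at_least(sixes, n_rolls) < alpha:
            flukes += 1
    return flukes / n_experiments


rate = false_positive_rate()
print(f"Fair-die experiments with p < .05: {rate:.1%}")
```

Because the null hypothesis is true in every simulated experiment, each 'significant' result here is a fluke. The share of flukes stays below 5%, which is exactly the risk the .05 threshold accepts.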
Significance testing is not perfect, though. Remember this: 'Statistical significance is not psychological significance.' You must look at other things too: the effect size, the power, the theoretical underpinnings. Combined, they tell a story about how important the results are, and with time you'll get better and better at interpreting this story.
And that, in a nutshell, is what statistical significance is. Enthralling, isn't it?
Editor's note (07/09/2010): This post has been edited to correct for the fact that statistical significance pertains to the likelihood of a given set of results (and those even more extreme) being obtained if the null hypothesis were true, not to the probability that the hypothesis is correct, as was erroneously stated before. Sincere apologies for any confusion caused.