Free personality tests are more reliable and efficient than the paid variety

In most areas of life, we expect the free versions of products to be sub-standard compared with the “premium” paid-for versions. After all, why would anyone pay for something if the free equivalent were better? However, a new study of personality tests knocks this logic out of the park – psychologists at the University of Texas report in the Journal of Psychology that free tests are more reliable and efficient than their paid-for, proprietary counterparts.

To measure test reliability, Tyler Hamby and his colleagues dug out personality test data collected in five prior meta-analyses of the Big Five personality traits. Meta-analyses combine data from many studies in a given field, and the Big Five is the dominant personality theory in contemporary psychology, which breaks personality down into five main dimensions, including Extraversion and Conscientiousness. The researchers ended up with usable data from 345 samples from 288 studies involving 161,091 participants.

Crucially, 142 of these research samples had completed free personality tests such as the Big Five Inventory and various versions of the International Personality Item Pool. The other samples had completed paid-for personality tests such as versions of the NEO Five Factor Inventory. For their analysis, Hamby and his team compared the “alpha coefficients” of the different personality tests – for any given test, this essentially involves looking at the scores from the questionnaire items that supposedly measure the same trait and seeing how well they correlate with each other. If a test has what’s known as good internal consistency, then the scores for its items that measure the same construct should show a high correlation.
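To make the idea of internal consistency concrete, here is a minimal sketch of the standard Cronbach's alpha formula, on the assumption that this is the "alpha coefficient" in question. The function name and the toy responses are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for one trait scale.

    scores: one row per respondent, each row holding that person's
    answers to the k items that are meant to measure the same trait.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents (columns of the table).
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Variance of each respondent's total score on the scale.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: four respondents, three items.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
scattered = [[1, 4, 2], [2, 1, 4], [3, 3, 1], [4, 2, 3]]

alpha_consistent = cronbach_alpha(consistent)
alpha_scattered = cronbach_alpha(scattered)
```

When every item ranks respondents identically, as in the first toy table, alpha reaches its maximum of 1; items that disagree with each other pull it down.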

A note of caution: The researchers didn’t look at test–re-test reliability, a different measure which tells you how well participants’ test scores correlate when they take the same test at different times. Nor did they compare the tests’ validity, which is the evidence for whether the tests are truly measuring what they’re supposed to be measuring. In other words, this study certainly shouldn’t be taken as the final word on the merit of free and paid-for personality tests.

These caveats aside, overall there was a small but inconsistent difference in reliability between free and paid tests (applying to some traits but not others), in favour of the free tests. But it doesn’t end there. When you use alpha coefficients to measure internal consistency in this way, the outcome is confounded by the number of items in the test. Longer tests with more items tend to achieve higher reliability scores. This is relevant to the current investigation because paid-for tests tend to be much longer (by 80 per cent on average) than the free versions. When Hamby and his colleagues controlled for test length (by estimating reliability at 12 items for each trait), they found the free tests had higher reliabilities than the paid-for tests for all five personality traits.
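One common way to put scales of different lengths on an equal footing is the Spearman-Brown prophecy formula, which predicts the reliability a scale would have if it were lengthened or shortened. The post does not say which method the researchers used, so the sketch below is an assumption, and the reliabilities and item counts in it are made-up illustrations rather than figures from the study.

```python
def spearman_brown(reliability, n_items, target_items=12):
    """Predict a scale's reliability at a different length.

    reliability:  observed reliability (e.g. alpha) of the existing scale
    n_items:      number of items in the existing scale
    target_items: number of items to project to (12 per trait, as in the post)
    """
    factor = target_items / n_items  # how much longer or shorter the scale gets
    return factor * reliability / (1 + (factor - 1) * reliability)


# Hypothetical scales: a long one and a short one with the same raw alpha.
long_scale_at_12 = spearman_brown(0.80, n_items=24)
short_scale_at_12 = spearman_brown(0.80, n_items=8)
```

In this toy comparison the two scales look equally reliable on raw alpha, but once both are projected to 12 items the shorter scale comes out ahead, which is the kind of pattern a length correction can reveal.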

Is there any reason for paid-for tests to be less reliable? The authors say their findings are not entirely surprising – one possible explanation is that researchers or practitioners who use paid-for tests are often forbidden from adapting them in any way (for example, adding/removing items or changing the wording of items). This is to protect the proprietary status of the product, but of course forbidding any changes is unscientific because it prevents progress by making it impossible to test whether revised versions would be superior.

At least for research purposes (as opposed to in applied settings), these new results stack heavily in favour of free tests. Not only do free tests match or exceed the reliability of paid-for tests, they are also shorter, which helps encourage participants to complete all test items and reduces participant drop-out rates. “Assuming that a particular scale has been properly validated, we tentatively recommend using free scales to measure Big Five traits in personality research,” the researchers said. It will be interesting to see if this finding applies to other areas of psychology research where free and paid-for tests are available.


Hamby, T., Taylor, W., Snowden, A., & Peterson, R. (2015). A Meta-Analysis of the Reliability of Free and For-Pay Big Five Scales. The Journal of Psychology, 1-12. DOI: 10.1080/00223980.2015.1060186

further reading
Our bias for the left-hand side of space could be distorting large-scale surveys.

Post written by Christian Jarrett (@psych_writer) for the BPS Research Digest.

