Here are the correlations.
He seemed disappointed that the correlations for the creative writing test were so low. What he hadn’t considered was the impact of measurement error in the tests.
If we want to know the relationship between the constructs the tests are attempting to measure, we need to compensate for measurement error. Spearman invented a simple technique called the disattenuated correlation to do just this.
Let’s assume that the reliabilities of these tests are 0.9 for Key Stage 2, 0.9 for the generic baseline test, and 0.75 for the writing test (we don’t all agree on writing so let’s allow a lower reliability!). If we look at the disattenuated correlations we see that the correlations are higher.
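For anyone who wants to try this, Spearman's correction divides the observed correlation by the square root of the product of the two tests' reliabilities. Here is a minimal Python sketch of that calculation. The function name, the dictionary keys and the observed correlation of 0.855 are my own illustrative assumptions and are not taken from the school's data; the reliabilities are the ones assumed above.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation.

    r_xy  : observed correlation between test X and test Y
    rel_x : assumed reliability of test X
    rel_y : assumed reliability of test Y
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must be greater than 0 and at most 1")
    return r_xy / math.sqrt(rel_x * rel_y)


# Reliabilities assumed in the text (not measured values).
RELIABILITY = {"ks2": 0.9, "baseline": 0.9, "writing": 0.75}

# Hypothetical observed correlation, chosen purely for illustration.
observed = 0.855
corrected = disattenuate(observed, RELIABILITY["ks2"], RELIABILITY["baseline"])
print(round(corrected, 2))
```

Because reliabilities are never greater than 1, the corrected correlation is never smaller in magnitude than the observed one, and the lower the assumed reliability, the bigger the correction. With very unreliable tests the corrected value can even exceed 1, which is a sign that the reliability estimates are too low.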
The disattenuated correlation between the creative writing test and KS2 is 0.64. We could interpret this to mean that the constructs are related but not identical. This seems reasonable, as KS2 doesn’t test creative writing. Interestingly, the correlation between KS2 and the baseline test is 0.95. The school is learning very little from its baseline test, as it is testing the same constructs as KS2. The best that could be said for the test is that the school is reducing measurement error through repeated testing.
So what do correlations tell us about the validity of tests? Very little. A high correlation suggests that your test is redundant. A low correlation tells you that you are testing something different, but you have no idea what. Paul Newton, the co-author of an excellent text on validity, puts it like this:
“The classic approach to validation involved correlating results from a new test against results from an already established one. In other words, results from the established test provided the ‘criterion’ against which to judge results from the new test. In theory, a high correlation coefficient provides strong evidence that the new test is measuring essentially the same thing as the established test, the criterion measure. In practice, because it is so hard to provide plausible criterion measures, low correlations are hard to interpret, and even high correlations do not necessarily mean that the right thing has been measured.”
Paul prefers a lifecycle approach to validation. If you want to know if a test is useful for your purpose, you need to interrogate: