I decided to take a closer look for them. I took some scripts from the autumn mock and some from the spring mock and mixed them together in one judging pot. I then got a couple of PhD students to judge them all together. As usual, the students were blind to both the purpose of the study and when each test had taken place. Using the results of their judging, I was able to anchor the two judging sessions together so that they were on the same scale, which let me make direct comparisons.
The results did indeed seem to show a slight improvement. Statistical tests, however, showed that this improvement was not statistically significant, and was probably just sampling error. There was certainly no evidence of the miracle the school was claiming!
Of course the school didn’t believe me – they were convinced they could see improvement. I was not surprised. They had been teaching the students all year, and couldn’t believe their efforts might not have translated into results. Where we want to see improvement, we find it. Without objective measurement we can convince ourselves of anything.