Using Comparative Judgement to Evaluate the Effectiveness of Interventions (Part Two)

Derek’s exploration of Comparative Judgement to evaluate the success or otherwise of an intervention to improve achievement at a community college in Raleigh, NC (USA) continues …


Our statistics course team recently used a common intervention to [attempt to] improve student achievement and understanding of our Measures of Center learning objective. Students who are proficient (grade of C or better) with this learning objective should be able to determine an appropriate measure of center, sketch a distribution, and indicate the relationship between the mean and median given some contextual information. Many of our students typically struggle with this learning objective, so the team of instructors teaching statistics designed and implemented an intervention activity covering it. Following the intervention, students were given a one-question “quiz” that was used to measure the effectiveness of our intervention and to estimate the proportion of students who are proficient with the learning objective. We decided to use Comparative Judgement (CJ) to conduct our evaluation and analysis.

Each instructor collected 5 randomly selected (using random.org) quizzes from each of their sections of the statistics course. The resulting collection (N = 74) was uploaded to the No More Marking website.



One quiz was not legible when scanned and was removed from the sample. Along with the students’ work, four markers, one for each letter grade (A, B, C, and D), were also included in the sample. Judges were instructed to choose the student who demonstrated a better understanding of Measures of Center. When asked to compare student work with a marker, the judge was to select the student work if they believed the student demonstrated understanding at or above the marker’s grade level. Students are deemed proficient if they score a C or better. In total, 10 judges completed 800 comparisons on the 78 candidates (74 students plus 4 markers).
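For readers curious how a ranking can come out of pairwise judgements, here is a rough sketch. It assumes a Bradley-Terry paired-comparison model, which is a common choice for CJ but is not confirmed here as what the No More Marking site runs. The function name, the toy scripts s1 and s2, the marker label, and the comparison counts are all made up for illustration.

```python
from collections import defaultdict

def bradley_terry(comparisons, iterations=200):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Illustrative only: uses the standard minorization-maximization
    update and assumes every candidate wins and loses at least once.
    """
    wins = defaultdict(int)         # total wins per candidate
    pair_counts = defaultdict(int)  # number of comparisons per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1
        wins[loser] += 0            # make sure the loser is registered too
        pair_counts[frozenset((winner, loser))] += 1

    items = list(wins)
    strength = {i: 1.0 for i in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            # Sum n_ij / (p_i + p_j) over every opponent j that i met.
            denom = sum(
                n / (strength[i] + strength[j])
                for pair, n in pair_counts.items() if i in pair
                for j in pair - {i}
            )
            updated[i] = wins[i] / denom
        total = sum(updated.values())
        strength = {i: v / total for i, v in updated.items()}
    return strength

# Hypothetical judgements: two student scripts and a C-grade marker.
comparisons = (
    [("s1", "C_marker")] * 3 + [("C_marker", "s1")]
    + [("C_marker", "s2")] * 3 + [("s2", "C_marker")]
    + [("s1", "s2")] * 3 + [("s2", "s1")]
)
strength = bradley_terry(comparisons)

# A student counts as proficient if ranked above the C marker.
proficient = sorted(
    name for name, value in strength.items()
    if name != "C_marker" and value > strength["C_marker"]
)
print(proficient)
```

In this toy data only s1 ends up above the marker, so only s1 would be counted as proficient.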

The figure displays the results compiled from the comparisons. The red line indicates the true score of the C marker. Thus, students ranked above that marker are deemed proficient, while students ranked below it are not. From these results, 43.24% of students in the sample were ranked as proficient, yielding a 95% confidence interval of (31.96%, 54.53%).
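The interval above can be reproduced with the usual normal-approximation (Wald) formula for a proportion. This is a minimal sketch: the count of 32 proficient students is inferred from 43.24% of 74 and is not stated directly in the post, and the function name is just illustrative.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, p - margin, p + margin

# 32 of 74 is an inferred count (43.24% of the 74 sampled students).
p_hat, lower, upper = wald_interval(32, 74)
print(f"{p_hat:.2%} ({lower:.2%}, {upper:.2%})")
# prints: 43.24% (31.96%, 54.53%)
```

With a sample this small, the Wald interval is only an approximation, so other interval methods would give slightly different endpoints.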


Based on these results, we can conclude that our intervention needs some work, and our students are still struggling with this learning objective. However, CJ does seem to be useful and efficient (judges’ median time per comparison ranged from 15 to 20 seconds) in evaluating the effectiveness of our intervention and in determining a proportion of [non]proficient students.

Do you have any stories of how you are using CJ? We’d love to hear from you! chris@nomoremarking.com
