Thursday, November 05, 2015

Let's be honest

Bloxham, S., den-Outer, B., Hudson, J., & Price, M. (2015). Let's stop the pretence of consistent marking: exploring the multiple limitations of assessment criteria. Assessment & Evaluation in Higher Education, published online 23 March 2015, 1–16.
Abstract: "Unreliability in marking is well documented, yet we lack studies that have investigated assessors' detailed use of assessment criteria. This project used a form of Kelly's repertory grid method to examine the characteristics that 24 experienced UK assessors notice in distinguishing between students' performance in four contrasting subject disciplines: that is, their implicit assessment criteria. Variation in the choice, ranking and scoring of criteria was evident. Inspection of the individual construct scores in a sub-sample of academic historians revealed five factors in the use of criteria that contribute to marking inconsistency. The results imply that, whilst more effective and social marking processes that encourage sharing of standards in institutions and disciplinary communities may help align standards, assessment decisions at this level are so complex, intuitive and tacit that variability is inevitable. We conclude that universities should be more honest with themselves and with students, and actively help students to understand that application of assessment criteria is a complex judgement and there is rarely an incontestable interpretation of their meaning."

"Accepting the inevitability of grading variation means that we should review whether current efforts to moderate are addressing the sources of variation. This study does add some support to the comparison of grade distributions across markers to tackle differences in the range of marks awarded. However, the real issue is not about artificial manipulation of marks without reference to evidence. It is more that we should recognise the impossibility of a ‘right’ mark in the case of complex assignments, and avoid overextensive, detailed, internal or external moderation. Perhaps, a better approach is to recognise that a profile made up of multiple assessors’ judgements is a more accurate, and therefore fairer, way to determine the final degree outcome for an individual. Such a profile can identify the consistent patterns in students’ work and provide a fair representation of their performance, without disingenuously claiming that every single mark is ‘right’. It would significantly reduce the staff resource devoted to internal and external moderation, reserving detailed, dialogic moderation for the borderline cases where it has the power to make a difference. This is not to gainsay the importance of moderation which is aimed at developing shared disciplinary norms, as opposed to superficial procedures or the mechanical resolution of marks."

It's quite easy to criticize this paper: it is a small-scale study (n = 24) with no attempt at statistical analysis or validation. But there is still an inescapable feeling that, as the stakes have escalated, higher education is kidding itself about its assessment practices.

1 comment:

  1. Comment via email from Dr Phil Langton:
    "There are times when surely there is no place for statistics: one is when the study is very small and statistics are then a weak tool; the other is when the essence of the findings is immediately obvious even if the explanation is not. This is one such case. I think some common-sense descriptions suffice here: rank orders, etc.

    The study is small scale, but I doubt that it would look much different if there were a 0 on the end of the number who took part. It would be a lot harder to talk to everyone and hold all the conversations in one's mind. As a first attempt to comprehend the possible drivers of decision making and judgement, though, it works: marking, after all, is judgement. I'm reminded of the wonderful clip from Dead Poets Society, with the graph representing importance and excellence in poetry. The quest for exactitude in awarding marks to a scientific argument is no different."