This is not a one-year blip: If we have to have a national assessment system, it shouldn’t be this one

Oliver Belas, Senior Lecturer (Education & English) at University of Bedfordshire
27 Aug 2020

Although calls for root-and-branch reform of our national examinations systems aren’t new, they’ve been re-energised by the controversy of this year’s – possibly illegal (Elgot & Adams, 2020) – results by algorithm. No one thought calculated grades desirable; and it’s sadly no surprise that the most disadvantaged students were the most adversely affected in the first instance (Adams, 2020), and continue – following the late decision to delay the release of BTEC results (Dickens, 2020) – to face uncertainty. But this isn’t a story of a one-year blip, caused by the Covid-19 crisis. Rather, it’s one of a fundamentally flawed and inequitable examinations system (Sodha, 2020; Richardson, 2020a; Torrance, 2018).

Ofqual rationalised standardisation as a corrective to centre-assessed grades (CAGs) that over- and underpredicted students’ grades – the former especially, which, research shows, is more common (Ofqual, 2020, p. 15, and fn. 18 for the research cited therein; see also Harlen, 2005; Torrance, 2018). Days before Ofqual decided to follow Scotland, Wales and Northern Ireland, Laura McInerney (2020) acknowledged that, ‘in a world of terrible options, calculating a rough grade was not itself a stupid idea,’ as ‘granting teacher-predicted scores would have almost doubled top grades, meaning some universities and employers would have needed to give places by lotteries, or defer large numbers of students.’ With the government’s U-turn allowing students to keep their best grades – CAG or calculated – this is exactly the position many universities and their prospective students are now in (see Weale & Adams, 2020; Weale, 2020), while lower-tariff institutions will, especially with the sudden lifting of the cap, likely struggle to recruit (TES, 2020). We’ve shifted from a system known to be flawed (Adams, Elgot, Stewart, & Proctor, 2020) but intended to minimise ‘grade-inflation’ (a politically freighted idea) to a forced acceptance of maximal inflation, and the root of the problem seems to be a pernicious mistrust in teachers (Richardson, 2020a, 2020b).

We needn’t be in this situation. The research (Ofqual, 2020, fn. 18) that shows teacher overprediction is more common than underprediction also shows accurate prediction to be more common than both; it suggests, too, a correlation between the end of modular courses and the decoupling of AS and A-levels, on the one hand, and a dip in the accuracy of teacher predictions, on the other. Teachers ‘accurately’ predict students’ final grades around half the time, though that accuracy rate is really a matching score; it says nothing about whether teachers or exam boards grade students’ work ‘better’. The correlation of teachers’ and exam boards’ rank-ordering of students, however, tends to be much stronger (Ofqual, 2020; Harlen, 2005). (A case in point: I know of one departmental head who analysed three years of her department’s predicted and actual A-level grades, and then used that data to predict this year’s outcomes. She was remarkably close. Crucially, though, under her analysis, several students sitting at the border between two grades would have been given the benefit of the doubt.) There is, moreover, research which indicates high validity and internal consistency among rigorous systems of teacher assessment, as well as lower-than-assumed reliability of external assessment (not to mention the psychological impact of the latter on teachers and students) (Harlen, 2005).

Not only are modular course design and devolved, teacher-based assessments common internationally, but they also represent the current norm in higher education in this country – and were once standard in secondary education (Harlen, 2005; Black, 1998). Bearing this in mind, given that secondary schools and exam boards are now geared up for moderation of a much-reduced proportion of centre-assessed work, and considering that we’ve ended up with a version of centre-assessed final awards anyway, one wonders why a plan for rigorous, moderated centre-based assessment wasn’t Plan A. It would have been preferable this year, and it would also have been better for the system to which we now seem bound. Again, though it can’t be proved, it’s hard not to think that the issue is one of trust: had schools and teachers been more involved in students’ summative assessments and awards, and had courses been designed modularly, some of this year’s difficulties might have been avoided. Richer data would have been available on which fairer judgements might have been based. Things wouldn’t have been perfect (could they ever be?), but they likely would have been much better.

This year’s mess should remind us of the basic inadequacies of the current examinations system. We need to redesign the mechanisms of national assessment, with teacher assessment – properly supported and resourced – at the centre. This is neither a new argument nor a new practice (see Harlen, 2005 and references therein). But such change would require, among other things, a re-centring of our educational culture around a presumption of trust in teachers.

References

Adams, R. (2020, August 19). Disadvantaged pupils will be biggest winners from GCSE results. Guardian. Retrieved from https://www.theguardian.com/education/2020/aug/19/disadvantaged-pupils-will-be-biggest-winners-from-gcse-results

Adams, R., Elgot, J., Stewart, H., & Proctor, K. (2020, August 19). Ofqual ignored exams warning a month ago amid ministers’ pressure. Guardian. Retrieved from https://www.theguardian.com/politics/2020/aug/19/ofqual-was-warned-a-month-ago-that-exams-algorithm-was-volatile

Black, P. (1998). Testing: Friend or foe? The theory and practice of assessment and testing. London: Falmer Press.

Dickens, J. (2020, August 19). Pearson announces eleventh-hour grading U-turn on BTECs – telling schools NOT to issue results tomorrow. Schools Week. Retrieved from https://schoolsweek.co.uk/pearson-announces-eleventh-hour-grading-u-turn-on-btecs-telling-schools-not-to-issue-results-tomorrow/

Elgot, J., & Adams, R. (2020, August 19). Ofqual exam results algorithm was unlawful, says Labour. Guardian. Retrieved from https://www.theguardian.com/education/2020/aug/19/ofqual-exam-results-algorithm-was-unlawful-says-labour

Harlen, W. (2005). Trusting teachers’ judgement: Research evidence of the reliability and validity of teachers’ assessment used for summative purposes. Research Papers in Education, 20(3), 245–270. https://doi.org/10.1080/02671520500193744

McInerney, L. (2020, August 15). A-level students are victims of a farce Gavin Williamson had five months to prevent. Guardian. Retrieved from https://www.theguardian.com/education/2020/aug/15/a-level-students-gavin-williamson-farce

Office of Qualifications and Examinations Regulation [Ofqual]. (2020). Awarding GCSE, AS, A level, advanced extension awards and extended project qualifications in summer 2020: Interim report. Retrieved from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/909368/6656-1_Awarding_GCSE__AS__A_level__advanced_extension_awards_and_extended_project_qualifications_in_summer_2020_-_interim_report.pdf

Richardson, M. (2020a, August 18). A-level debacle has shattered trust in educational assessment. The Conversation. Retrieved from https://theconversation.com/a-level-debacle-has-shattered-trust-in-educational-assessment-144640

Richardson, M. (2020b, May 3). Teacher trust and educational assessment in the wake of COVID-19 [Podcast] Ed. Space (episode 2). Retrieved from https://anchor.fm/oliver-belas/episodes/Ed–Space-Episode-2-Mary-Richardson—Teacher-trust-and-educational-assessment-in-the-wake-of-COVID-19-edib1v/a-a23fr8m

Sodha, S. (2020, August 18). The fake meritocracy of A-level grades is rotten anyway – universities don’t need them [Opinion]. Guardian. Retrieved from https://www.theguardian.com/commentisfree/2020/aug/18/a-level-grades-universities-exams

Times Educational Supplement [TES]. (2020). A levels: Higher-tariff universities ‘eat the sandwiches’ of others. Retrieved from

Torrance, H. (2018). The return to final paper examining in English national curriculum assessment and school examinations: Issues of validity, accountability and politics. British Journal of Educational Studies, 66(1), 3–27. https://doi.org/10.1080/00071005.2017.1322683

Weale, S. (2020, August 19). Durham University offers students money to defer entry. Guardian. Retrieved from https://www.theguardian.com/education/2020/aug/19/durham-university-offers-students-money-to-defer-entry

Weale, S., & Adams, R. (2020, August 17). Not all UK students will get first-choice place, universities warn. Guardian. Retrieved from https://www.theguardian.com/education/2020/aug/17/not-all-uk-students-will-get-first-choice-place-universities-warn
