Blog post. Part of series: Artificial Intelligence in educational research and practice

The future of the artificially intelligent examination

Mary Richardson, Professor of Educational Assessment at University College London, 22 May 2023

In education, it is common for people to talk about the use of artificial intelligence (AI) in assessment as something in the future, but it’s not – it’s here, and it’s been working away in the background for some years now. AI in educational settings is still viewed with a degree of suspicion and commonly met with fears relating to the ‘rise of the machines’ (Richardson & Clesham, 2021). However, given the prevalence of AI technologies across so many aspects of our everyday lives, its role in educational assessment needs attention. There won’t be many people reading this blog post who really think about the way that AI-led technologies already command aspects of our lives – from targeted advertising to satellite navigation or managing the weekly grocery shop. Such AI-led events are accepted as a part of the fabric of life and this is perhaps why they are invisible to us; it is only when they impact something really important, such as exam results, that they command our attention.

Over the past decade, there has been significant growth in the development and application of AI within educational settings, particularly related to the value of big data, algorithmic advances in testing and significant increases in computing power more generally. English language testing is at the forefront of using AI technologies within tests for selection, to ensure that students have a valid and reliable assessment for entry into universities.

The loss of a ‘human touch’ in assessment raises some issues about trust in test experiences and results (Falkner et al., 2014). But our perception of new technologies is shaped by the trust we have in them. The so-called grading ‘debacle’ in England in the summer of 2020, when students could not sit national exams and data from their teachers was used in grade awarding, revealed a very flexible view of who test takers could trust. The teacher-assessment data met the statistical algorithms used to model data from markers in exam boards, and the result was a national outcry because some 39 per cent of students received lower grades than expected. The debates highlighted a poor understanding of just what algorithms do in assessment practice, including an infamous quote from the prime minister blaming a ‘mutant algorithm’ (Stewart, 2020), as if a mathematical model might have a mind of its own! Such misunderstandings suggest we need better education about AI and its influence on assessment practice.

Good practice in the use of AI technologies is well documented: see the Transparency Model (Chaudhry et al., 2022, p. 2) outlining the ethics of using AI and the need for EdTech companies to explain their technology in ways that are accessible and understandable. Trust underpins the intrinsic value of any assessment, but particularly tests. What AI brings to assessment are scoring models that are data-driven and verifiable in ways that human scoring often is not. AI systems are faster and less error-prone than human beings, and they don’t harbour the ‘halo effects’ that are a natural part of the human condition. But these promises come with a caveat: the AI systems have to be trained on sound and representative samples, and there are some areas or types of assessment that just don’t lend themselves to an AI-led approach. Despite recent claims from a Google employee that a chatbot was becoming sentient, it seems that we are a long way from seeing emotions rising in the machines. Good use of AI requires creative approaches to assessment; it’s not good enough to simply ‘put paper behind glass’. At the heart of assessment practice lies an imperfect, messy and complex process. Looking forward, the question to ask is not ‘should we use more AI in assessment?’ but ‘why not use more in appropriate ways?’

References

Chaudhry, M. A., Cukurova, M., & Luckin, R. (2022). A transparency index framework for AI in education. arXiv preprint. arXiv:2206.03220.

Falkner, N., Vivian, R., Piper, D., & Falkner, K. (2014). Increasing the effectiveness of automated assessment by increasing marking granularity and feedback units. Proceedings of the 45th ACM Technical Symposium on Computer Science Education – SIGCSE ’14, 9–14. https://doi.org/10.1145/2538862.2538896

Richardson, M., & Clesham, R. (2021). Rise of the machines? The evolving role of AI technologies in high-stakes assessment. London Review of Education, 19(1). https://doi.org/10.14324/LRE.19.1.09

Stewart, H. (2020). Boris Johnson blames ‘mutant algorithm’ for exams fiasco. Guardian. https://www.theguardian.com/politics/2020/aug/26/boris-johnson-blames-mutant-algorithm-for-exams-fiasco

Mary Richardson, Professor of Educational Assessment at University College London

Mary is Professor of Educational Assessment in the Dept for Curriculum, Pedagogy and Assessment at UCL Institute of Education in London. She teaches on the MA in Assessment and supervises doctoral students interested in assessment, ethics and children’s rights. Mary is interested in the intersection of technical and philosophical elements of educational assessment. She is currently examining the role of artificial intelligence in the practice of high-stakes testing, and leading the reporting for England on the international Trends in Mathematics and Science Studies (2023) for the DfE. She sits on Research Advisory Groups for AQA, Qualifications Wales and the NCFE. She is an executive editor for the journal Assessment in Education. Her recent book, Rebuilding Public Confidence in Educational Assessment (UCL Press, 2022), focuses on her continued interest in how we communicate about assessment in public spaces and what this means for test takers and their learner identities.