Education Select Committee Review of Primary Assessment 2017: A Response

Harvey Goldstein

The Education Select Committee has concluded its review of assessment systems in Primary schools in England, with a brief to pay particular regard to changes rolled out in 2016. The report reflects concerns about the effects on schools and students of a high stakes testing system. In a useful summary the Committee highlights ‘negative impacts’ such as the ‘squeezing out’ of arts and humanities and other subjects by schools concerned to display a good performance on the Key Stage tests. They also point to the stress that high-stakes testing places on teachers and pupils. On accountability they worry about the way published league tables can distort the curriculum and schools’ behaviour. They also comment on the pros and cons of the DfE proposal to introduce new baseline tests.

In this note I will focus on two of their recommendations related to accountability: 3-year rolling averages and baseline tests.

3-year rolling averages

One recommendation (#15) is to replace the annual publication of school test results with a 3-year rolling average.

The precise details of how they propose to go about this are not clear. Do they mean, for example, a simple report of KS2 results or a more nuanced value added one? In fact they entirely avoid discussing value added comparisons and the report is written as if the term ‘accountability’ were self-explanatory.

The notion of a 3-year rolling average is advanced as a way to ‘lower the stakes’ and avoid using data from ‘small numbers of pupils’. But this shows that they have failed to understand that the way to guard against making faulty inferences from small numbers is to provide confidence intervals. Despite the considerable amount of published work on this, some of which was made available to them, they make no reference to it. In advocating 3-year rolling averages, they also fail to mention that these averages would still be published every year, so it is really quite difficult to see how the proposal would do much to relieve stress or stop teaching to the test.
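To illustrate what a confidence interval conveys for a small cohort, here is a minimal sketch. It assumes a simple normal approximation with a known pupil-level standard deviation; the function name and all figures are hypothetical, and the published work on school comparisons generally uses more elaborate models than this.

```python
import math

def mean_confidence_interval(school_mean, pupil_sd, n_pupils, z=1.96):
    """Approximate 95% interval for a school's mean score.

    Assumption: pupils' scores are independent with a known standard
    deviation, so the standard error of the mean is pupil_sd / sqrt(n).
    """
    half_width = z * pupil_sd / math.sqrt(n_pupils)
    return school_mean - half_width, school_mean + half_width

# Made-up figures: same mean and spread, different cohort sizes.
small_school = mean_confidence_interval(100.0, 10.0, 16)
large_school = mean_confidence_interval(100.0, 10.0, 100)
print(small_school)  # wide interval: roughly 95 to 105
print(large_school)  # narrow interval: roughly 98 to 102
```

The interval widens as the cohort shrinks, which makes the uncertainty visible rather than merely diluting it by pooling years.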

Furthermore, if we stop single-year publication of school results in 2020 and start a 3-year rolling average in 2021, then because we know the one-year averages for 2019 and 2020 it is a matter of arithmetic to work out what the single-year average for 2021 is: the media would quite quickly latch on to that one. Moreover, thereafter it would continue to be possible to work out each yearly average. It seems unlikely that the DfE would happily suspend publication for the 2 years that would be needed to avoid this. So it would seem that this really is a non-starter!
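The arithmetic can be sketched as follows. This assumes, since the Committee does not specify it, that each rolling figure is the unweighted mean of the three most recent single-year averages; a pupil-weighted version would also need the cohort sizes. The function name and the figures are made up for illustration.

```python
def recover_yearly(known_first_two, rolling_averages):
    """Recover single-year averages from 3-year rolling averages.

    Assumption: each rolling average r is (a + b + c) / 3, where a, b, c
    are the three most recent single-year averages, so c = 3*r - a - b.
    """
    yearly = list(known_first_two)
    for r in rolling_averages:
        yearly.append(3 * r - yearly[-2] - yearly[-1])
    return yearly

# Made-up figures: single-year results for 2019 and 2020, then the
# rolling averages published in 2021 and 2022.
published_2019_2020 = [100.0, 103.0]
rolling_2021_2022 = [102.0, 104.0]
print(recover_yearly(published_2019_2020, rolling_2021_2022))
# → [100.0, 103.0, 103.0, 106.0]
```

Each newly recovered year feeds the next step, so the single-year results remain recoverable for as long as the rolling averages are published.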

Baseline tests

The report does raise several concerns about the DfE proposals to introduce new baseline tests and suggests that they be teacher assessed and not publicly reported. However, the Report has little to say about the DfE’s wish to use baseline tests to adjust KS2 test results in a value added framework. Yet their use for this purpose is of some concern and requires careful discussion.

Most importantly, teacher assessed tests of the kind the Report recommends are really not suitable for a value added system, even though they may provide useful information to assist student learning.

Where the Report does discuss the use of a baseline measure for value added assessment, it fails to provide a critique, possibly because it realises that the teacher assessed baseline it recommends could not be used for this purpose. In fact it seems quite likely that the DfE will go ahead with a standardised common baseline test anyway (it has already piloted this), so that the Report is already out of date and will come to be seen as increasingly irrelevant on this issue.

There are, of course, considerable problems with using a baseline test either to predict individual KS2 results or to adjust school KS2 results within a value added framework. One problem is that baseline tests taken at the start of school are only moderately useful for prediction. Correlations with KS2 test scores are at most around 0.5, and rather smaller than this where subscales or components of tests are concerned. Another problem is children’s mobility: many will change school over the primary period, and dealing with this requires sophisticated techniques.
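To put a correlation of 0.5 in context, the short sketch below assumes a simple linear prediction of the KS2 score from the baseline score; the function name is hypothetical. Under that assumption the squared correlation gives the share of variance in KS2 scores accounted for by the baseline.

```python
import math

def prediction_summary(r):
    """Summarise a simple linear prediction with correlation r.

    Returns the share of outcome variance accounted for (r squared)
    and the residual standard deviation as a fraction of the outcome
    standard deviation, sqrt(1 - r squared).
    """
    explained = r ** 2
    residual_sd_ratio = math.sqrt(1 - r ** 2)
    return explained, residual_sd_ratio

explained, residual_sd_ratio = prediction_summary(0.5)
print(explained)          # 0.25
print(residual_sd_ratio)  # about 0.87
```

So even at the upper end quoted, the baseline accounts for about a quarter of the variation in KS2 scores, and the spread of prediction errors is still about 87% of the original spread.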

In short, while the Committee’s report does have some useful things to say, including advice to OFSTED to take far less account of test data when compiling its reports, it fails to follow through, both in its recommendations and in offering any kind of in-depth discussion of difficult issues. This really is a pity, since there are so few bodies these days that come anywhere near being able to hold Government to account.