Blog post

What can experiment tell us in educational research?

Gary Thomas, Professor at University of Birmingham, 8 Oct 2020

In education, we’ve put a straitjacket around our notion of experiment. We adhere steadfastly to what Parlett and Hamilton (1972) famously called the ‘agricultural-botany paradigm’. But the agricultural-botany experiment isn’t the way that ‘experiment’ is thought about across the broad spectrum of the natural and applied sciences. Everywhere else, scientists are much more open-minded about what an experiment might be.

In my new article in the British Educational Research Journal (Thomas, 2020), I examine our loyalty to the agricultural-botany experiment. I take a brief look at the history of experiment in education and conclude that there have been three incarnations of enthusiasm for experiment-based inquiry in education research – all beginning with zeal and gusto, but each ending in disappointment.

The first incarnation was in the interwar years, which followed the successes of experimental psychology in the 1920s but ended in pessimism about what experiment could tell us. The next – the second coming – began in the 1960s, lasted until the 1980s, and followed Campbell and Stanley’s 1963 exegesis on experimental design, which taxonomised experiment forms. This spawned a new enthusiasm and a fresh tranche of large-scale projects. These again ended in disappointment, though, as it appeared that major, multimillion-dollar interventions such as Head Start were having little effect. Indeed, in 1981, Gene Glass – perhaps the leading quantitative researcher of education in the 20th century, and one who had been heavily involved in the post-60s tranche of experiments – came to the conclusion that ‘…the deficiencies of quantitative, experimental evaluation are thorough and irreparable’ (Glass & Camilli, 1981, p.23).

The third coming happened around the turn of the millennium, after the disappointments of the second tranche were attributed to inadequate randomisation in those earlier experiments (see Cook, 2001). The solution? Randomisation. And off we went again with a third tranche, this time of randomised experiments, emerging from work funded by the What Works Clearinghouse in the US and the Education Endowment Foundation in the UK.

We’re now at a point where the findings of this third, post-millennium tranche have themselves been evaluated. These evaluations repeat the findings from the first and second tranches of experiment: the impact of interventions, they tell us, is routinely low (Malouf & Taymans, 2016; Lortie-Forgues & Inglis, 2019). The comments of the evaluators of the new experiments mirror those of the evaluators of the tranche that came before: US evaluators concluded that their findings about the 21st century tranche of What Works studies painted a dim picture of the evidence base on education interventions, while in the UK evaluators concluded that the current tranche is yielding ‘small and uninformative effects’.

All this instils an odd sense of déjà vu.

What should we learn from these repeated disappointments? I argue in my paper (Thomas, 2020) that a potential explanation for the frailties of experiment (that is to say, experiment in the agricultural-botany tradition) in social research rests in the power law principle – more commonly called Pareto’s principle or, in different domains of study, Zipf’s law or the ‘law of the vital few’. The underlying idea here is that the stability-of-effect assumptions that dominate much social research using experiment are misplaced. The influence of particular facets of social life is pervasive, unstable and disproportional, such that they will always overwhelm the influence of interventions of interest.

These influences cannot be dismissed merely as ‘noise’. They are a fundamental part of the social landscape and they will have their effect not simply by virtue of their value, but via their interaction with other variables, activating or deactivating the potency of other potential determinants of change. It follows that a few highly significant variables may determine the ultimate effectiveness of most interventions. More than this, the influence of these variables may increase, decay or fluctuate with time, making interaction effects complex and unpredictable. Pareto’s principle offers a means of understanding the apparently nugatory and/or short-lived impact of much education innovation, as well as the inadequacies of formal experiment (in the agricultural-botany tradition) to assess any such impact.
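To make the scale of the problem concrete, here is a minimal simulation sketch. It is not the analysis in the article, and every number in it is an assumption chosen for illustration: pupils' background influences are drawn from a Pareto distribution with a tail index of 1.5, the intervention adds a small fixed benefit of 0.1, and each trial has 100 pupils per arm. It also leaves out the interactions and time-varying influences described above, so if anything it understates the difficulty. The question it asks is simply how far the estimated effect wanders from one perfectly randomised trial to the next.

```python
import random
import statistics


def simulate_trial(rng, n_per_arm, true_effect, tail_index):
    """Run one randomised two-arm trial and return the estimated effect.

    Each pupil's outcome is a heavy-tailed background influence
    (an illustrative Pareto draw), plus `true_effect` if treated.
    """
    control = [rng.paretovariate(tail_index) for _ in range(n_per_arm)]
    treated = [rng.paretovariate(tail_index) + true_effect
               for _ in range(n_per_arm)]
    # The usual estimate: difference between the two arm means.
    return statistics.mean(treated) - statistics.mean(control)


def spread_of_estimates(n_trials=500, n_per_arm=100,
                        true_effect=0.1, tail_index=1.5, seed=0):
    """Repeat the trial many times; return all estimates and their spread."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = [simulate_trial(rng, n_per_arm, true_effect, tail_index)
                 for _ in range(n_trials)]
    return estimates, statistics.stdev(estimates)


estimates, spread = spread_of_estimates()
# Share of trials in which a genuinely helpful intervention looks harmful.
wrong_sign = sum(e < 0 for e in estimates) / len(estimates)

print(f"true effect:                {0.1}")
print(f"spread of estimated effect: {spread:.2f}")
print(f"trials with the wrong sign: {wrong_sign:.0%}")
```

Under these assumed settings the trial-to-trial spread is larger than the effect being sought, and a substantial share of trials point in the wrong direction even though the intervention really does help every pupil by the same amount. Randomisation is doing its job here; it is the disproportionate background influences that swamp the signal.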

It’s time to get out of the straitjacket and to take a more catholic view of what experiment might be in education research.

This blog is based on the article ‘Experiment’s persistent failure in education inquiry, and why it keeps failing’ by Gary Thomas, published in the British Educational Research Journal on an open-access basis.

References

Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi‐experimental designs for research. Boston, MA: Houghton Mifflin Co.

Cook, T. D. (2001). Reappraising the arguments against randomized experiments in education: An analysis of the culture of evaluation in American schools of education. Chicago: Northwestern University, Chicago.

Glass, G. V., & Camilli, G. A. (1981). ‘Follow Through’ Evaluation. Viewpoints (120). Washington DC: National Institute of Education.

Lortie-Forgues, H., & Inglis, M. (2019). Rigorous large-scale educational RCTs are often uninformative: should we be concerned? Educational Researcher, 48(3), 158-166.

Malouf, D. B., & Taymans, J. M. (2016). Anatomy of an evidence base. Educational Researcher, 45(8), 454-459.

Parlett, M., & Hamilton, D. (1972). Evaluation as illumination: A new approach to the study of innovatory programs. Occasional paper. Edinburgh: Edinburgh University Centre for Research in the Educational Sciences.

Thomas, G. (2020). Experiment’s persistent failure in education inquiry, and why it keeps failing. British Educational Research Journal. Advance online publication. https://doi.org/10.1002/berj.3660

Gary Thomas, Professor at University of Birmingham

Being of a nervous disposition as a child, Gary Thomas failed to write anything on his 11-plus examination paper, which inaction took him to secondary modern school. His subsequent zigzag through the education system gave him broad experience of its good and bad sides. He eventually became a teacher, then an educational psychologist, then a professor of education at the University of Birmingham (and four previous universities) where his teaching, research and writing now focus on inclusive education and the methods used in social science research. He has led a wide range of research projects and has received awards from the AHRC, the ESRC, the Nuffield Foundation, the Leverhulme Trust, the Department for Education, charities such as Barnardos and the Cadmean Trust, local authorities and a range of other organisations. He has written or edited more than 20 books and lots of boring academic articles.
