Meta-analyses can play an important role in educational research. Aggregating results from different but comparable studies to estimate an average effect size can be highly informative. However, the validity of a meta-analysis mean depends on the quality of the studies included.
In my recently published paper (Fullard, 2023), I review 42 randomised controlled trials (RCTs) from a meta-analysis performed by Fletcher-Wood and Zuccollo (2020) investigating the effect of teacher professional development on pupil outcomes. Of the 42 RCTs reviewed, only 10 are valid tests of the meta-analysis hypothesis. Moreover, when the invalid tests are excluded, the meta-analysis mean falls to approximately zero. This demonstrates that the positive effect reported by Fletcher-Wood and Zuccollo (2020), and the subsequent policy conclusions, are entirely driven by poor research methods.
One of the conclusions from my paper is that a general improvement in empirical methods in education research is necessary to help researchers a) design more robust experiments, and b) evaluate the quality of existing experiments.
Designing more robust experiments
Roughly 25 per cent of the RCTs included in Fletcher-Wood and Zuccollo (2020) have statistically significant differences between the treatment and control groups on key covariates. This means that the control group cannot be used to estimate what would have happened to the treatment group had they not received the treatment, and it suggests poor design. In a large RCT, where data on schools, teachers and students are collected before any intervention takes place, researchers should check whether there are differences between groups at baseline and, if necessary, re-randomise.
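The baseline check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in any of the studies reviewed: the school identifiers and scores are synthetic, the function names are hypothetical, and the 0.1 threshold on the standardised difference is an arbitrary choice made for the example.

```python
import random
import statistics


def standardised_difference(treated, control):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_var = (statistics.variance(treated) + statistics.variance(control)) / 2
    if pooled_var == 0:
        return 0.0
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_var ** 0.5


def randomise_with_balance(scores, threshold=0.1, max_draws=1000, seed=0):
    """Re-draw the assignment until the baseline covariate is balanced.

    scores maps each unit (here, a school) to its pre-intervention score.
    threshold is an assumed cut-off for this sketch, not an established rule.
    """
    rng = random.Random(seed)
    ids = sorted(scores)
    half = len(ids) // 2
    for _ in range(max_draws):
        shuffled = ids[:]
        rng.shuffle(shuffled)
        treatment, control = shuffled[:half], shuffled[half:]
        diff = standardised_difference(
            [scores[i] for i in treatment],
            [scores[i] for i in control],
        )
        if abs(diff) < threshold:
            return treatment, control, diff
    raise RuntimeError("no balanced assignment found; consider relaxing the threshold")


# Synthetic baseline scores for 40 hypothetical schools (not real data).
data_rng = random.Random(1)
baseline_scores = {f"school_{i:02d}": data_rng.gauss(100, 15) for i in range(40)}

treatment, control, diff = randomise_with_balance(baseline_scores)
print(f"standardised baseline difference: {diff:.3f}")
```

A real trial would check several covariates at once and would fix the balance rule in advance, since the later analysis needs to reflect the fact that the assignment was re-drawn.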
Experimenter demand effects (EDE) could also bias the estimates in this context, and the studies reviewed do very little to mitigate this potential source of bias (for instance, by including a placebo treatment arm). Teachers know they are taking part in a study, and this might change their behaviour. This is problematic because EDE are likely to be stronger in the treatment group (those who are more actively involved in the study) than in the control group (business as usual). In this setting EDE are likely to exist and to produce upward bias.
As the impact of teacher professional development (PD) on pupil outcomes is generally modest, without improvements in experimental design, any positive effects observed – and attributed to PD – could be driven by EDE.
Effectively evaluating existing studies
Researchers performing a meta-analysis of RCTs should have the training to identify whether an RCT was carried out successfully. For instance, studies that do not have a control group should easily be identified as invalid, something that Fletcher-Wood and Zuccollo (2020) failed to do.
Researchers should also be able to identify whether an experiment is designed to estimate what they are interested in. Many of the studies reviewed investigate the effect of multiple interventions on pupil outcomes, not just PD. This is not necessarily a problem if the RCT is designed to identify the effect of PD individually by using multiple treatment arms. However, Fletcher-Wood and Zuccollo’s (2020) meta-analysis includes eleven RCTs that have multiple interventions in one treatment group. These experiments are not designed to estimate the causal effect of PD and should not be included, because the effect of professional development is not cleanly identified.
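To make the screening step concrete, the sketch below filters a list of studies on three of the checks discussed in this post and recomputes a pooled mean. Everything in it is an assumption made for illustration: the four studies and their numbers are invented and are not taken from the meta-analysis, the field names are hypothetical, and the inverse-variance weighted mean is a standard pooling formula that may differ from the one used by Fletcher-Wood and Zuccollo (2020).

```python
from dataclasses import dataclass


@dataclass
class Study:
    name: str
    effect: float               # standardised effect size
    std_error: float            # standard error of the effect size
    has_control_group: bool
    isolates_pd: bool           # PD is not bundled with other interventions
    balanced_at_baseline: bool


def is_valid_test(study):
    """A study counts as a valid test only if it passes all three checks."""
    return (
        study.has_control_group
        and study.isolates_pd
        and study.balanced_at_baseline
    )


def pooled_mean(studies):
    """Inverse-variance weighted mean effect size."""
    weights = [1 / s.std_error ** 2 for s in studies]
    return sum(w * s.effect for w, s in zip(weights, studies)) / sum(weights)


# Invented studies for illustration only (not real data).
studies = [
    Study("A", 0.30, 0.10, has_control_group=False, isolates_pd=True, balanced_at_baseline=True),
    Study("B", 0.10, 0.20, has_control_group=True, isolates_pd=False, balanced_at_baseline=True),
    Study("C", 0.05, 0.10, has_control_group=True, isolates_pd=True, balanced_at_baseline=True),
    Study("D", -0.05, 0.10, has_control_group=True, isolates_pd=True, balanced_at_baseline=True),
]

valid_studies = [s for s in studies if is_valid_test(s)]

print(f"pooled mean, all studies:   {pooled_mean(studies):.3f}")
print(f"pooled mean, valid studies: {pooled_mean(valid_studies):.3f}")
```

In this toy example the pooled mean is positive only because of the two studies that fail the checks, which is the pattern the screening step is meant to expose.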
The findings from the meta-analysis performed by Fletcher-Wood and Zuccollo (2020) are highly influential: they are cited as evidence for the effectiveness of teacher professional development (PD) on pupil outcomes in government reports (Ofsted, 2023; DfE, 2020) and academic papers, and have been used to inform a cost–benefit analysis (Van den Brande & Zuccollo, 2021) that finds an increase in PD will have a net societal benefit of over £61 billion. Yet my recent investigation shows that the positive meta-analysis mean they report is entirely driven by poor research methods (Fullard, 2023). When the invalid studies are excluded from the meta-analysis, the selection criteria are adjusted and a valid study that was inappropriately excluded is included, the meta-analysis mean falls from 0.09 to -0.008.
Department for Education [DfE]. (2020). National Professional Qualification (NPQ): Leading teacher development framework. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/925511/NPQ_Leading_Teacher_Development.pdf
Fletcher-Wood, H., & Zuccollo, J. (2020). The effects of high-quality professional development on teachers and students: A rapid review and meta-analysis. Education Policy Institute. https://epi.org.uk/publications-and-research/effects-high-quality-professional-development/
Fullard, J. (2023). Invalid estimates and biased means. A replication of a recent meta-analysis investigating the effect of teacher professional development on pupil outcomes. Social Sciences & Humanities Open, 8(1), 100605. https://doi.org/10.1016/j.ssaho.2023.100605
Ofsted. (2023). Independent review of teachers’ professional development in schools: Phase 1 findings. https://www.gov.uk/government/publications/teachers-professional-development-in-schools-phase-1-findings/independent-review-of-teachers-professional-development-in-schools-phase-1-findings
Van den Brande, J., & Zuccollo, J. (2021). The effects of high-quality professional development on teachers and students: A cost–benefit analysis. Education Policy Institute. https://epi.org.uk/wp-content/uploads/2021/04/EPI-CPD-entitlement-cost-benefit-analysis.2021.pdf