NAPLAN: Evaluation is just good policy
Teachers' Journal, Vol 123 No 2, 9 March 2018, p5
The fact that the chairperson of ACARA, Professor Steven Schwartz, has come out in defence of NAPLAN is hardly a surprise. It would be a poor leader who didn’t advocate for their organisation and the work that it does.
What is most surprising is not Schwartz’s low opinion of teachers, but ACARA’s ongoing resistance to evaluation. The expectation that governments and authorities evaluate their policies as a normal part of their obligations has been around for decades, at least since Lasswell’s 1956 policy cycle outlined a rational, cyclical method for developing, implementing and evaluating public policy. It is good public policy to focus on the intended results and unintended consequences of policies like NAPLAN so that improvements can be made, aims reformulated or the project terminated.
Ten years after NAPLAN was first conducted, it is high time to systematically evaluate NAPLAN and MySchool. As it currently stands, NAPLAN has not been systematically evaluated by independent experts with the aim of determining the merit and worth of the policy, whether it is meeting its objectives, how it might be improved and how it impacts key stakeholders such as children, parents, teachers and principals. ACARA produces technical reports, politicians have set up parliamentary inquiries and academics have conducted a range of research projects, but none of these is adequate as a systematic evaluation of the intended results and unintended consequences of NAPLAN and MySchool.
This is particularly important for NAPLAN because, as I’ve long argued, policymakers made a mistake in its development. The two aims that policymakers have for NAPLAN, “to help drive improvements in student outcomes and provide increased accountability for the community” (ACARA, 2011), can confound each other in practice. The first aim, driving improvement, relies on the belief that more and better data enables better intervention and monitoring. The second aim, providing increased accountability, is driven by the logic that holding individuals and organisations to account motivates them to do better. The structural problem is that using NAPLAN for accountability can work against using it for diagnostic and educative purposes. As Nichols and Berliner argued in the US, when you try to use test scores to hold teachers, principals and schools to account, you can corrupt the measure that you are using, and this can prevent the realisation of policy goals and objectives. This is Campbell’s Law, which states that “the more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor”.
In a recent article published in Assessment in Education: Principles, Policy & Practice, David Rutkowski made a strong case for jurisdictions to engage in regular meta-evaluations of their large-scale assessments in order to improve them and make them more responsive to contemporary needs. Rutkowski argues that tests like NAPLAN are already evaluations, so what is needed is a meta-evaluation that evaluates the testing regimes themselves. Evaluations should be undertaken independently to ensure quality control, and they should include active participation by stakeholders to put some level of evaluative power into their hands “so that they can – at least to some degree – determine the merit and worth of these [national and] international evaluations in their local context and thus create clear suggestions of what is needed to improve”. Rutkowski, borrowing from Slavin, suggests a meta-evaluation would consider such criteria as the validity of data and its inferences, the credibility of those designing and administering NAPLAN, the clarity of the data presented to stakeholders such as individual student reports and school comparisons on MySchool, the ethics or propriety of the tests and the cost-utility, both in dollars and time, of the programme. The range of these criteria also explains why it is simply not enough to let the economists at the Productivity Commission do a cost-benefit analysis and call this an evaluation.
It would be good to think that, through a commitment of funding to an external, independent evaluation by a team that represents a broad, rather than partisan, range of education stakeholders, we might make progress in figuring out what to do about NAPLAN and MySchool. The worst-case scenario is that we are still having the same conversations in 2028.
Greg Thompson
Associate Professor, Faculty of Education,
QUT School of Teacher Education and Leadership