’Twas the day before SATs, when all through the land, pupils were working to their teachers’ astonishment. The Key Stage 2 SATs are upon us again. A combination of anticipation, loathing, acceptance and fear may be in the minds of many Year 6 teachers and senior leaders today.
The end of Key Stage 2 SATs marks a significant assessment point in a child’s academic journey. For teachers and leaders, it is the potential accountability consequences, rather than the assessments themselves, that can lead to worry, escalating for many into fear.
In his recent address to the National Association of Headteachers’ Conference, the Secretary of State for Education, Damian Hinds MP, stated:
Fear of inspection. Fear of a single set of bad results. Fear of being forcibly turned into an academy – all of this can create stress and anxiety, and that can percolate through the staff.
Ladies and gentlemen, we can do better than this ….
I intend this to replace the current confusing system of having both a ‘below-the-floor’ standard and ‘coasting’ standards for performance.
There will be a single, transparent data trigger at which schools will be offered support in this way. We will consult on how this single measure should work.
Seeking no accountability is not a realistic option, nor should we, as a profession, consider it an acceptable one. However, intelligent and proportionate accountability still seems a long way off; it needs to start with us acting responsibly in our schools and intelligently within the debate that will soon be launched. But what would you propose? What should the “single, transparent data trigger”, used as a proxy for a school’s effectiveness in terms of the education provided, look like?
Please read the musings below; your responses will help Headteachers’ Roundtable formulate a policy suggestion or two. I’m going to assume that the Key Stage 2 SATs are here to stay and play around with other ideas.
Option 1 – Baseline Assessment followed by Key Stage 2 SATs
In its 2016 Green Paper, published shortly after the three different reception baseline assessments debacle, the Headteachers’ Roundtable suggested that it could be a decade before a viable multi-year contextualised value-added measure would be in place; Key Stage 1 SATs could then disappear into history. The group proposed the use of a single standardised, objective, teacher-led reception baseline assessment linked to the Key Stage 2 SATs as a way of measuring value added.
I sense the reception baseline is already hitting stormy waters; when esteemed organisations pull out of the tender process, I tend to get worried. A further concern is that the reception baseline may be no more, or even less, reliable than the Key Stage 1 teacher assessment data that is currently used. Without a Plan B, Option 1 is the only one on the table; time for Option 2.
Option 2 – Pupil Referenced Value Added Data
Another way to create a value-added measure would be to use a contextualised, pupil-referenced approach.
Once the SATs have been marked, the national mean standardised score for a particular sub-group is determined, e.g. white disadvantaged boys. The score of each white disadvantaged boy in a school is then compared to that sub-group average and a +/- difference recorded. Repeat this for the other sub-groups and pupils; the school’s value-added measure is the aggregate of all its pupils’ individual pupil-referenced scores.
Sub-groups could be as broad or as narrow as made statistical sense; using gender, disadvantage, ethnicity, SEN and EAL would produce different national sub-groups against which to compare individual pupils’ SATs scores. It would make sense to roll together a three-year view of the data for an individual primary school, since each year’s pupil cohort tends to be relatively small.
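To make the arithmetic concrete, here is a minimal sketch of the pupil-referenced calculation in Python. The pupil records, field names and sub-group definition are illustrative assumptions rather than part of any proposal; the point is simply the mechanics: national sub-group means, per-pupil differences, then an aggregate per school.

```python
from collections import defaultdict
from statistics import mean

# Illustrative pupil records: school, standardised KS2 score and the
# characteristics used to form sub-groups (all field names are assumptions).
pupils = [
    {"school": "A", "score": 104, "gender": "M", "disadvantaged": True,  "eal": False},
    {"school": "A", "score": 99,  "gender": "F", "disadvantaged": False, "eal": True},
    {"school": "B", "score": 101, "gender": "M", "disadvantaged": True,  "eal": False},
    {"school": "B", "score": 97,  "gender": "F", "disadvantaged": False, "eal": True},
]

def sub_group(pupil):
    """Key a pupil to a national sub-group by their characteristics."""
    return (pupil["gender"], pupil["disadvantaged"], pupil["eal"])

# 1. National mean standardised score for each sub-group.
scores_by_group = defaultdict(list)
for p in pupils:
    scores_by_group[sub_group(p)].append(p["score"])
group_means = {g: mean(s) for g, s in scores_by_group.items()}

# 2. Each pupil's +/- difference from their sub-group's national mean,
#    aggregated (here averaged) by school to give a value-added figure.
diffs_by_school = defaultdict(list)
for p in pupils:
    diffs_by_school[p["school"]].append(p["score"] - group_means[sub_group(p)])
school_value_added = {s: round(mean(d), 2) for s, d in diffs_by_school.items()}

print(school_value_added)  # e.g. {'A': 1.25, 'B': -1.25}
```

The three-year view suggested above could be produced by pooling pupil records from consecutive cohorts before the school-level aggregation step.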
This way of determining progress could be modelled using the last two years of Key Stage 2 results as well as this year’s. If the approach gained general acceptance and was implemented within the next few years, Key Stage 1 SATs could be stopped immediately.
Schools could still carry out reception baseline assessments, but they would be used inside the school for formative purposes. Each school would be able to choose the baseline assessment it considered of greatest benefit to its children’s development.
Option 3 – Using Multiple Data Sets
There is a certain statistical sense in using multiple proxy measures to look at school effectiveness; no one approach is perfect, but together they would identify the few schools that, no matter how you cut the data, don’t seem to be giving pupils a good deal.
The downside is that you may need a series of different measures. Education Datalab’s analysis of the 2016 Key Stage 2 data shows that which schools are identified as underperforming depends on the methodology you use. The question is whether this is data overload or a reasonable approach, given the potentially high stakes of falling below the new measure; it would be more beneficial if one set of data, the Key Stage 2 SATs, could be cut in different ways so that we don’t end up with assessment overload.
We would be interested in what you think; please vote and tweet out to others. If we don’t offer reasonable and realistic alternatives then we may have yet another mess foisted upon us. Thanks.
If you’re involved with the Key Stage 2 SATs, good luck tomorrow and all of next week; I hope things run smoothly.
Thanks for adding to the debate, Stephen. I have voted for a pupil-referenced approach initially here, as I have a feeling it will prove to be a more reliable source of comparative data than something that stems from a baseline test. I think this option is worth exploring, not least because it’s something we could play with now rather than waiting years to come. If I understand your proposal correctly, we could track back and look at how it might have applied to cohorts from years gone by?
With the stakes on the outcome of the new baseline test potentially so high, the temptation to try to influence the assessment so that it comes out as low as possible will be too much for many, and that will render the overall progress measure an unreliable and untrusted measure of a school’s performance. There are also too many complications around children transferring between schools, and around what to do with infant and junior schools or those in the three-tier system, for it to end up being successful.
I think the thing I disagree with most is the notion of a ‘single measure’. I disagree with Progress 8 for this reason and, of course, with the primary progress measures. Learning is so complex, and the value and quality of any school is expressed in so many different ways, both statistical and non-statistical. If we have a single measure, schools will inevitably put all their eggs in that basket to influence it. As Amanda Spielman (HMCI) famously said:
“But most of us, if told our job depends on clearing a particular bar, will try to give ourselves the best chance of securing that outcome”.
So for this reason, I’m also comfortable with us continuing to have a number of different measures that can be interpreted by intelligent humans (probably with the help of intelligent machines) to build a picture of how well a school is performing – this would be a good starting point for accountability conversations.
Of course my real answer to this question, like many, is ‘I don’t know’. I haven’t seen the proposed baseline and I’d like to see more flesh on the bones of both the pupil referenced approach and what the range of ‘multiple measures’ might be.
Look forward to continuing the conversation…
This is almost a blog in itself, Tom; thanks for adding. Some great points.