Over recent months I’ve been involved in interviews for a number of posts across the Multi Academy Trust. One of our favourite questions has been, “What will assessment look like once levels are dead?” The answers have on the whole been a bit confused.
This post is based on a webinar I delivered for Optimus Education in March 2015.
A Necessary Confusion
Teachers and potential leaders are struggling to imagine life after levels, proposing that we keep with levels or that we produce our own levelling systems for Key Stage 2 or 3. It’s a necessary confusion as we think our way from our current to a new assessment system. Life after levels is primarily a curriculum issue, not a data issue. I worry that too many schools are rushing to create new labels which don’t actually mean anything.
Levels were removed in September 2014, with the introduction of the new National Curriculum, and will be reported for the final time in Summer 2015 for Years 2 & 6. From 2015/16, Key Stage 2 test outcomes “will be reported as a scaled score (in Mathematics, reading and SPAG), where the expected score is 100.” There will also be Teacher Assessments for writing and other tested elements based on descriptors released by the Department for Education. There is nothing stipulated for assessment in Key Stage 3. The option of staying with levels, as we know them, doesn’t exist.
New Horizons

Photo Credit: Jason Devaun via Flickr cc
As the sun sets on the World of Levels, we need to lift our eyes to the horizon and make sure decisions about our new assessment systems are taking us in the right direction. It is time to shed some of our old assessment practices. Dylan Wiliam (2014) urges leaders to think about Decision Driven Data Collection. The key assessment decisions would seem to be, “Where is this student up to in her/his learning?” And then, “Where should s/he go next?” These questions should drive the data we collect.
Principle 1 – Assessment Must Support Learning
This is one of those BODOR moments (Bleedin’ Obvious Description of Reality) but too many assessment systems don’t support learning or at least not as effectively as they could. Problems include:
- Not differentiating the profound elements of the curriculum (big ideas and key concepts) and the journey towards them from the peripheral;
- Requiring aggregated data, in terms of levels and grades, too often, leaving limited time to focus on the fine scale data that is far more useful in the interaction between teacher and learner;
- Treating all subjects the same despite there being differences in the “grain size” of data that subjects find most useful.
Recently both our English and Mathematics Departments analysed their GCSE Mock Examinations. English collected four whole-question marks whereas Mathematics preferred the paper to be analysed in much smaller chunks of twenty-five to thirty different question marks. These differences between subjects’ assessment methods are real.
Principle 2 – Data Collected Must be Analysed & Acted On to Close the Learning Gap
The fine scale assessment data collected in the classroom may take many different forms. Depending on the subject it may be at a question level, key objective level or using a taxonomy. As usual I’ll give a shout out here for the SOLO Taxonomy (Biggs & Collis 1982) as this taxonomy places the big ideas and key concepts of the curriculum as the goal and maps the journey towards them.
Schools are now awash with data but it serves little purpose unless it is analysed and acted on. At this time of year, I wonder how many schools would rename themselves Intervention High as senior and middle leaders, armed with the latest data, apply thumbscrews to targeted cohorts of students to squeeze out the last drops of progress? If we focus on the collection of data and tracking at a whole school level it should be no surprise that this is the level at which numerous interventions occur.
Summative Assessment: Of Learning or For Grading?
Teachers and leaders, when asked about the purpose of summative assessment, too often tend to talk in terms of determining a grade or level for a student. This is the product of an assessment system, driven predominantly by accountability rather than learning, which required easy to analyse information. An aggregation of learning into a number or letter was the answer.
Separating the assessment of learning from the assessment for grading is an interesting mental step.
Assessment of learning focuses on the extent to which students are acquiring greater knowledge, defined as factual, conceptual, procedural and metacognitive. It requires a curriculum to be determined and defined in terms of success criteria, learning intentions and the pre-planned associated assessments which help to exemplify the standards expected, to teachers and students. The assessments provide an outcome based upon a disaggregated view of learning, that is, the individual facts, ideas, concepts, skills and competencies students have acquired. Teachers may now analyse the assessments to determine what students didn’t know or couldn’t do and reteach to these gaps at a whole class, group or individual level. This is hugely important for teachers and students’ learning, possibly accessible to parents, but far too detailed for middle or senior leaders to use.
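To make the idea of a disaggregated view concrete, here is a minimal sketch of how objective-level marks might be sorted into whole class, group or individual reteaching. Everything in it is an assumption for illustration: the pupil names, the objectives, the 1/0 marking and the cut-offs (half the class or more means whole class, two or more pupils means a group) are hypothetical, not a recommended standard.

```python
# Hypothetical objective-level marks: 1 = secure, 0 = not yet secure.
marks = {
    "Asha":  {"fractions": 1, "ratio": 0, "algebra": 0},
    "Ben":   {"fractions": 1, "ratio": 0, "algebra": 1},
    "Carys": {"fractions": 1, "ratio": 0, "algebra": 0},
    "Dev":   {"fractions": 0, "ratio": 1, "algebra": 1},
    "Ela":   {"fractions": 1, "ratio": 0, "algebra": 1},
}


def reteach_plan(marks):
    """Sort each objective into whole class, group or individual reteaching."""
    pupils = list(marks)
    objectives = sorted({obj for row in marks.values() for obj in row})
    plan = {"whole class": {}, "group": {}, "individual": {}}
    for obj in objectives:
        # Pupils who have not yet secured this objective.
        gaps = [p for p in pupils if marks[p].get(obj, 0) == 0]
        if not gaps:
            continue
        if len(gaps) / len(pupils) >= 0.5:   # assumed cut-off: half the class or more
            level = "whole class"
        elif len(gaps) >= 2:                 # assumed cut-off: a small group
            level = "group"
        else:
            level = "individual"
        plan[level][obj] = gaps
    return plan


plan = reteach_plan(marks)
for level, objectives in plan.items():
    print(level, objectives)
```

Note that no overall grade is calculated anywhere; the output is a list of gaps and who has them, which is the information a teacher can act on.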
Assessment for grading requires an aggregation, pulling together of the individual facts, ideas, concepts, skills and competencies students have acquired, into an overall score, grade or previously a level. It would seem to make sense for this score or grade to link to a nationally recognised standard rather than be idiosyncratic to a school or even a group of schools. These tend to be hugely important to middle and senior leaders, more understandable to parents but of more limited use to teachers to improve students’ learning.
I’m wondering whether less assessment for grading and more assessment of learning is the new course we need to steer? This has massive implications for how we define tracking in our new assessment systems. Tracking becomes something which happens in the classroom by the teacher rather than in the Head’s Office by the senior leaders. This requires teachers to accept responsibility, leaders to trust and everyone to hold their nerve.
The Linear Progress Illusion
“Our evidence suggests that the assumptions of many pupil tracking systems and Ofsted inspectors are probably incorrect. The vast majority of pupils do not make linear progress between each Key Stage, let alone across all Key Stages. This means that identifying pupils as “on track” or “off target” based on assumptions of linear progress over multiple years is likely to be wrong.”
Acknowledgement: Measuring Pupil Progress Involved More than Taking a Straight Line

More children get to the ‘right’ place in the ‘wrong’ way, than get to the ‘right’ place in the ‘right’ way!
As the graph above suggests, there is an illusion of linear progress which isn’t right at an individual student level. Wiliam (2014) explains this in terms of the limits of reliability of the tests at the end of each key stage, the different composition of tests (reading and writing at Key Stage 2 is not the same as English at GCSE, though it may represent a reasonable proxy measure) and the fact that children and their situations change over time. All these affect a child’s rate of progress.
Principle 3 – School Assessment Systems must Promote Aspiration and Hard Work
I’ve spent most of my professional life setting and responding to linear progression targets but is there a better way? One alternative, proposed by Wiliam (2014), would be to set targets within the upper progress range, as every year there are many children who make greater than expected progress, instead of setting targets at the average rate of progress. A child entering Key Stage 2 with 2b would be set a target of 100+, and those entering secondary school at “secondary ready” would be targeted at A*-B. Remember, these are targets for children and teachers to aspire towards, not targets to be held accountable against.
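As a rough illustration of the difference between an average target and an upper range target, the sketch below takes the outcomes of earlier pupils with the same starting point and reads the target range off the top quarter of that distribution. The scores are invented and the choice of the upper quartile as the lower bound is my assumption; Wiliam (2014) proposes the principle, not this particular calculation.

```python
from statistics import median, quantiles

# Hypothetical scaled scores achieved by earlier pupils with the same starting point.
past_outcomes = [92, 95, 97, 98, 100, 101, 103, 104, 106, 110]


def aspirational_range(outcomes):
    """Return (lower, upper) bounds drawn from the top quarter of past outcomes."""
    upper_quartile = quantiles(outcomes, n=4)[2]  # third cut point = 75th percentile
    return upper_quartile, max(outcomes)


average_target = median(past_outcomes)      # the traditional "expected progress" target
low, high = aspirational_range(past_outcomes)

print(f"Average-based target: {average_target}")
print(f"Aspirational target range: {low} to {high}")
```

The range sits above the average by construction, which fits a target to aspire towards rather than one to be held accountable against.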
This may be particularly powerful for children with low prior attainment as their future attainment is statistically the most unpredictable. It could also have significant implications for disadvantaged children as they may disproportionately end up in low ability groups or have a restricted curriculum, lower targets and consequently lower expectations.
With a target range, set at the upper end of what is achievable, there is the potential to create a coherent growth mindset culture that encourages students to work hard, on strategies that will be productive in increasing learning and improving their performance.
Principle 4 – Data Collection must be Meaningful & Manageable
Given the emphasis I’ve placed on shifting away from aggregated data, it needs to be said that collecting fine scale assessment data for every child, in each of her/his subjects, at a school level is probably unmanageable. Equally, for schools to continue to collect aggregated data multiple times per year is not meaningful. It will undermine the fine scale assessment work in the classroom and create excessive, unnecessary workload.
The current frequency of collecting grades and levels, by schools, has spawned an unreliable and pretty meaningless progress illusion based on sub-dividing grades (a, b, c or +/-). These sub-grades do not have any reliability but merely feed the myth of progress into our over-inflated accountability system.
Whilst current grades have benefits as terminal examinations or SATs approach, their use early in a key stage or course is undermined by students’ lack of coverage of the curriculum or syllabus. Alternatively, using current performance grades is based on the assumption that a student will continue to perform at their current level, with obvious implications in subjects which become conceptually more difficult over time. We may be moving towards assessment systems which focus on fine scale data, addressing gaps or misunderstanding in knowledge at a classroom level for years, before moving to reporting grades towards the end of a course. Is this a bad thing? Will my or your nerve hold? What will parents think?
I’m probably left with as many questions as I have answers. Part of the challenge going forward will be using these principles as guiding forces rather than reverting to less useful but more familiar assessment and data collection processes.
If I added another principle it would be that assessment must support learning across the end of a year, key stage and phase. The latter is of particular interest to the Trust.
Other posts which may be of interest:
Assessment: We’re Thinking Before Acting
Assessment: Teaching’s Hidden Gem
The #5MinAchievementPlan by @LeadingLearner & @TeacherToolkit
References
Assessment Principles (Department for Education)
Education Datalab (2015) Measuring Pupil Progress Involved More than Taking a Straight Line
Tidd, M. (2014) 7 Questions you should ask about any new ‘post-levels’ assessment scheme
Wiliam, D (2014) Redesigning Schooling – 8: Principled Assessment Design. SSAT (The Schools Network) Ltd
Reblogged this on BlogBJMock – Y Byd a'r Betws and commented:
Assessing without levels
Reblogged this on Kesgrave High School and commented:
A really interesting read on life after levels as always by Stephen Tierney Executive Headteacher – let me know what you think!
Great read. Thanks for your clarity and vision as always.
Is this the answer? Four levels of attainment based on Solo Taxonomy surely leaves us with little better than A, B, C and D or 1, 2, 3 and 4. The conceptual difficulty of the task will therefore drive the level of attainment.
Assessments will need to be carefully designed so that they cover the full range of abilities that need to be tested.
What will we then do with the numbers / grades?
Some statistical instrument that will yield a decimal or a grade with sub-levels?
Or are we going for a criteria driven system that celebrates mastery at prescribed levels (remember NPRA unit accreditation)?
Given that we have a numerical / grading system at GCSE, why not use these descriptors, albeit in a diluted form, and show progression towards these summative targets?
What about those who will not achieve a higher grade at GCSE; well there used to be a system called CSE and ‘O’ level.
In summary I feel we should move cautiously. But remember that we have actually developed, for once, a system that the end users (or at least one group of them), the parents, are comfortable with and also quite adept at using. They may be suspicious of our motives if we move to a ‘diluted’ form of reporting and evolve away from OFSTED in its current form at the same time.
Just at the time when the parents feel that we are accountable in the same way as they are at work, we move to a position that they may perceive as being clouded in educational jargon and a little less decipherable.
Reblogged this on MisterE Teacher and commented:
I, like many, continue to be ‘baffled’ by this new life. Found this post from earlier in the year very interesting, informative and useful. Many thanks @leadinglearner