Back in September 2014, when the Trust formed, we started a big conversation about what assessment would look like in our post-levels world. The conversation just kept getting bigger as inter-related elements – teaching, learning, data collection, tracking, monitoring, accountability, reporting to parents and school improvement – were all brought to bear on our discussion and decisions.
The conversation took nearly a year; only then did we start building the processes, systems and professional development needed to implement it. We front-loaded the professional development and have remained focussed on our Data & Feedback Informed Teaching and Learning (DAFITAL) system over the past four-plus years.
Below are some ideas we established at the beginning, to guide us, along with some learning along the way.
Assessment Varies Depending on the Inferences You Want to Make
Too often people start with the data available rather than the conclusion(s) they wish to reach. Trying to find out what pupils do and don’t know is very different from looking to ascribe a grade or standard to their overall work. To have any real meaning the latter often needs to be referenced against a far larger cohort of pupils; at key points during a child’s education this will be at a national level.
We don’t collect centrally aggregated subject data (grades/standards) until the end of Year 5 in our primary academies or end of Year 10 at St. Mary’s. We use what we call a current grade approach; pupils sit a full season’s examination paper or statutory assessment tests – or get no marks for elements of the assessment they haven’t sat – with the examination’s mark scheme and grade boundaries used to produce a grade or mark.
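The current grade approach described above can be sketched in a few lines. This is a minimal illustration, not our actual system: the grade boundaries, paper names and marks below are all invented. The core idea is simply that raw marks are totalled across the papers in an examination series, any paper a pupil hasn't sat scores zero, and the total is read against the boundaries.

```python
# Hypothetical grade boundaries: minimum total mark for each grade,
# listed from highest grade to lowest. Real boundaries come from the
# examination's own mark scheme.
GRADE_BOUNDARIES = [(9, 180), (8, 160), (7, 140), (6, 120),
                    (5, 100), (4, 80), (3, 60), (2, 40), (1, 20)]

def current_grade(marks_by_paper, papers_in_series):
    """Return the grade implied by marks earned so far; unsat papers score 0."""
    total = sum(marks_by_paper.get(paper, 0) for paper in papers_in_series)
    for grade, boundary in GRADE_BOUNDARIES:
        if total >= boundary:
            return grade
    return 0  # below the lowest boundary: ungraded

# A pupil who has sat two of three papers: 55 + 48 + 0 = 103, grade 5 here.
print(current_grade({"paper_1": 55, "paper_2": 48},
                    ["paper_1", "paper_2", "paper_3"]))
```

The point of scoring unsat elements as zero is that the grade is always an honest "as of today" figure against the full examination, rather than an extrapolation.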
We’ve also started using GL Progress tests in Years 7-9 and may well implement them in Years 3 and 4 as a way of keeping a check on the progress pupils are making. We are trying to answer the questions “Is our curriculum sufficiently challenging? Are we delivering it effectively?” by looking at whether our pupils are making the same, more or less progress than their peers nationally. It’s a proxy assessment measure, as other elements could equally affect the outcomes.
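The "same, more or less progress than peers nationally" comparison can be illustrated with standardised age scores, which are scaled so that the national average is 100. Under that scaling, a rough sketch of the idea (pupil names and scores below are invented) is that a stable score across two sittings means progress in step with the national cohort:

```python
# Standardised age scores (SAS) are scaled so the national average is 100.
# Comparing a pupil's SAS across two sittings is one rough way of asking
# whether they are keeping pace with their national peers.
def progress_vs_national(earlier_sas, later_sas):
    """Positive: faster than national peers; negative: slower; zero: in step."""
    return later_sas - earlier_sas

# Invented illustrative cohort: (name, Year 7 SAS, Year 8 SAS).
cohort = [("pupil_a", 104, 109), ("pupil_b", 98, 97), ("pupil_c", 112, 105)]
for name, y7_sas, y8_sas in cohort:
    print(name, progress_vs_national(y7_sas, y8_sas))
```

As the text notes, this is only a proxy: how seriously pupils take the test and how much of the tested curriculum they have been taught can move these scores as much as genuine progress does.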
Be Far Less Certain
One of the challenges of using these standardised assessments is that schools tend to take different approaches to the relative importance of the test. Pupils tend to attain more highly when they perceive a test to be important than when they don’t. Pupils are left in no doubt about the importance of Year 6 SATs, GCSE and A-level examinations, but the perceived importance of standardised assessments varies much more from school to school.
A second difficulty relates to the curriculum covered by the standardised assessment. For example, in Key Stage 3 we don’t have an agreed teaching order for the curriculum and so it is perfectly possible for a group of pupils from one school to be unable to answer a number of questions because they haven’t covered that aspect of the curriculum rather than failing to understand it. This can have quite a negative impact on pupils’ standard age scores.
Nearly all teacher-produced assessments are limited with respect to their reliability. This can be due to the construction of the questions or to which parts of the taught curriculum are actually tested; only so much of the domain can be tested at any one time. Add in the transitory nature of pupils’ answers: next week or month they may well answer far more questions correctly, or maybe fewer. An assessment is associated with a moment in time.
Aggregated Data Is Primarily for Leaders Not Learners
For leaders to maintain their pretence of monitoring teaching, learning, attainment and progress they need data. The data needs to be in a form that can be easily assimilated for further analysis; cue the need for teachers to enter large volumes of pretty unreliable data as an aggregated grade/number/letter/combination of them all. As data is aggregated – rather than left disaggregated – a classroom teacher and the pupils lose important details about which aspects of the curriculum pupils do or don’t know. Most of our assessment data remains at a disaggregated level.
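A small, invented illustration of the information loss: averaging topic scores into one number produces something easily assimilated, but hides exactly which topic is weak. Topic names and scores here are made up for the sketch.

```python
# Disaggregated record: one score per curriculum topic (invented data).
topic_scores = {"fractions": 0.90, "algebra": 0.85, "ratio": 0.35, "geometry": 0.80}

# Aggregated view: one easily assimilated number for leaders...
aggregate = sum(topic_scores.values()) / len(topic_scores)
print(aggregate)

# ...disaggregated view: the detail a teacher and pupil can actually act on.
weak_topics = [topic for topic, score in topic_scores.items() if score < 0.5]
print(weak_topics)  # only "ratio" falls below the threshold
```

The respectable-looking average of roughly 0.73 says nothing about ratio being the problem; only the disaggregated view does.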
When we started over four years ago I didn’t realise how teachers would use the assessment information to help their own professional development. What am I teaching well and not so well? This assessment data isn’t reliable enough to be definitive but it helps point out aspects of teaching which may be worth thinking about a bit more.
Assessment Grain Size Varies between Subjects and Ages
When discussions involve school leaders responsible for children and young people from early years to post-16, the discussions are rich. Assessment grain size – the smallest piece of assessment data that is of value within a subject or age group – was part of the revelatory discursive process. I was used to subject leaders telling me that a particularly detailed and centrally formulated assessment policy would not work in their subject; they were never able to adequately articulate why. The reason is grain size.
Mathematics tends towards a very small grain size when looking to analyse what pupils do and don’t know; can and can’t do. Art is very different as the grain size tends to be much larger and more holistic. English at a secondary level is different again; a larger assessment grain size than Maths but not as big as Art. However, English in Early Years has a very small grain size as individual letters and letter sounds need to be learnt.
Lower the Stakes
Teachers won’t be open to examining areas for their own professional development within a high-stakes, punitive accountability system. Given the limited reliability of our assessments, it would be dangerous to make cliff-edged substantive decisions. Graded lesson observations are an example where we decided the reliability of the assessment was so limited that we were unable to draw any worthwhile conclusions. We stopped lesson observations a couple of years ago.
When you start unpicking elements, the house of cards starts to tumble down. This year we called a halt to the already limited performance pay system we had. We’re now challenged with building a different concept of what school improvement looks like. School improvement is more than just accountability; the latter drives too much data collection in schools.
Knowledgeable Subject Leaders are the Principal Leaders
Subject leaders need the knowledge and skills to construct the curriculum in terms of the sequential development of knowledge around key concepts and ideas. They also need to be able to develop their team in appropriate assessment practices to assess the learning. Arguments about assessment are often rooted in subject leaders’ deeply held beliefs about their subject. Assessment is part of an iterative learning process.
The reasons for using low-stakes assessment practices – to support retrieval and to secure learning in terms of retention and recall from long-term memory – are now better understood. It is an interesting development given the quality and quantity of evidence behind it.
Nothing we do is perfect; we just need to be aware of the limitations of our different approaches. Time spent on assessment, producing data and tracking progress is a trade-off; we can’t use the time for other things. In the future, we may need to give up some of our assessment practices to develop even more effective ways of helping our pupils learn and our teachers teach.
With thanks to Professors Becky Allen, Robert Coe and Dylan Wiliam who have all influenced my thinking and our approach; the best of what is above will almost certainly need attributing to one or more of them.