
The Data Conclusion Confusion

In March 2013 Tom Sherrington wrote what I think of as one of his most iconic blogs, The Data Delusion.  He concluded, “On average it is a bit more complicated than that.”  Having just re-read Dylan Wiliam’s (2014) Principled Assessment Design, here’s my own take on a similar theme.

“Some conclusions are warranted on the basis of the results of the assessment, and others are not. The process of establishing which kinds of conclusions are warranted and which are not is called validation and it is, quite simply, the central concept in assessment.”

Wiliam, D (2014)

Photo Credit: Sam Chenennou via Flickr cc

The RAISE Data Conclusion Confusion

To kick off the data conclusion confusion, many secondary head teachers won’t need to look much further than their RAISEonline.  As Schools Week rightly informed us, this year’s RAISE is based on the data from the end of Key Stage 2 in 2010, when a number of primary schools boycotted the SATs.  Pupils were twice as likely to be awarded the top grade via teacher assessment as if they took the SATs.  The conclusion immediately reached is that teachers are far more generous in their awarding of higher levels and that is the problem.  However, another conclusion could be that schools under greatest scrutiny felt more compelled to take the SATs, that these are predominantly lower attaining schools in disadvantaged areas, and that the data is merely reflective of this.  Add in a few more conclusions of your own and you’re left with a RAISE document which may contain more green or blue than it should, linked to a test boycott five years ago rather than the quality of secondary education received.  Ofsted Central has released a clear and sensible statement on the matter, but whether it actually reaches the Ofsted local team inspecting a secondary school anywhere near you someday soon, goodness only knows.

“Our inspectors are used to dealing with statistical anomalies and cases where data might not be reliable, so we have every confidence that schools will not be disadvantaged over this issue. We have been discussing the issue with stakeholders and will shortly be issuing guidance to inspectors on this specific matter.”

You may want to check what percentage of your Year 11 students were given teacher assessments as opposed to SAT scores in 2010.

The Manager Data Conclusion Confusion

Photo Credit: Janneke Staaks via Flickr cc

The managerial dimension of many people’s role in schools has become increasingly data rich, driven or cursed, depending on your view of data.  Huge conclusions are sometimes being reached on limited evidence.  Take, for example, scrutiny of a teacher’s marking; sampling a set of books which are found to be unmarked leads to a conclusion that the teacher does not mark books and her/his competence needs to be questioned.  This is what Dylan Wiliam would refer to as construct under-representation.  Whilst the language is quite technical, it is essentially saying the assessment is too small and fails to assess all the things it should to reasonably conclude that the teacher’s competence needs to be questioned.  A lot of our assessment or monitoring of teachers is built upon sampling a very small percentage of lessons or books or plans, and so any conclusions reached need to be couched in very tentative terms.  The triangulation of all these different elements, plus the pupils’ performance in tests and examinations over time, needs to be taken into account.  Too many of us leaders have been guilty of this in the past; time to collectively turn over a new leaf.

By the way, if you sample a set of books and find no or inadequate marking, ask the teacher to bring you all the books s/he has where s/he thinks there is some marking which has really helped pupils learn.  If s/he turns up empty handed a pretty solid conclusion could be reached; the same if s/he turns up with bags full of really high quality feedback and learner responses focused on the big ideas or concepts in their subject.  The interesting one is where s/he turns up with lots of evidence of marking that appears to have very little impact on learning over time.  I’d conclude you’ve got a hard working, committed member of staff who needs some professional development.  Conclusions are all important.
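To put a rough number on how tentative a conclusion from a small book sample has to be, here is a minimal sketch using a Wilson score interval for a proportion.  The figures (six books sampled, none marked) are hypothetical, and the sketch assumes the books were picked at random from everything the teacher has, which a single class set rarely is.  It only speaks to the "too small" part of the problem, not to everything else a judgement about competence should cover.

```python
from math import sqrt


def wilson_interval(marked: int, sampled: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    p = marked / sampled
    denom = 1 + z**2 / sampled
    centre = (p + z**2 / (2 * sampled)) / denom
    half = z * sqrt(p * (1 - p) / sampled + z**2 / (4 * sampled**2)) / denom
    # Clamp to the valid range for a proportion.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical scrutiny: six books sampled, none found marked.
low, high = wilson_interval(marked=0, sampled=6)
print(f"Plausible share of marked books: {low:.2f} to {high:.2f}")
```

Under these assumptions, finding nothing marked in six books is still consistent with well over a third of the teacher's books being marked; the same result from sixty books would narrow that to a few percent.  A prompt to ask questions, not a verdict.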

The HMCI Data Conclusion Confusion

John Tomsett in a wonderful little tweet sent out the two following pieces of information soon after the release of HMCI’s Annual Report identifying the North/South divide.

JT Tweet - Deprivation

For a Chief Inspector who likes to deliver a blunt message I think SMW should have gone with, “The South need to Start Doing Worse”; in a World of comparable outcomes another conclusion from the report is the South do indeed need to do worse for the North to improve its outcomes.  The primary versus secondary school comments were interesting, and I wonder why, with the same national cohort, 78% of children achieve the L4+ Reading, Writing and Mathematics standard at the end of Key Stage 2 whereas 56% achieve the secondary benchmark of 5+ A*-C including English and Mathematics (5+A*-CEM).  Are 20% of children really going backwards in secondary schools, or are these two totally different systems for assessing a school’s effectiveness, between which comparable conclusions are so fraught with difficulty it is probably best not to make them?

There are various issues at play here, including a simple analysis that could prevent you looking deeper into the issue.  I’m convinced that part of Blackpool’s challenge is to effectively raise the attainment of disadvantaged white working class children and young people.  It’s the same problem many other areas of the country are struggling to resolve; it’s just that in Blackpool there is a far higher percentage of these children.  To improve we need to remove the focus on North/South and instead find schools which are being particularly successful with this sub-group.  What are they doing; is it transferable and scalable?  When you want to improve, the conclusions you reach using data are all important.

JT Tweet - Ofsted Report

Other people may suggest that part of the success London has experienced is down to the significantly higher levels of funding.  In the New Year, we will be hit by the Fairer Funding Data Confusion Conclusion with one London Mayor Hopeful already suggesting that the capital’s schools could lose £800 million per annum as their current high levels of funding are reduced towards the mean.  Not surprisingly I came up with another conclusion, namely, that if we want a World Class Education System and all schools to perform at the level of London’s schools then we need to move every other school’s funding level upwards.  How the money is used matters a lot, which is why it is difficult to find causal links between funding and educational outcomes.

The whole headline grabbing “less than 60% of the children attend good or outstanding schools” is based on Ofsted’s own data, in which some schools’ effectiveness assessment includes dealing with significant numbers of children and young people from disadvantaged backgrounds and with only low levels of funding, whereas other schools’ effectiveness is judged without these.  One set of assessments includes elements that are irrelevant to the other.  It’s possibly not surprising, in such a complex scenario, that you have a kind of construct-irrelevant variance creeping in; things included in the assessment that shouldn’t be if you want to compare like with like, namely the construct of school effectiveness against the Ofsted grade given.

The Teacher Recruitment Crisis Data Confusion Conclusion

Surely it’s not possible that we have more teachers in schools than we have ever had and a looming teacher shortage crisis?  Watch for the data to be bounced around over the next twelve months with claims and counter claims.  In reality it is complex, and unpicking the teacher shortage crisis cannot simply be done by looking at the current data.  The two conflicting viewpoints and pieces of data can be reconciled by realising that we will also have record numbers of children within our schools by the middle of the next decade and the additional number of teachers doesn’t sufficiently cover this increase.
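The arithmetic behind that reconciliation can be sketched very simply.  The figures below are invented purely for illustration (they are not drawn from any workforce or pupil census); the point is only that teacher numbers can hit a record high while the number of pupils per teacher still gets worse.

```python
def pupils_per_teacher(pupils: int, teachers: int) -> float:
    """Crude pupil-to-teacher ratio; ignores subject, phase and quality."""
    return pupils / teachers


# Hypothetical figures, for illustration only.
teachers_now, pupils_now = 1_000, 17_000
teachers_later, pupils_later = 1_040, 19_000  # more teachers, many more pupils

now = pupils_per_teacher(pupils_now, teachers_now)
later = pupils_per_teacher(pupils_later, teachers_later)

print(f"Teachers up by {teachers_later - teachers_now}")
print(f"Pupils per teacher: {now:.1f} now, {later:.1f} later")
```

Both headlines are "true" on these made-up numbers: there are more teachers than before, and each one is stretched across more pupils.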


Photo Credit: Tom Woodward via Flickr cc

The complexity is also partially due to what people mean by the construct teacher shortage.  Many school leaders, governors, parents and children would include within the construct that there were enough teachers of a high quality or with the right subject knowledge, training and expertise in their school.  Having more than enough primary, English, History and Physical Education teachers is of little comfort if you are trying to appoint a Mathematics, Science, Modern Foreign Language or Design Technology teacher.  When talking about a teacher shortage some people focus on data about overall numbers whilst others are focussed on quality and expertise.  I actually fear we may have both of these problems by the mid-term of this Parliament.

The Data Is What the Data Is

The data is not the problem; the data just is.  The problem is created by the firmly held, strident and rigid conclusions we reach from the data available, bordering almost on universal truths in our own minds.  Sometimes these conclusions aren’t actually supported by the data.  At other times there is other data, which we are unaware of or have conveniently chosen to ignore, that impacts on any potential conclusion.  The level of certainty we express is often too great; we need to be a bit more tentative with the conclusions we are reaching.  I often think data is better at asking questions than it is at providing answers.  Another note to self.


Wiliam, D. (2014) Redesigning Schooling - 8: Principled Assessment Design.  SSAT (The Schools Network) Ltd.



4 thoughts on “The Data Conclusion Confusion”

  1. Great post, as ever, Stephen, with lots of excellent points. I’d like to follow up your discussion about judging teaching by marking, and claims based on what might be assumed about the impact of written feedback to children.

    There has been a flurry of discussion about the effectiveness and the managerialism of marking of late, with posts from David Didau (http://www.learningspy.co.uk/leadership/the-sane-way-to-monitor-books/), Ross McGill (http://teachertoolkit.me/2015/11/25/worksampling/), Greg Ashman (https://gregashman.wordpress.com/2015/11/26/is-book-sampling-valid/) and Andy Day (https://meridianvale.wordpress.com/2015/11/25/if-venus-de-milo-did-feedback-what-reach-she-could-have-had/), amongst others.

    It seems to be a given that many people echo your sentiments that an effective teacher will be able to show you “some marking which has really helped pupils learn”. Equally, you worry that an ineffective teacher will have done a lot of “marking that appears to have very little impact on learning over time.” Learning is, as is often said, a very complex process. It would be very hard to be certain that effective written feedback is the sole reason for a child’s increased understanding ‘over time’. Correlation and causation are extremely hard to disentangle, as we all know.

    The simple question remains: Given the opportunity cost, is extensive written marking the best use of a busy teacher’s time? Can we really justify the claims made for marking, or is it simply an unquestioned orthodoxy? I’d suggest that the evidence for the effectiveness of marking is, as so often, mixed. As a profession, we need to be careful not to accept this preferred teaching style without question, as I explained here: http://icingonthecakeblog.weebly.com/blog/new-ofsted-orthodoxies-triple-marking.

    Marking is simply another – intensely time-consuming – tool in a teacher’s arsenal which may or may not aid a particular child’s growing understanding. But the children learn, not us, and I’d suggest that we need to be extremely careful not to judge teachers by the efforts of their pupils alone, particularly when we cannot be sure what exactly has helped children to learn, or not, as the case may be.

    I’d be interested to read your thoughts on this.

    Posted by jackmarwoodiotc | December 13, 2015, 10:50 am
    • Great points Jack and thanks for such an extensive comment. Like you I’ve read the marking posts you mentioned and have blogged about it the last few weeks; it’s very much a live issue for us and I want us to make the best decision possible.

      Posted by LeadingLearner | December 13, 2015, 12:34 pm
    • I am an active Director of Incyte International – a company that consistently, on a day to day basis, gets its hands dirty in supporting schools through the issues that have been identified here. Many of these schools are in the most challenging areas where there are large proportions of disadvantaged pupils.

      The unfair aspect of Ofsted is that much emphasis is given to historical data in RAISEonline rather than how effective learning is currently. However, it is a great document for looking at historical trends over time and at the outcome variations across groups of learners. Even in the worst schools there is always a proportion of learners that achieve very well – it just isn’t enough. Generally, the reason why some schools are seen to be failing is that particular groups of pupils underachieve much more than others. The importance of data (which is never perfect and should never be seen as such) is that it highlights where the school’s greatest challenges lie. Current data (that gathered by the school) is now taking more of a headline role in Ofsted inspections and therefore inspectors have to rely on the accuracy of this current data. Other than data, the inspectors are relying too heavily on scrutinising the pupils’ work in books to make judgements on progress. Scrutiny of books does, however, provide a good indicator of the variability of the quality of teaching and learning in any school, but, as quite rightly pointed out, it does not provide the answers that are needed – much more needs to be followed up and triangulated to ensure that happens. Unfortunately, many current inspectors are not experienced enough in inspection practice to be relied on to make the right decisions – but this is another story.

      One of the additional issues for secondary schools this year (not yet mentioned), particularly in deprived areas where there is a high proportion of white working class children, is the 2015 grade boundary changes, which meant that a good proportion of those pupils who would otherwise have just scraped a Grade C didn’t. The main group for concern here was the Level 4 borderline pupils arriving from primary who just missed out on the Grade C.

      Marking in pupils’ books is only a small part of the whole assessment process and should not be seen in isolation. The most effective assessment has always been in the classroom with the pupils themselves, incorporating high-quality discussions about learning. These discussions, at their best, involve pupils’ self and peer assessment as well as teacher assessment. This practice has a significant impact on the learning that takes place in each lesson and involves far fewer written comments in the pupils’ books. This releases time for the teachers to really think about ‘tomorrow’s’ lesson plan and how it can build on the quality of learning that took place in the last lesson. This type of practice is what really does have a long term impact on the progress made by the pupils and the levels of attainment achieved by the time they leave school. In our most successful schools this practice begins in the early ‘pre-school’ years and continues thereafter.

      However, until secondary schools really do take the pupils’ individual learning profiles on from when they leave primary school they will continue to fail many pupils in KS3. This is a crucial time for pupils as they go through emotional changes. If they are bored in school, or do not feel respected as part-owners of their own learning they soon turn off from school and then, in many cases, there is no turning back and they will be ‘left behind’.

      Let’s think about the learning process – who owns that process and how we can then fully involve in it all of the owners. We can then concentrate on showing full respect to each other. As we all know, all owners need to contribute effectively if the learner is to achieve well and have a fulfilled time in all of our educational environments.

      Posted by Malcolm Greenhalgh | December 14, 2015, 10:12 am


  1. Pingback: ORRsome blog posts December 2015 | high heels and high notes - December 31, 2015
