When Is a Trend Not a Trend? (2nd attempt)

Most of us think we can spot a trend in school data when we see one, but increasingly I’m not so sure. The problem is not that trends don’t exist; some schools will genuinely be improving and others declining. But at the moment, most people in education accept that schools have blips in their data for almost completely random reasons, while assuming that looking at data over several years gets past this problem. The reaction to the Cramlington Learning Village Ofsted report (Outstanding to Special Measures) is a good example.

The suggestion here is that if the data gets worse for four years in a row then that is a genuine trend, caused by something that the school is doing wrong.

But here are two alternative thoughts:

(1) Each year, there may be things outside a school’s control which can have a random, on-going effect on 5A*-CEM. These might include changes in key staff, changes in relative reputation affecting parental school choices, demographic changes in the catchment, etc. Sometimes these will push results up and sometimes they will bring them down.

About once a year I give myself a little VBA project. I am totally crap at writing code but I enjoy the challenge (and the immediate, if often entirely unhelpful, feedback when it throws errors!). This year, I have taken a spreadsheet with 26 identical ‘schools’, each with a 56% 5A*-CEM score, and applied an algorithm that randomly allocates a small change to 5A*-CEM each year. The changes are drawn from a normal distribution with an SD of 2.5 percentage points. This means that 0% is the most likely change, with lots of small changes of 1–2% and an occasional bigger one.
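The spreadsheet itself is VBA, but the same idea can be sketched in a few lines of Python. This is my own minimal version, not the spreadsheet’s actual code; the constants match the post (26 schools, 56% start, SD 2.5) but the structure and names are mine:

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

START = 56.0    # starting 5A*-CEM score, % (the national average here)
SD = 2.5        # SD of the random annual change, percentage points
SCHOOLS = 26
YEARS = 10

def simulate_school():
    """Random walk: each year, add a normally distributed change."""
    scores = [START]
    for _ in range(YEARS):
        change = random.gauss(0, SD)
        # keep the score inside the 0-100% range
        scores.append(max(0.0, min(100.0, scores[-1] + change)))
    return scores

trajectories = [simulate_school() for _ in range(SCHOOLS)]

# after ten years of pure chance, the extremes look like 'trends'
worst = min(trajectories, key=lambda t: t[-1])
best = max(trajectories, key=lambda t: t[-1])
print("worst:", [round(x, 1) for x in worst])
print("best: ", [round(x, 1) for x in best])
```

Run it a few times with different seeds and the lowest and highest finishers routinely look like steadily declining or improving schools, even though every ‘school’ is identical.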

This red and green graph is the outcome.

[Graph: ten years of simulated 5A*-CEM data, with the red and green schools highlighted]

Over ten years, the red school’s data falls pretty steadily from the national average of 56% 5A*-CEM; what do you think Ofsted would make of this? And the green? Well, it’s not as dramatic as the improvement at Huntington School, but I think most headteachers would give their right arm for this data. If you want to have a go yourself, here is the Excel spreadsheet.

[Screenshot: the whole spreadsheet]

You’ll need to enable macros when you open it, then click the RESET button, and then click the ADD 1 YEAR… button.

I posted this a few weeks back and Jack Marwood very kindly responded and made some suggestions. I think my first spreadsheet still illustrates my original point but I’ve now done another based on his suggestion.

Jack’s point is, in some ways, the opposite of my original idea. If you assume that absolutely nothing changes about the quality of teaching and leadership, and absolutely nothing changes about the external factors either, then from one year to the next you would get the same 5A*-CEM results, wouldn’t you? No! Because it’s not the same children – cohorts are independent.

Take a big bag of mints – 56% black and 44% white. Reach in and grab a handful, then grab another handful; these are independent samples. That’s what cohorts are like if the population they come from doesn’t change. What does that do to 5A*-CEM results?

My updated illustration starts with 2000 children, 56% of whom get 5A*-CEM. Each year, a cohort of 200 is selected at random to be that year’s Y11. Over ten years the results jump about but almost always revert to the mean. But what if you look at the kind of period we tend to consider in education – like four years?
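Again, a Python sketch of this second model may help. The population, cohort size and pass rate come from the post; the way I count a four-year run, and everything else, is my own assumption about what would count as a ‘trend’:

```python
import random

random.seed(7)  # fixed seed for reproducibility

COHORT = 200   # pupils in each year's Y11
YEARS = 10
SCHOOLS = 26

# 2000 children; 56% of them would achieve 5A*-CEM (1 = pass, 0 = not)
population = [1] * 1120 + [0] * 880

def yearly_result():
    """Each year's cohort is a fresh random sample from the population."""
    cohort = random.sample(population, COHORT)
    return 100.0 * sum(cohort) / COHORT

results = [[yearly_result() for _ in range(YEARS)] for _ in range(SCHOOLS)]

def has_four_year_run(series):
    """True if any four consecutive results rise (or fall) every year."""
    for i in range(len(series) - 3):
        w = series[i:i + 4]
        if all(a < b for a, b in zip(w, w[1:])) or \
           all(a > b for a, b in zip(w, w[1:])):
            return True
    return False

trending = sum(has_four_year_run(s) for s in results)
print(f"{trending} of {SCHOOLS} identical 'schools' show a four-year run")
```

Nothing about any ‘school’ ever changes here; any run of rising or falling results is produced entirely by which 200 children happened to be drawn.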

[Graph: the second simulation over four years]

So, would you want to be the school with the red data, or the green?

Again, feel free to access the spreadsheet here – same instructions as for the other one. You don’t get strong ‘trends’ every time but then there are only 26 ‘schools’ – I’m not suggesting that every school’s data trends are random effects.

And of course, this is still incredibly simplistic. Whatever you think of Hattie’s Visible Learning, most teachers would probably agree with his assessment that peer influences are important (Effect size = 0.53); this would produce bigger variation between cohorts and stronger ‘trends’ (as ‘bad’ cohorts dragged each other down and ‘good’ ones pushed everyone up) but wouldn’t make them more common. More subtly, a particularly good or bad year might affect expectations, and therefore outcomes, in the following years. That would make ‘trends’ more likely. The different models on my two spreadsheets could both be operating. Who knows what else?
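For what it’s worth, the two models are easy to combine in a sketch: let the underlying rate drift a little each year (model 1) and then draw each cohort independently at that year’s rate (model 2). The drift SD here is a number I picked for illustration, not anything from either spreadsheet:

```python
import random

random.seed(3)  # fixed seed for reproducibility

YEARS = 10
COHORT = 200
DRIFT_SD = 0.025  # assumed SD of the annual drift in the underlying rate

def combined_model(start_rate=0.56):
    """Underlying rate drifts each year (model 1) AND each cohort of
    pupils is an independent random draw at that rate (model 2)."""
    rate = start_rate
    results = []
    for _ in range(YEARS):
        rate = min(1.0, max(0.0, rate + random.gauss(0, DRIFT_SD)))
        passes = sum(random.random() < rate for _ in range(COHORT))
        results.append(100.0 * passes / COHORT)
    return results

series = combined_model()
print([round(r, 1) for r in series])
```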

If you take a large number of schools, in some RCT or school-effectiveness research project, then these sorts of random ‘trends’ and other data problems can be coped with; but if you are trying to judge an individual school on its data, even over several years…

And trying to judge the effectiveness of individual teachers presents the same problems.

Best wishes