Graded Lesson Observations: Defibrillation or a Stake through the Heart?

An observer enters your classroom. Is this person your HoD, the assistant head with responsibility for T&L, an Ofsted inspector, or a demon who has occupied a corpse and is coming to suck your blood? A fair number of commentators have recently suggested the latter and have been sharpening words, and presumably a variety of sticks, with a view to dispatching said vampires to the demon dimensions. Like Rupert Giles, Robert Coe from Durham University CEM (possibly a pseudonym for the Watchers Council) has been quietly dispensing the wisdom of the ancients, or rather the academics, guiding the Slayers in their quest. But is the graded lesson observation really the personification of evil, or does it have a soul worth saving?

Wilshaw’s Westminster Education Forum speech on 7th November 2013 included the line: “Which ivory towered academic, for example, recently suggested that lesson observation was a waste of time – Goodness me!” Does Wilshaw need to pay more attention to the ivory towered ones? Is his organisation trying to perform a task as fundamentally uncertain as measuring the combined momentum and position of a sub-atomic particle; is it engaged in a legitimate assessment technique but doing it in a slightly crap way; or is the Ofsted Christmas party actually a masquerade ball of orgiastic hedonism where innocent teachers are dragged to be ripped asunder in a feeding frenzy of unimaginable gore?

In ITT, observations are a big part of how we assess the progress of trainees. It doesn’t feel as though the judgements we make are unreliable; over the course of a number of observations, we would feel confident that an accurate picture of a trainee’s teaching was being drawn. Are we deluding ourselves when we reflect on this practice; are we even capable of reflection…

If you pick up Robert Coe’s blog entry on this you’ll see that he is linking to two pieces of research. The first is the massive (and massively well-funded – thanks Bill & Melinda) MET project. Now, I make no claims to either the academic clout of Robert Coe, or to expertise in this area, but reading the MET policy and practice brief, I can see where Coe’s figures are coming from, but not his conclusion that observations are unreliable to the point of worthlessness as a measure of teacher performance. The MET project seems to me to be making suggestions about how to improve the reliability of observations, not concluding that they are good only for a staking. Of course, like Wilshaw, anyone involved in a project called “Measures of Effective Teaching” may be somewhat biased towards the idea that it is actually possible to measure such a thing, and continued research funding may even depend on that outcome, but the MET project is looking at a range of ways to measure teacher effectiveness and I can’t see why, if they were looking at data that suggested observations were a waste of time, they wouldn’t say so and recommend a system based on other measurement methods.

Strong, Gargani & Hacifazlioğlu (2011) is the other piece of research. It’s behind a paywall but for good papers there’s often an academic somewhere who has breached their institution’s copyright rules and posted it somewhere helpful. In interpreting the results, it’s important to appreciate that of the three experiments, two involved judging teachers on the basis of two-minute clips of whole-class teaching (chosen to avoid any behavioural management incidents!). The third experiment did involve observations of videos of whole lessons, but using a complex observational protocol – the CLASS tool – that seems to weight student engagement and various other, dare I say it, constructivist ideals quite strongly. Coe is right to state that the ability of observers to pick good teachers in these experiments was in the same league as Buffy’s ability to pick good boyfriends, but he leaves out a crucial point which I think I’d better quote.

This analysis showed that a small subset of items produced scores that accurately identified teachers as either above or below average. All of these items were from the instructional domain. They included clearly expressing the lesson objective, integrating students’ prior knowledge, using opportunities to go beyond the current lesson, using more than one delivery mechanism or modality, using multiple examples, giving feedback about process, and asking how and why questions.

The final point made in the paper is that “This… has motivated us to undertake development of an observational measure that can predict teacher effectiveness.”

So I’m not sure that Coe has it right on this evidence. Yes, we all (ITT, Ofsted, and school leaders) need to recognise that sloppy observation procedure and training will lead to meaningless judgements. Yes, using graded observations for staff development may be a bit like burning witches to improve their chances at the last judgement. Yes, value-added data may be a better, or even the best, method for judging the effectiveness of a teacher and/or their teaching. But, in ITT where value-added data does not exist, I think my colleagues and I really ought to be bringing some of the academic clout of our Faculty to bear on using research like this to develop a model for lesson observation that delivers reliable outcomes. I’ll let you know how we get on, and give you a shout if we need any stake holders.

