Linking ITT and workforce data: a step in the right direction

I had the great pleasure of meeting Becky Allen back at the beginning of the year for a bit of a discussion about the work Education Datalab were doing on matching teacher training records to the School Workforce Census. I suspect a pretty monumental amount of effort has gone into nailing down the final details since then but two of the three linked reports are now published. I suggest you start here to either have a quick look at the key findings, or to access the full reports. So far I’ve just read the NCTL one.

It is immediately apparent that this is something the DfE ought to have done years ago. There is a lot of talk of evidence-based policy-making but any kind of genuine commitment to such a thing would have seen this sort of data-analysis set up prior to the seismic changes to ITT that have been implemented since 2010. Hey-ho; better late than never.

In theory this methodology could be used for a much longer-term project that might start generating some really useful data on the impact of various approaches to training teachers. It is easy to pick up this work and think it is limited to evaluating structural issues about ITT routes but if you consider the richness of a data set that can pretty much link every teacher in the maintained sector back to their ITT experiences, there is almost unlimited potential. Inevitably, for ITT providers, there is a pretty steady (and self-selecting) drift out of contact over the years after qualification. This work potentially solves that problem for research on any aspect of ‘what works’ in ITT. That’s something for the future; what of the findings here?

It would be tremendously easy for a lot of people in ITE to say “I told you so” in regard to the Teach First retention figures. Actually, I think the useful questions are more subtle than that but figures first. Using the lower-bound numbers, traditional HEI-led routes have about 60% of those initially recruited working as teachers in the maintained sector in their third year after qualifying. SCITTs are higher at 70% (but these would have been the early adopters). School Direct hasn’t been running long enough to have figures. Teach First is under 50%.

[Figure: Datalab retention graph]

However, there are several things to remember about Teach First. Their qualifying year involves teaching potentially difficult classes, mostly in schools with more challenging behaviour, with variable levels of in-school/in-class support, whereas university-led trainee teachers are supernumerary, on lower timetables, and working in a wider range of schools, and rarely those in a category or Grade 3. Teach First are also possibly more likely to continue to work in more challenging schools although I think that is an assumption I would want to see data on because certainly some participants move from TF schools to schools at the opposite end of the socio-economic spectrum.

There are also a few things to remember about HEI-led courses. Financial survival, and the need to make up the numbers across all the shortage subjects, probably mean that in these subjects the HEI-led cohort has a longer tail than for any other route. SCITTs may have some of these pressures too but, particularly in the years for this report, are likely to have had the opportunity to be more selective. I suspect it’s the other way round for subjects like PE, English and history where the larger scale of HEIs generates a larger pool of applicants compared to SCITTs. Since shortage subjects make up the bulk of an HEI cohort, you would expect to have a lower qualification rate, and also some marginal grade 2s where support (or lack of it) in their employing school might determine success in their NQT year. As pointed out right at the beginning, the report can’t tell us anything about what would happen to the same trainee teachers if they were trained via a different route.

Teach First recruitment has been astonishingly successful. Having seen the marketing machine in action, and with access to funding that very few providers can match, that is perhaps not completely surprising but it has been terrific nonetheless. This means they probably have the strongest cohort of all at the start of training. For me, the critical question to ask is: if Teach First training was more like the HEI-led route, or a SCITT, would there be hundreds more high quality teachers still in the classroom? There is no way to tell from this report but, anecdotally, the Teach First participants I have worked with would all have had excellent outcomes on the HEI-led course or School Direct programmes I mainly work on. What I don’t know is whether they would have gone into teacher training at all.

If Teach First is mainly putting people who would never have tried teaching into struggling schools with teacher recruitment problems, to do a decent job for two or three years, then that is probably a justifiable use of public money; if they are putting potentially high quality, long-career teachers through training in a way that knocks an additional 10-20% off retention, that doesn’t look so good. I suppose there might be other benefits; I’m unconvinced by these but make up your own mind. Sam Freedman sets out the most positive case here.

What about the other findings?

  • Three regions of England – North East, North West and South West – appear to have large numbers of new qualified teachers who do not join a state-sector school immediately after achieving QTS.
    • This is pretty good evidence that the NCTL need to sort out the Teacher Supply Model, but that was already very apparent. We are waiting on tenterhooks for the announcement on allocation methodology (so presumably they are desperately trying to invent something at the moment; let’s hope they don’t make another almighty cock-up!)
  • Those studying on undergraduate with QTS courses have low initial retention rates in the profession, though we cannot know whether this results from subsequent choices made by the individual or recruitment decisions made by schools.
    • They do, but the data also shows they catch up later. I suspect that if you have a B.Ed., sooner or later it becomes the best option for a professional career, whereas PGCEs have their UG degree as an alternative option (depending on subject a bit).
  • Teach First has very high two year retention rates, but thereafter their retention is poorer than other graduate routes.
    • I’m hoping, perhaps in vain, that the move away from QTS might link teacher development across from ITT into the first year(s) of post-qualification employment for other routes and get a bit of the 2-year TF programme effect into other routes.
  • Ethnic minority teacher trainees have very low retention rates.
    • I suspect because they are much more likely to have limited experience of the UK education system if educated abroad, and are also more likely to be EAL, both of which, in my experience, can affect classroom relationships. It would be enormously useful to have data that separates UK and non-UK educated teachers and drill down a bit. In my part of the world, UK-educated BME applicants are thin on the ground but I don’t notice anything that would lower their retention rate.
  • Individuals who train part-time or who are older have much poorer retention rates, which may simply reflect other family commitments that interfere with continuous employment records.
    • UoS doesn’t do part-time. I have a hunch that retention might actually be better for older trainee teachers on our Science PGCE – they do mostly need a proper job to pay mortgages whereas younger trainees often don’t have that commitment. On the other hand, whilst they are nearly all tremendous people to work with, developing into a good teacher is partly about developing habits that are effective in the classroom and I think changing habits gets harder as you get older. It’s also a very fast-moving environment when you are a novice and again I think adapting to this gets harder with age. They are quite often particularly good at developing relationships with teenagers though, so it’s swings and roundabouts, maybe.

So those are my first thoughts. I think we have some way to go to get stable and effective initial teacher education that is structurally sound and therefore with the potential for continuous improvement. NCTL have tried quite hard to break what we had; now we need to take the best of the many pieces and put them back together again, hopefully to end up with something better than before. High quality evidence is a key part of this process, as are people in high places that are prepared to pay attention to it. This report is a very important step in the right direction.



Diet (Effect size = 0.12)

This post is part of a series looking at the influences on attainment described in Hattie J. (2009) Visible Learning: a synthesis of more than 800 meta-analyses relating to achievement. Abingdon: Routledge. The interpretation of Hattie’s work is problematical because the meaning of the different influences on achievement isn’t always clear. Further context here.

Working my way through the influences I’ve skipped a few that didn’t look terribly interesting, had low effect size, or had nothing to do with what happens in schools but I have had a little look at Diet (Effect size = 0.12) because I am surprised this is so low. At the college where I used to work, which was in a typical deprived coastal-urban setting, we had plenty of students who hadn’t been terribly successful at GCSE and were doing Level 1 and 2 courses to try to improve their qualifications. Amongst this group it wasn’t unusual to find that a student’s breakfast had been 1.5 litres of Coke and a Monster, which I always found pretty stunning. I think I would rather have White Russian on my Cornflakes than have to face drinking that lot in the morning!

I’ve tended to go along with the general opinion that Coke and Red Bull is likely to have a significant effect on learning performance and have this vague memory of various studies having shown that a balanced diet and a proper low GI breakfast leads to significantly better concentration during the school day. That certainly seems to be the opinion of leading nutritionists, or at least of successful chefs appointed as government advisers. However, I’m not sure that proper scientists would agree that the caffeine was a major problem. On the other hand, blood glucose levels and/or particular additives or nutrients might be a different matter.

I work quite closely with Professor Marcus Grace who, as well as tutoring on the Secondary Science PGCE at Southampton, is one of the significant figures involved in the LifeLab project. I really ought to get round to asking him about this – there is so much research expertise in the School of Education Studies that I need to work on tapping into! When I get round to doing that I’ll update this post; meanwhile, what evidence is Hattie basing his d=0.12 on?

There is one meta-analysis, Kavale and Forness (1983). I can only access the abstract but it’s clear that despite the missing clause in Hattie’s summary, the meaning that I had assumed he intended does match this meta-analysis. Equally it is clear that this is very specifically looking at ADHD and not children without this diagnosis. Essentially this paper states that the studies analysed do not provide evidence to support the earlier hypothesis that dietary changes could have a positive effect on ADHD symptoms. I’m guessing that the outcome measure was not academic achievement, but more likely some behavioural measure, which reminds me again that Hattie seems rather blasé about what his meta-analyses are measuring.

A quick trawl for more recent work suggests to me that things may have moved on, with this Schab and Trinh (2004) meta-analysis dealing only with double-blind, placebo-controlled trials getting d=0.21-0.28. Again there is this issue of whether or not Hattie’s 0.40 average effect size is the correct bar for comparison. With double-blind, placebo-controlled trials, it shouldn’t be. The methodology ought to make the inherent effect of the intervention zero and these authors are clear that their meta-analysis does show that artificial food colours affect ADHD. Having said that, when the trials were separated into groups according to who was reporting on the effects, teachers couldn’t detect any difference in behaviour but parents could. That’s not parents’ wishful thinking because of the double-blind; it might have rather more to do with the difficulty kids have in shifting their teachers’ expectations. Stevens et al. (2011) is a review of the literature, including both the meta-analyses mentioned above. They reach a similar conclusion but pick up the suggestion in Schab and Trinh that the effect might be restricted to only a proportion of children with an ADHD diagnosis (10%-33%). However the Bateman et al. (2004) study on the Isle of Wight involving Southampton academics and a further study (and a smaller one from the USA cited on p.286 in Stevens et al.) suggest quite strongly that artificial food colourings affect all children (well – young ones at least).

Since writing this post I’ve come across this Harvard Mental Health Letter reviewing the relationship between diet and ADHD. It includes the findings from the Schab and Trinh (2004) meta-analysis but also some other research. The conclusions are similar – that some artificial food colourings do seem to have an effect on at least a proportion of children, which probably means that reducing exposure is a good thing. It also suggests that increasing Omega-3 essential fatty acids and micronutrients might just help too. A final point is that the research on the effect of sugar on behaviour suggests there is no link (but of course the link with obesity and Type II diabetes is only too obvious). But the strongest message is that the usual recommendations for a healthy diet apply to all children.

Anyway, this isn’t something for day-to-day teaching. There are all sorts of issues around ADHD (like whether it is a useful diagnosis, whether drug treatments are a good idea, and so on) and even if all children are susceptible to artificial food colourings it’s possibly something teachers might helpfully be aware of but it isn’t going to affect what we do in our classrooms. I again find myself wishing that Visible Learning was narrower in its breadth and deeper in its depth but it’s been an interesting evening educating myself. Next, I’m going to jump to Time on Task (Effect size = 0.38) because I want to look at this in relation to a paper by Professor Daniel Muijs (another big hitter from the Southampton School of Education Studies) that suggests Time on Task is one of the most important influences on achievement.

Index to series on Hattie’s Visible Learning

This post is just a quick reference index to my series of posts looking at the influences on attainment described in Hattie (2009) Visible Learning: a synthesis of more than 800 meta-analyses relating to achievement. Abingdon: Routledge.

The interpretation of Hattie’s work is problematical because the meaning of the different influences on achievement isn’t always clear. Further context here. There are also some significant issues with Hattie’s methodology but, despite these shortcomings, Visible Learning remains the boldest attempt to draw together all areas of education research.

The list below shows my posts in the order they appear in Visible Learning. I have only looked at some influences, skipping those that I thought to be self-explanatory, outside the influence of teachers, or inconsequential.


Piagetian programs (d=1.28)

Self-reported grades (d=1.44)

Concentration, persistence, and engagement (d=0.48)

Diet (d=0.12)


The intention is to have the index presented in two forms:

The second will be taken directly from the list of influences in rank order of effect size in Appendix B of Visible Learning pp.297-300 (but I haven’t copied that out yet – waiting for an evening when my brain is too fried for anything less mechanical!)

Concentration, persistence and engagement (Effect size = 0.48)

This post is part of a series looking at the influences on attainment described in Hattie J. (2009) Visible Learning: a synthesis of more than 800 meta-analyses relating to achievement. Abingdon: Routledge. The interpretation of Hattie’s work is problematical because the meaning of the different influences on achievement isn’t always clear. Further context here.

Following on from the big effect sizes for some of the influences listed under the heading of Background, like Piagetian programs and Self-reported grades, there are a series of low to medium effect sizes under the heading Attitudes and Dispositions. Mostly I am ignoring these because correlations between achievement and things like personality traits and self-concept don’t give much for a teacher to work on. All the recent focus on this, stemming from Duckworth’s Grit, and Dweck’s Mindset, only matters to teachers if there is some good evidence that we can shift children along these scales and that’s definitely not what most of these categories are about.

However, I thought it worth a closer look at Concentration, persistence and engagement (Effect size = 0.48) because this sounds like it is really very close to that Grit and Mindset work. Now Grit is a personality trait – psychology rather than education. But Mindset is definitely in the education realm with a proprietary programme and lots of related initiatives. The research on attempts to shift children’s mindset looks quite promising (this is a good summary) but my impression is that quite a bit of it is not truly independent. That hasn’t prevented its enthusiastic, and quite understandable, adoption by some schools, though, so it will be interesting to see the outcome of the EEF funded project being run in Portsmouth, in Spring 2015.

Given that the research base for Mindset dates from 1990, you might think it featured in this section on Concentration, persistence and engagement but I’m not aware of any meta-analysis so for that reason it wouldn’t feature in Visible Learning. However, it seems so close to the title of this section that, within the kind of broad-brush approach Visible Learning takes, the effect size of 0.48 might tell us something about the likely impact of becoming a growth mindset school.

Unfortunately, the meta-analyses referenced by Hattie don’t really tell us very much about the potential effect of increasing concentration, persistence, or engagement. Kumar (1991) looked at the correlation between different teaching approaches (in science) and student engagement. Now student engagement might be a good thing but, as Hattie points out in his commentary “we should not make the mistake…of thinking that because students look engaged…they are…achieving”. And Kumar has nothing to say about achievement in this meta-analysis. Also, although there was quite a big range of correlations (0.35 to 0.73) across the different teaching approaches, the probability of these differences being random is too high to claim statistical significance at a reasonable level – the perennial problem of typical sample sizes in education research. Datta and Narayanan (1989) were looking at the relationship between concentration and performance, but in work settings; maybe that’s transferable, but maybe not. Equally, Feltz and Landers (1983) were looking at mental visualisation of motor tasks so, apart from subjects like PE, dance, and possibly D&T I cannot see the relevance to teaching. Finally Cooper and Dorr (1995) looked at whether there was a difference between ethnic groups, which again doesn’t tell us anything about how we might improve achievement, particularly since there was little difference found. There is one more meta-analysis in the synthesis although it doesn’t feature in Hattie’s commentary; this is Mikolashek (2004). This was a meta-analysis of factors affecting the resilience – I think actually normal academic success as a proxy for resilience – of at-risk students. The abstract seems to suggest that internal and family factors are significant but, again, there is no measurement of the effect of anything a teacher might do to enhance these.
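To make the sample-size point concrete, here is a rough sketch of a standard Fisher z-test for the difference between two independent correlations. The two correlations are the ends of the range quoted above; the sample sizes are hypothetical, and I am treating the figures as if each came from a single sample, which is a simplification of what a meta-analysis actually does.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Fisher z-test for the difference between two independent correlations.

    Returns the z statistic and the two-sided p-value.
    """
    # Fisher's transformation makes the sampling distribution roughly normal.
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    # Standard error of the difference between the transformed values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# The correlations are the ends of the quoted range (0.35 to 0.73).
# The sample sizes are hypothetical, chosen only to show the effect of n.
for n in (20, 200):
    z, p = compare_correlations(0.73, n, 0.35, n)
    print(f"n = {n} per group: z = {z:.2f}, p = {p:.4f}")
```

With the small hypothetical samples the gap between 0.35 and 0.73 does not reach the conventional 0.05 level; with the larger ones it comfortably does. Same correlations, different conclusion, purely because of sample size.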

Looking at the overall picture here I think Hattie has pushed the envelope too far. One of the criticisms of meta-analysis is the danger of amalgamating studies that were actually looking at different things e.g. oral feedback, written feedback, peer feedback. I think it’s fine to lump all feedback together if measured by roughly the same outcome, provided this limitation is made clear. The next stage might be to unpick whether all forms of feedback are equally effective but unless it’s clear that one form is something like 0.20, another 0.60, and the third 1.00 (average Effect Size = 0.60) during the initial analysis, knowing that feedback is worth a more detailed look seems helpful. However, for this influence I think the ‘comparison of apples and oranges’ charge is justified criticism. The five meta-analyses are all looking at different things, in different contexts, and with several different outcome measures. I cannot see the value in averaging the effect sizes and am starting to wonder how much more of this I’m going to find as I continue to work through the book. Diet interventions (Effect size = 0.12) is next – which dietary changes, I wonder?
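As a small illustration of how an average can hide this kind of spread, the sketch below uses the three illustrative effect sizes from the paragraph above. Which form of feedback gets which number is my arbitrary assignment, purely for the example.

```python
from statistics import mean


def summarise(effect_sizes):
    """Return the mean, minimum, maximum and range of a set of effect sizes."""
    values = list(effect_sizes.values())
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


# Hypothetical pairing of the illustrative numbers with forms of feedback.
feedback = {
    "oral feedback": 0.20,
    "written feedback": 0.60,
    "peer feedback": 1.00,
}

summary = summarise(feedback)
print(f"mean = {summary['mean']:.2f}, range = {summary['range']:.2f}")
```

The headline mean alone would say nothing about the fact that one form is five times as effective as another, which is why reporting the spread alongside the average matters.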





Self-reported grades (Effect Size = a whopping 1.44)

This post is part of a series looking at the influences on attainment described in Hattie (2009) Visible Learning: a synthesis of more than 800 meta-analyses relating to achievement. Abingdon: Routledge. The interpretation of Hattie’s work is problematical because the meaning of the different influences on achievement isn’t always clear. Further context here.

Following my post on Piagetian programs (effect size = 1.28) comes the top-ranked influence of Self-reported grades (effect size = 1.44). Until now I’ve been assuming that if you take one group of students who say they are working at A-grade standard, and another who say they are working at C grade standard, then you find that, sure enough, the ones self-reporting A grades are achieving more highly. Hattie implies that if self-reported grades are very accurate then there is less need for testing but my thinking on this is that there is good evidence that low-stakes testing is an effective method for improving recall, and replacing high-stakes testing with self-reported grades isn’t going to happen any time soon.

What I have been wondering is how much of this self-reporting of grades is predictive; if the two groups of students are actually working at the same current level but one group declare themselves A grade students and the other group declare themselves C grade, maybe that becomes a self-fulfilling prophecy. In a moment of self-delusion I’ve even used this as a piece of evidence supporting the importance of setting challenging learning objectives – hopefully my other evidence excuses this slip.

So, back to Hattie’s evidence then. I’m afraid the only way to report this is to go through the individual meta-analyses: Kuncel, Credé and Thomas (2005) were looking at the validity of self-reported Grade Point Averages. It’s not totally clear to me quite how GPAs work in the USA but I think this would be kind of the same as asking graduates in the UK what their final percentage mark was for their degree. The point of this meta-analysis is to try to establish the validity of researchers asking for GPA rather than getting it from a transcript of some sort so I don’t think this has any relevance to teachers – it’s just about whether people remember accurately and whether or not they lie.

Falchikov and Goldfinch (2000) were looking at the validity of peer marking compared to teacher marking, at undergraduate level: they found a high level of correlation. This study also reports the findings from Falchikov and Boud (1989), which are similar. Mabe and West (1982) found a low correlation between self-evaluation and other measures of performance. The range of studies they looked at was really broad, including academic, clerical, and athletic performance. It’s a psychology study so of course most subjects were, again, university undergraduates. Finally Ross (1998) found pretty variable levels of self-assessment in those learning a second language. There is a vague theme running through these studies that novices are worse at self-assessment than more experienced learners in a particular area.

I think the only useful thing that comes out of this for teachers is that, with capable students, it may be possible to do quite a bit of peer-marking and self-assessment, to ease the workload of teacher marking, if what you are after is marks for your markbook (none of this evidence says anything about any other aspect of feedback). Perhaps the very limited relevance of this influence is why it isn’t mentioned anywhere in Visible Learning for Teachers but it does seem odd that it gets Rank 1 and then is completely ignored.

The rest of the list of influences brought by the student doesn’t seem terribly interesting. Either these are things that teachers have no control over – like pre-term birth weight – or they would be much more interesting if looked at in terms of the effect on trying to change something. For example, Concentration/persistence/engagement (effect size = 0.48) appears important but all the recent focus on this, stemming from Duckworth’s Grit, and Dweck’s Mindset work, only matters to teachers if there is some good evidence that we can shift children along these scales. I’ll have a little look at this one in case there is something interesting lurking in there but otherwise it might be time to move on to school effects, starting with Acceleration (effect size = 0.88) and the behaviour management effects, in particular what the difference is between Classroom management (effect size = 0.52), Classroom cohesion (effect size = 0.53), Classroom behavioural (effect size = 0.88), and Decreasing disruptive behaviour (effect size = 0.34), and what the research says about Peer influences (effect size = 0.53).

Piagetian programs: effect size = 1.28

Hattie states that the one meta-analysis for this influence found a very high correlation between Piagetian stage and achievement (more for maths, 0.73, than reading, 0.40). Quite what is meant by this isn’t clear. I’m guessing that some sort of test was done to determine the Piagetian stage and the correlation is between this and achievement. Piaget’s original theory suggests that the stages are age-related but later work has criticised this part of the theory – he did base his theories a lot on the development of just his own children – so presumably the research behind this meta-analysis was based on the idea that children made the breakthrough to a new stage at different ages, and that those who reached stages earlier might achieve more highly. If I remember correctly, the CASE and CAME programmes (and Let’s Think! for primary) were designed to accelerate progress through the Piagetian stages (from the concrete to the formal-operational stage in the CASE and CAME programmes) and there is some evidence that all these programmes have a significant effect including a long-lasting influence on achievement not only in science but spilling over into English, and several years later at that. Maybe these would count as Piagetian programmes.

So that’s my starting point but what does the Jordan and Brownlee (1981) meta-analysis actually deal with? Well, at the moment all I can find is the abstract:

The relationship between Piagetian and school achievement tests was examined through a meta-analysis of correlational data between tests in these domains. Highlighted is the extent to which performance on Piagetian tasks was related to achievement in these areas. The average age for the subjects used in the analysis was 88 months, the average IQ was 107. Mathematics and reading tests were administered. Averaged correlations indicated that Piagetian tests account for approximately 29% of variance in mathematics achievement and 16% of variance in reading achievement. Piagetian tests were more highly correlated with achievement than with intelligence tests. One implication might be the use of Piagetian tests as a diagnostic aid for children experiencing difficulties in mathematics or reading.

I have made a few enquiries and will update this post if I get hold of the full text but it seems quite close to my assumption that it’s about a correlation between tests of Piagetian stages and achievement. I don’t think that’s of any direct use since it doesn’t tell us anything about how we accelerate progression through the stages. On the other hand, if we know that there is a good correlation between Piagetian stage and achievement, and if it transpires that it is possible to change the former, and that this does have a causal effect on the latter, then we would perhaps be cooking on gas.
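For anyone wanting to relate the abstract’s "variance accounted for" figures to correlations and effect sizes, here is a minimal sketch. It takes the square root of the variance explained to get r, then applies the common textbook conversion d = 2r / sqrt(1 - r^2). Whether this is the conversion actually used in Visible Learning is an assumption on my part; the function names are mine.

```python
import math


def r_from_variance_explained(r_squared):
    """Correlation implied by a proportion of variance explained."""
    return math.sqrt(r_squared)


def d_from_r(r):
    """Common textbook conversion from a correlation r to an effect size d."""
    return 2.0 * r / math.sqrt(1.0 - r ** 2)


# Proportions of variance taken from the abstract quoted above.
for subject, r_squared in (("mathematics", 0.29), ("reading", 0.16)):
    r = r_from_variance_explained(r_squared)
    d = d_from_r(r)
    print(f"{subject}: r = {r:.2f}, d = {d:.2f}")
```

Under that assumed conversion the mathematics figure comes out at about 1.28, which at least suggests where the headline effect size might have come from. It also underlines that the number describes a correlation between two tests and not the impact of any intervention.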

Where do CASE, CAME, and Let’s Think! come into this? Well, these Cognitive Acceleration (CA) programmes cannot be relevant to this influence, as classified by Hattie, because the first paper on CASE was published in 1990 and the meta-analysis used by Hattie for this influence labelled Piagetian programs dates from 1981. However, as well as the evidence for the effectiveness of these CA programmes from those involved in developing them, they were included in a meta-analysis on thinking skills (Higgins et al., 2005), which Hattie has made use of. Where do you think this is found? Not under Piagetian programs; not under Metacognitive strategies; no, I don’t think you’ll guess – under Creativity programs (Effect Size = 0.65). I would instinctively have thought Creativity programs was something in the Ken Robinson mould. Instead Hattie is picking up a collection of specific curriculum programmes based around clearly stated things to be taught, and particular ways to do the teaching, that emphasise the explicit development of thinking strategies. And buried in here are some very high effect sizes.

I actually taught CASE (without proper training, I’m afraid) for a year, whilst doing a maternity cover about ten years ago. I thought it was pretty good at the time but if the effect sizes hold up (the EEF have a Let’s Think Secondary Science effectiveness trial underway that will report in 2016) then we should probably be thinking about making this a pretty integral part of science and maths teaching. If anyone is looking for access to the programmes then it’s organised by Let’s Think.

Probably the final point on all this is that I’ve started this post with a title that includes Piaget, whose theory on cognitive psychology is a primary source of justification for the whole constructivist teaching movement. And I’ve ended up talking about a programme directly drawing on his theory that appears to have an effect size at least comparable to Direct Instruction. Should the new-traditionalists be worried? No more than is justified. CASE has at least as much in common with Direct Instruction as it does with Problem-based Learning, and although it includes significant amounts of peer discussion it is definitely teacher-led. I continue to argue my case that teachers should be in charge of learning, but that we shouldn’t throw the quality learning baby out with the constructivist bath-water.

Next, Self-reported grades (Effect Size = a whopping 1.44)

Looking More Closely at Visible Learning

A somewhat careless comment on Andrew Smith’s blog (which he responded to with a clear demonstration that he knew more about Hattie’s work than I do) has led me back to the original Visible Learning: a synthesis of over 800 meta-analyses relating to achievement. There are a whole bunch of issues with Hattie’s methodology, which are probably fairly well-known by now e.g. David Weston’s ResearchEd 2013 talk; Learning Spy’s post which is related to Ollie Orange’s. I’ve tried to summarise these for my own clarity. If you read the introduction to Visible Learning, or churn your way through Visible Learning for Teachers, it’s pretty clear that Hattie is conscious of at least some of the limitations of his work (maybe not some of the statistical issues, though). In some ways Andrew is bucking the trend in education at the moment – a few years ago Hattie was definitely the most prominent researcher in the field of education but his star has undoubtedly waned. For a while there, he really was The Messiah, but that wasn’t his fault, more a consequence of being responsible for some important evidence at just the moment that the deep-water swell of evidence-based practice felt bottom and started to build. At first surfers flocked to, and eulogised, Hattie’s miraculous surf break but when it turned out to not be as smooth, glassy and regular as they hoped, and other surf spots were discovered, it almost inevitably fell from favour somewhat.

As Hattie himself points out, any attempt to just look at the headline effect sizes and conclude “this works, that doesn’t” is not only misinterpreting his work, but missing the point. His approach is to take the huge mass of evidence and use it to draw out themes that really do tell us something about how to teach more effectively, but always to appreciate that this must be in the context of our own teaching, our own students, and our own settings.

However, I think there is another barrier to making effective use of Hattie’s work. I think I’ve been aware of it for a while but the recent brief exchange with Andrew Smith has highlighted it for me. Interpretation of Hattie’s work is problematical because the meaning of the different influences on achievement isn’t clear. I first encountered Hattie’s work through the Head of History at the college where I worked. He was a fantastic teacher and had been significantly influenced by Geoff Petty’s book Evidence Based Teaching which in turn was heavily influenced by Visible Learning. I think Petty made a pretty decent stab at interpreting Hattie’s work but I also think he was influenced by some of his own ideas about effective teaching (Teaching Today pre-dates Visible Learning and I think shows that he didn’t take on board all the evidence from Visible Learning when he read it) and there are points where he freely admits to basically taking an educated guess at what some of Hattie’s influences actually refer to.

So having gone on to read quite a lot online about Hattie’s work, and continuing to encounter the same issue, I keenly started out on Visible Learning for Teachers, and was enormously disappointed with it. I was expecting non-technical clarification and additional detail about the meta-analyses; instead it is an attempt to leave all the detail behind and draw some conclusions about the implications for teachers. A worthy aim, but a good couple of hundred pages longer than necessary; it reminded me of Jane Eyre!

It wasn’t long after reading this that the methodological issues with Visible Learning started to be spoken of more prominently and although I have continued to use the list of effect sizes as a kind of quick reference to support some ideas about effective teaching, I’ve more-or-less left it at that. So the video posted on Tom Sherrington’s blog over the summer blew me away somewhat – here was the clear, coherent message that was missing from Visible Learning for Teachers. Subsequently, and spurred on by my recent error, I’ve gone back to the original Visible Learning. I really see no reason why Hattie thought that teachers needed this interpreting; it’s not very technical and the introductory and concluding chapters draw the threads together at least as well as anything in the Teachers’ version. That fundamental issue still remains though, that for at least some of the influences, the meaning is hazy. On the other hand, the references are clear, and working at a university I am lucky enough to have unobstructed access to many of them.

It’s therefore time to do some reading, and sort out the nature of the influences that remain unclear to me. My plan is to take each influence in order from Visible Learning and do just enough to feel confident of the meaning. I’m hoping for most influences this will just involve reading the relevant page or two from Visible Learning (a lot are very clear) but for some I expect to need to go back to the most prominent original meta-analysis to see what it was actually about. Hattie starts with the section on Contributions from the Student: Background. Prior achievement (Effect Size = 0.67) is clear enough but Piagetian programs (Effect Size = 1.28) is not (I had assumed this was things like CASE and CAME, which have been shown to be very effective, so that shows how much I need to do this reading). I can’t make much sense of Hattie’s paragraph on this so here we go. I’ll let you know how I get on.