Self-reported grades (Effect Size = a whopping 1.44)

This post is part of a series looking at the influences on attainment described in Hattie (2009) Visible Learning: a synthesis of more than 800 meta-analyses relating to achievement. Abingdon: Routledge. Interpreting Hattie’s work is problematic because the meaning of the different influences on achievement isn’t always clear. Further context here.

Following my post on Piagetian programs (effect size = 1.28) comes the top-ranked influence of Self-reported grades (effect size = 1.44). Until now I’ve been assuming that if you take one group of students who say they are working at A-grade standard, and another who say they are working at C-grade standard, then you find that, sure enough, the ones self-reporting A grades are achieving more highly. Hattie implies that if self-reported grades are very accurate then there is less need for testing. My thinking on this is that there is good evidence that low-stakes testing is an effective method for improving recall, and that replacing high-stakes testing with self-reported grades isn’t going to happen any time soon.

What I have been wondering is how much of this self-reporting of grades is predictive; if the two groups of students are actually working at the same current level but one group declare themselves A grade students and the other group declare themselves C grade, maybe that becomes a self-fulfilling prophecy. In a moment of self-delusion I’ve even used this as a piece of evidence supporting the importance of setting challenging learning objectives – hopefully my other evidence excuses this slip.

So, back to Hattie’s evidence then. I’m afraid the only way to report this is to go through the individual meta-analyses: Kuncel, Credé and Thomas (2005) were looking at the validity of self-reported Grade Point Averages. It’s not totally clear to me quite how GPAs work in the USA, but I think this would be roughly the same as asking graduates in the UK what their final percentage mark was for their degree. The point of this meta-analysis is to try to establish the validity of researchers asking for GPA rather than getting it from a transcript of some sort, so I don’t think this has any relevance to teachers – it’s just about whether people remember accurately and whether or not they lie.

Falchikov and Goldfinch (2000) were looking at the validity of peer marking compared to teacher marking, at undergraduate level: they found a high level of correlation. This study also reports the findings from Falchikov and Boud (1989), which are similar. Mabe and West (1982) found a low correlation between self-evaluation and other measures of performance. The range of studies they looked at was really broad, including academic, clerical and athletic performance. It’s a psychology study, so of course most subjects were, again, university undergraduates. Finally, Ross (1998) found pretty variable levels of self-assessment accuracy in those learning a second language. There is a vague theme running through these studies that novices are worse at self-assessment than more experienced learners in a particular area.

I think the only useful thing that comes out of this for teachers is that, with capable students, it may be possible to do quite a bit of peer-marking and self-assessment, to ease the workload of teacher marking, if what you are after is marks for your markbook (none of this evidence says anything about any other aspect of feedback). Perhaps the very limited relevance of this influence is why it isn’t mentioned anywhere in Visible Learning for Teachers but it does seem odd that it gets Rank 1 and then is completely ignored.

The rest of the list of influences brought by the student doesn’t seem terribly interesting. Either these are things that teachers have no control over – like pre-term birth weight – or they would be much more interesting if looked at in terms of the effect on trying to change something. For example, Concentration/persistence/engagement (effect size = 0.48) appears important but all the recent focus on this, stemming from Duckworth’s Grit, and Dweck’s Mindset work, only matters to teachers if there is some good evidence that we can shift children along these scales. I’ll have a little look at this one in case there is something interesting lurking in there but otherwise it might be time to move on to school effects, starting with Acceleration (effect size = 0.88) and the behaviour management effects, in particular what the difference is between Classroom management (effect size = 0.52), Classroom cohesion (effect size = 0.53), Classroom behavioural (effect size = 0.88), and Decreasing disruptive behaviour (effect size = 0.34), and what the research says about Peer influences (effect size = 0.53).

5 thoughts on “Self-reported grades (Effect Size = a whopping 1.44)”

  1. Interesting analysis. Assessment and self-assessment are interesting areas. I experimented with removing grades or marks altogether and made it so students could only reflect on what they achieved. It worked very well after the initial “So what grade did I get?” question. What was interesting was the resulting peer reviews, where they were asking each other not how many marks they got but what others had put in their answers that they had missed.

    I have put the approach and method together in an article. You may be interested in reading it. Comments always welcome.

    Assessment Without Levels. Is it Possible?

  2. Thank you for an excellent analysis of self-reported grades. I always felt there was a lot of doubt about Hattie’s conclusions, and your post helps to give evidence that they are even more in doubt.
