Monday, November 11, 2013

Don't Grade the Smart Person's Test First! (TFS part 3)

I am reading Thinking, Fast and Slow, by Daniel Kahneman. In this series I will summarize key parts of the book and supply some comments and reflections on the material.

Part I: Two Systems
Chapters 4-7

Priming is important.

Q: How do you make people believe in falsehoods?
A: Frequent repetition, because familiarity is not easily distinguished from truth.

Q: How do you write persuasive messages?
A: Make them legible, use simple language (this increases perceived credibility and intelligence!), make them memorable (e.g. verse, mnemonics), and cite sources with easy-to-pronounce names.

The more often something happens, the more normal it seems.

Confirmation bias, the halo effect, overconfidence, discounting missing evidence (the "what you see is all there is" bias), framing effects, and base rate neglect are all important psychological tendencies that people should be aware of.

My Thoughts:

These systematic biases are very important and prominent in many people's behavior. They are the bread and butter of intro psych, and becoming aware of them (and learning how to correct for them) was the most significant thing I got out of taking psychology as an undergrad.

The best anecdote in this section is Kahneman's discovery that he wasn't grading his students' exams fairly. He originally graded his exams in the conventional way -- one student's exam at a time, in order. Here's the excerpt:
"I began to suspect that my grading exhibited a halo effect, and that the first question I scored had a disproportionate effect on the overall grade. The mechanism was simple: if I had given a high score to the first essay, I gave the student the benefit of the doubt whenever I encountered a vague or ambiguous statement later on. This seemed reasonable. Surely a student who had done so well on the first essay would not make a foolish mistake on the second one! But there was a serious problem with my way of doing things. If a student had written two essays, one strong and one weak, I would end up with different final grades depending on which essay I read first. I had told the students that the two essays had equal weight, but that was not true: the first one had a much greater impact on the final grade than the second...."
So, he tried grading the exams one question at a time rather than one student at a time.
"Soon after switching to the new method, I made a disconcerting observation: my confidence in my grading was now much lower than it had been. The reason was that I frequently experienced a discomfort that was new to me. When I was disappointed with a student's second essay and went to the back page of the booklet to enter a poor grade, I occasionally discovered that I had given a top grade to the same student's first essay. I also noticed that I was tempted to reduce the discrepancy by changing the grade that I had not yet written down, and found it hard to follow the simple rule of never yielding to temptation. My grades for the essays of a single student often varied over a considerable range. The lack of coherence left me uncertain and frustrated...."
"The procedure I adopted to tame the halo effect conforms to a general principle: decorrelate error! ... To derive the most useful information from multiple sources of evidence, you should always try to make these sources independent of each other."
I am a huge proponent of grading by a rubric, in addition to grading one question at a time as much as possible, in order to avoid exactly this issue. Grading with a rubric increases consistency and fairness and decorrelates error. Sometimes teachers grade the smart person's test first and use it as a key rather than making their own key with a rubric. I disapprove of this (as tempting as it is to do it) -- it saves time, but as Kahneman points out, there is probably a big halo effect here.
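
A toy sketch in Python can make the order dependence concrete. It is an illustration only, not a model from the book: the `halo` strength, the `midpoint`, the essay scores, and the function names are all assumptions picked for the example. The idea is that the first essay graded sets an impression, and every later essay gets nudged up or down by that impression.

```python
def halo_total(scores_in_reading_order, halo=0.25, midpoint=5.0):
    """Total grade when later essays are nudged by the first essay read.

    Hypothetical toy model: `halo` and `midpoint` are made-up parameters.
    """
    first = scores_in_reading_order[0]
    total = first
    for score in scores_in_reading_order[1:]:
        # Benefit (or penalty) of the doubt, proportional to how far
        # the first grade sits above (or below) the midpoint.
        total += score + halo * (first - midpoint)
    return total


def independent_total(scores):
    """Total grade when each essay is scored on its own merits."""
    return sum(scores)


strong, weak = 9.0, 4.0  # assumed "deserved" scores out of 10

print("strong essay read first:", halo_total([strong, weak]))
print("weak essay read first:  ", halo_total([weak, strong]))
print("graded independently:   ", independent_total([strong, weak]))
```

In this sketch the same two essays earn a higher total when the strong one is read first, while the independent total is the same in either order. Grading one question at a time across all students is one way to approximate the independent case.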

Why don't more lecturers grade with a rubric? It takes much longer to grade this way, and it is much more tiring because you are actually evaluating all the questions and everyone's responses equally. It also makes assigning the final grades more difficult because the scores are not as obviously "separated" into nice groups. Many lecturers just assign grades based on where the obvious breaks in the scores are, without realizing they created those breaks themselves through biased grading of the tests of students they perceive as smart or dumb. That is the halo effect at work.

Creating the rubric also shows you, in real time, when discrepancies arise. As I grade a question, sometimes I adjust how "wrong" I think a particular answer is. Then I have to go back and adjust all the previous exams with that answer to bring the scores in line with my new judgement. If I weren't recording this, I might not catch these changes, unfairly punishing some students simply because of the order in which the exams were graded.
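
For anyone who keeps grades electronically, the bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not a description of any real grading tool: the `QuestionRubric` class, its method names, and the sample students and answer categories are all hypothetical. The design choice is to record which rubric category each answer falls into rather than a raw number, so that revising a deduction automatically applies to every exam already graded.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionRubric:
    """Rubric for one exam question (hypothetical helper for illustration)."""
    max_points: float
    deductions: dict = field(default_factory=dict)  # answer category -> points off
    graded: dict = field(default_factory=dict)      # student -> answer category

    def grade(self, student, category, deduction=None):
        """Record the category of a student's answer, defining it if new."""
        if deduction is not None:
            self.deductions[category] = deduction
        if category not in self.deductions:
            raise KeyError(f"no deduction defined for category {category!r}")
        self.graded[student] = category

    def revise(self, category, deduction):
        """Change how 'wrong' a category is; earlier exams follow automatically."""
        self.deductions[category] = deduction

    def score(self, student):
        """Score is always computed from the current rubric, never stored."""
        return max(0.0, self.max_points - self.deductions[self.graded[student]])


q1 = QuestionRubric(max_points=10)
q1.grade("student_a", "sign error", deduction=4)
q1.grade("student_b", "correct", deduction=0)
q1.grade("student_c", "sign error")

# Partway through the stack, the grader decides a sign error is less serious.
q1.revise("sign error", 2)

scores = {student: q1.score(student) for student in q1.graded}
print(scores)
```

Because scores are derived from the rubric at lookup time, `student_a` (graded before the revision) and `student_c` (graded after the category existed) end up with the same score for the same mistake, regardless of grading order.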

Grading from a rubric also almost completely eliminated from my classes a time-honored tradition at the UofC: point mongering. Being able to state exactly why an answer is right, or to what degree it is wrong, is huge. Students know you are doing your job, respect you, and accept the grade they earn. It also makes it much easier (for the students and the teacher) to identify actual grading errors.

Food for Thought:

1) When creating policy, should the government take advantage of the psychological biases and systematic errors people make? To what extent should people be educated about them and then left to their own devices, and to what extent should the government create rules (reducing costly decision making) and "nudge" them? The government is leaning toward more nudges.

2) The Obama campaign utilized academic social scientists more than ever. How are the biases highlighted in these chapters exploited in political campaigns?

3) How much information does it take for you to reverse your first impression of someone? When was the last time you realized your first impression of someone was not correct?
