It may be only the second week of class, but I have a stack of 45 tests to grade. The students had to answer 10 of 20 questions that generally could be answered in 3 to 5 sentences. Now I have to grade them. How can I possibly do this fairly? What I want is for the points I give a test to reflect what the student knows about the material. What could get in the way?
A lot of problems could arise, as I was reminded at the workshop on grading that I went to right before classes began. Beth Fisher had some wise answers: have clear questions; have a grading rubric; make sure your test matches what you have taught. But in a way what most impressed me were the colleagues in attendance who were struggling with fair grading. This came out most strongly in the courses that were so large they had different teaching assistants grading different students. One young professor who was determined to improve mentioned that the teaching assistants had inadvertently both graded the same students' tests. Sounds fine, but the problem was that they both wrote their scores on the top of the page, somehow not noticing the other’s score. And the grades each gave were quite divergent, according to the professor. I wonder what kind of pandemonium in the class this led to.
Arbitrariness in the grade is horrifying to the students, for it might affect their whole career. I suppose to us faculty it is our dirty little secret, something few are as open to addressing as my forthright colleague. What can we do? Here are a few thoughts.
1. Have students put their student ID or course-assigned number on the test, not their name. It has been very well documented that once we know our students, we give better marks to the ones who are generally talkative in class or have a record of good performance. We see things in their answers that are not there. Conversely, we are more critical of those we think less highly of. If we don’t know who they are, we will grade the questions more objectively. I don’t know many in the class at this early point, but I still had to make myself go back over the answers of the few I know to make sure I wasn’t being too stingy or generous.
2. Grade in clear-cut categories as much as possible. My students have 10 questions, each worth 10 points, with the whole test being worth about 6.7% of the course grade. Instead of grading each question on the scale of 1 to 10, I generally use only 3 numbers: 0, 5, and 10. They get a 0 if nothing is correct about their answer, and a 5 if the answer has some correct information but also some wrong information, or is incomplete. If they are entirely correct and complete, they get 10 points. I’m not a total stickler, so many of them get 10 for most of their questions. A very few situations will give a student a 3 or a 7. This kind of grading helps reduce bias because the judgements are easier.
3. Have the same person grade all of the test, or all of a given section of the test. If this is not done, then there will be great inconsistencies, no matter how carefully the other parts are attended to. Some people get together after a test to grade together, each person taking a question or a page. This will not always be possible, though, in the large classes we have today. Lab notebooks in particular were mentioned as needing multiple graders. What to do?
4. If there must be multiple graders, standardize them on every assignment. Make a few copies of a few papers and have all the graders grade them. Compare scores, discuss, try it with a new set of papers and repeat until the graders are very uniform. Does this sound like too much trouble? Just think for a moment how important fair grades are for the students.
5. Have a clear rubric and a key. A rubric is just a list of things the assignment calls for and the points assigned to each. If it is too detailed, it will make things harder. The more separate sections you have, the better. For my test, the rubric is quite simple. For the Wikipedia articles the students write, the rubric will be more complicated. Below is my rubric for the first thing the students do on Wikipedia, which is to evaluate articles already posted. You can see that most of the points are for completion of the category.
|Grading rubric: 70 points in all, 14 points per organism, 5 organisms|
|For each organism:|
|5 points: What are the strengths of this entry? What have you learned that is most interesting?|
|5 points: Name 3 general categories in the outline that are missing and could be included. Explain why for each.|
|4 points: Look at the talk page. Comment on the details here, including the ranking and importance of the article.|
|Full points will be given to entries in each category that are thorough, exhibit careful thinking, and tie to the material of the course. Your writing should be intelligible without going back to the original Wikipedia page.|
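The arithmetic in this rubric can be written down as a small Python sketch. The point values come from the rubric itself; the short category names and the function names are labels invented here for illustration, not anything the course actually uses.

```python
# Points per category for one organism, taken from the rubric.
# The category names are hypothetical shorthand for the three rubric items.
CATEGORY_POINTS = {
    "strengths": 5,           # strengths of the entry, most interesting thing learned
    "missing_categories": 5,  # three missing outline categories, with reasons
    "talk_page": 4,           # comments on the talk page, ranking, importance
}
N_ORGANISMS = 5


def max_points():
    """Return the maximum total: per-organism points times number of organisms."""
    return sum(CATEGORY_POINTS.values()) * N_ORGANISMS


def score_submission(awarded):
    """Total a submission.

    `awarded` is a list with one dict per organism, mapping each category
    name to the points given. A missing category counts as 0.
    """
    if len(awarded) != N_ORGANISMS:
        raise ValueError(f"expected {N_ORGANISMS} organisms, got {len(awarded)}")
    total = 0
    for organism in awarded:
        for category, cap in CATEGORY_POINTS.items():
            points = organism.get(category, 0)
            if not 0 <= points <= cap:
                raise ValueError(f"{category}: {points} is outside 0..{cap}")
            total += points
    return total
```

Checking each category against its cap is one way to catch a slip of the pen before it reaches the gradebook.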
6. Grade a given section or question all in one sitting. We change how we grade according to mood. Just look at the decisions from an even more important arena: our justice system. Ed Yong reports on a study of parole granting which finds that judges who have eaten recently or are at the beginning of a session are more lenient, and the effect is large. I purposely did not continue grading after the 5K I ran yesterday when I was feeling extremely mellow.
7. Grade the test or project question by question, piece by piece. You will be more consistent if you grade all of question 5 before grading any other question. Likewise, with a lab report, grade all of a certain section before grading any other section.
8. Mix up the order of the papers. If you are grading question by question, you can easily shuffle the papers a bit when you finish a question. That way each student will get the benefit and cost of position. You may be differently lenient at the beginning when you haven’t seen all the possible answers. You may be desperate for a break towards the end.
9. Be aware of inadvertent bias and try to avoid it. All of these things assume you want to be a fair grader and are trying hard. They address things inherent to human nature. Following these tips (and I’m sure there are other good ones) will simply make the learning process and its evaluation a more accurate reflection of what a given student is demonstrating on a given assignment.
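For anyone who keeps scores in a spreadsheet or script rather than shuffling a physical stack, tips 1, 7, and 8 can be combined into a grading order. The sketch below is only an illustration: the function name, the `seed` parameter, and the example IDs are assumptions made here. It takes anonymous paper IDs, goes question by question, and reshuffles the papers before each question.

```python
import random


def grading_order(paper_ids, n_questions, seed=None):
    """Return a list of (question, paper_id) pairs in the order to grade them.

    All papers are graded for question 1 before any paper is graded for
    question 2, and the papers are reshuffled before each question so that
    no student is always first or always last.
    """
    rng = random.Random(seed)  # a seed makes the order reproducible
    order = []
    for question in range(1, n_questions + 1):
        stack = list(paper_ids)  # copy so the caller's list is untouched
        rng.shuffle(stack)
        order.extend((question, paper_id) for paper_id in stack)
    return order


# Example: three anonymous papers, two questions.
for question, paper_id in grading_order(["A17", "B03", "C42"], 2, seed=1):
    print(f"question {question}: paper {paper_id}")
```

Using IDs instead of names keeps the grader from knowing whose answer is on the page, and the per-question reshuffle spreads the benefit and cost of position across students.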