CS150 Spring 2009 Midterm Grading

The Spring 2009 offering of CS150 was used as a testing ground to field-test and certify a novel approach to the grading of exams. As you may imagine, exam grading is a long and inexact process, and it most closely parallels an integer linear programming problem with tight constraints (positive individual scores, B-/C+ average, high correlation with other grades, etc.). This problem is well known to be complex, making solution via manual search intractable. At the same time, stochastic and heuristic methods are often unable to provide the correlated scores needed to model course grades. As a result, a simple random assignment is seldom satisfactory, and is usually replaced by failsafe methods such as the use of undergraduate readers, uniform scoring (C+), or weighted averaging of previously graded assignments.

Teaching staff at leading universities have long sought to replace traditional (and ineffective) methods with an efficient, highly random, yet objective and precise method of grading. We believe that such a method was finally discovered and published by Daniel Solove, a visionary of academic performance assessment. Since complete accuracy in grading is not usually attainable, a stochastic process is used to eliminate the effects of bias and grader error (see Figure 2). To approve the grading method for department-wide use, we conducted an experiment during the 3/31/09 CS150 midterm, testing the new grading methodology against traditional practices.

You will receive two grades when your midterms are returned: a "traditional" grade, which will be used as the ground truth, and an "experimental" grade obtained via the novel method. Preliminary results show a strong correlation (over 0.4) in favor of the new method. We predict that the new approach to grading will completely replace traditional grading practices by the year 2011. Upon successful completion of the experiment, we will likely employ only the new grading practice to grade the CS150 final exams (to save the expense of double-grading, which is discouraged by the budget-aware guidelines).

The experiment, which sought to optimize the efficiency of grading, begins with the stack of exams shown in Figure 1 below. The first grade was obtained using a traditional practice of exam grading (undisclosed). The method used to obtain the second, "experimental" grade is detailed below.

Exam-Grade-1a.jpg

Exam-Grade-10a.jpg

The key to this method is a good toss. Without a good toss, it is difficult to get a good spread for the grading curve. It is also important to get the toss correct on the first try. Exams can get crumpled if tossed too much. They begin to look as though the professor actually read them, and this is definitely to be avoided. Additional tosses are also inefficient and expend needless time and energy. Note the toss in Figure 3 below. This is an example of a toss of considerable skill, obviously the result of years of practice.

Exam-Grade-2a.jpg

Note in Figure 3 above that the exams are evenly spread out, enabling application of the curve. Here, however, is where the experts diverge. Some contend that the curve ought to be applied as in Figure 4 below, with the exams at the bottom of the staircase receiving lower grades than the ones higher up on the staircase.

Exam-Grade-4a.jpg

According to this theory, quality is understood as a function of being toward the top, and thus the best exams clearly are to be found in this position. Others, however, propose an alternative theory (Figure 5 below).

Exam-Grade-3a.jpg

They contend that the exams at the bottom deserve higher grades than the ones at the top. While many professors still practice the top-higher-grade approach, the leading authorities subscribe to the bottom-higher-grade theory, despite its counterintuitive appearance. The rationale for this view is that the exams that fall lower on the staircase have more heft and have traveled farther. The greater distance traveled indicates greater knowledge of the subject matter. The bottom-higher-grade approach is clearly the most logical and best-justified approach.
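As an illustrative sketch only, the two competing theories can be written as a mapping from stair position to letter grade. The function name `stair_to_grade`, the five-letter grade scale, and the convention that stair 0 is the top of the staircase are assumptions made for this example; the page itself does not specify them.

```python
GRADES = ["A", "B", "C", "D", "F"]  # assumed scale, best grade first


def stair_to_grade(stair, num_stairs, bottom_higher=True):
    """Map a stair index (0 = top stair, an assumed convention) to a letter grade.

    bottom_higher=True follows the bottom-higher-grade theory;
    bottom_higher=False follows the top-higher-grade theory.
    """
    if not 0 <= stair < num_stairs:
        raise ValueError("stair index out of range")
    # Rank 0 is the best position under the chosen theory.
    rank = (num_stairs - 1 - stair) if bottom_higher else stair
    # Positions beyond the end of the scale all receive the lowest grade.
    return GRADES[min(rank, len(GRADES) - 1)]
```

Under these assumptions, on a five-stair staircase the bottom stair (index 4) maps to an A with the default bottom-higher-grade setting, and to an F when `bottom_higher=False`.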

Even with the grade curve lines established, grading is far from completed. Several exams teeter between levels. The key is to measure the extent of what is referred to as “exam protrusion.” Exams that have small portions extending below the grade line should receive a minus; exams with protrusions above the grade lines receive a plus.

But what about exams that are right in the middle of a line? In Figure 6 below, this exam teeters between the A and B lines. Should it receive an A- or a B+?

Exam-Grade-9a.jpg

This is a difficult question, but I believe it is clearly an A-. The exam is already bending toward the next stair, and in the bottom-higher-grade approach, it is leaning toward the A-. Therefore, this student deserves the A- since momentum is clearly in that direction.

Finally, there are some finer points about grading that only true masters have understood. Consider the exam in Figure 7 below. It sits on the C stair and seems to be protruding onto the B stair, so at first glance one would think it should receive a grade of C+. But not so. A careful examination reveals that the exam is crumpled. Clearly this is an indication of a sloppy exam performance, and the grade must reflect this fact. The appropriate grade is C-.

Exam-Grade-7a.jpg

One final example: consider the circled exam in Figure 8 below, which is very far away from the others at the bottom of the staircase. Is this an A+?

Exam-Grade-5a.jpg

Novices would think so, as the exam has separated itself a considerable distance from the rest of the pack. However, the correct grade for this exam is a B. The exam has traveled too far away from the pack, and retrieving it will require extra effort on the part of the grader. Therefore, the exam must be penalized for this obvious flaw.
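The two finer points above might be sketched as overrides applied once a stair-based grade is known. This is a hedged illustration, not the course staff's procedure: the function `apply_finer_points` and its parameter names are hypothetical, and treating "a crumpled exam always receives a minus" as a general rule is an assumption extrapolated from the single C+ to C- example.

```python
def apply_finer_points(grade, crumpled=False, strayed_from_pack=False):
    """Adjust a letter grade such as "C+" for the two special cases.

    Both flags are hypothetical inputs a grader would set by inspection.
    """
    if strayed_from_pack:
        # Per the Figure 8 example: an exam far from the pack is graded B.
        # Checking this case first is an assumption; the page does not say
        # what happens to an exam that is both crumpled and far away.
        return "B"
    if crumpled:
        # Assumed generalization of the Figure 7 example (C+ becomes C-).
        return grade.rstrip("+-") + "-"
    return grade
```

With these assumptions, `apply_finer_points("C+", crumpled=True)` gives `"C-"`, matching the crumpled exam above, and an apparent A+ that strayed from the pack comes back as `"B"`.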

As you can see, grading takes considerable time and effort. But students can be assured that modern grading techniques will produce the most precise and accurate grading possible, assuming professors have achieved mastery of the necessary grading skills.


The CS150 course staff thanks Daniel Solove for the idea and material presented on this page.