### Mon, 14 Sep 2015

education, evaluation, philosophy

There are two different types of evaluations that are often mixed up.

In a context where there are a limited number of "awards" and a large number of applicants, we need to pick those who "deserve" the awards. In order to do this we are forced to rank the applicants. Such an evaluation is a ranking evaluation.

In other contexts, we are trying to determine whether a given individual has met certain requirements. There may be many such requirements and the examiner may want to classify (assign a grade) according to the number of requirements met. Even when the individual is one out of a big population of examinees, this is a grading evaluation.

In a ranking evaluation, factors which may be only marginally relevant to the awards can be (and often have to be!) used to create a "total ordering" in the mathematical sense. As long as these factors are fairly chosen and applied, and are known to the participants in advance, most people, however unhappy they may be at having "lost out", should (and perhaps will!) grant the acceptability of the ranking.

A typical classroom evaluation or course evaluation is meant to be a grading evaluation. The question asked is whether the examinee has achieved acceptable competence in a certain list of tasks or has acquired adequate information about certain areas. In such a situation, the examiner needs to give the examinee a fair opportunity to demonstrate the skill or knowledge gained. If the examinee has been told in advance what is expected, and is examined and graded only on these aspects, then the evaluation will be considered fair.

Since these descriptions are as different as tomatoes and potatoes, why is it that the two forms of evaluation are mixed up? Why do we have school boards attempting to rank students when all we need are grades? When admissions are decided based on ranks, why do we get worked up when candidates who are good are left out even though we do not dispute the quality of those selected?

Most of us know and will acknowledge that the value of any ranking is ephemeral (has a short expiry date!). Arguing over lists of the "ten best ... of all time" is a wonderful pastime, but we should not be serious about it! Yet, we find it being taken seriously.

Part of the reason is that competition is perceived as a spur. From the time when humans were escaping into trees to get away from lions to the more recent race to the moon, we have seen humans achieve more than even they considered possible when egged on by the competitive ethos. Since we think that ranking (being competitive) can bring out higher levels of achievement than grading, we often institute ranking where grading would be sufficient.

Another reason is nagging self-doubt amongst the educators that perhaps the exam is pitched too low or too high; no honest examiner will ever claim that they have set the "perfect" test! If all the students meet the maximal requirements of a certain grading system, then we feel that we have not demanded enough. On the other hand if most students do not meet our expectations, we feel that we may have been too harsh.

Yet another reason is that evaluators are aware that their grading will be used competitively! So they feel that clubbing a lot of excellent students in the 'A' category is "unfair" to the outstanding students who should be helped to "stand out".

"Relative" grading (or grading on a curve) is supposed to resolve some of these issues, but has often been seen to come up short. Moreover, it seems to violate the fundamental distinction between ranks and grades. It often creates artificial divisions in order to satisfy a general sense that there should not be "too many 'A's" (or "too many 'F's"!).

It must be obvious that this writer's view is that ranking evaluations should be limited to the settings where such ranking is almost unavoidable. Even in such situations, statisticians have shown that the bulk of the actual rankings (those close to the mean) are not very different from a random ordering and perhaps a "lottery" is more appropriate at the middle (and upper middle) levels.

In all other situations, educators should attempt to honestly classify, in advance, what would constitute a particular grade for a certain examination and then limit themselves to measuring and awarding the appropriate grade, whatever the form of the resulting distribution of grades.

