According to Van der Linden (1982), the rise of new learning strategies has changed the meaning of measurement in education and made new demands on the construction, scoring, and analysis of educational tests. The common feature of these learning strategies is their objective-based character: all lead to instructional programmes being set up and executed according to well-defined, clear-cut learning objectives (Van der Linden, 1982). Educational measurements satisfying these demands are usually called criterion-referenced, while traditional measurements are often known as norm-referenced. Thus, educational tests can be categorised into two major groups: norm-referenced tests and criterion-referenced tests. The two types of test differ in their intended purposes, the way in which content is selected, and the scoring process, which defines how the test results must be interpreted. This paper discusses the roles of these two types of assessment, the differences between them, and the most appropriate uses of each.
Exposition and overview of the two key concepts
Glaser (1963) distinguished two possible uses of educational tests and their areas of application. The first is that tests can supply norm-referenced measurements. In norm-referenced measurement, the performances of subjects are scored and interpreted with respect to each other. As the name indicates, there is always a norm group, and the interest is in the relative standing within this group of the subjects being tested. This finds expression in scoring methods such as percentile scores, normalised scores, and age equivalents. Tests are constructed such that the relative positions of subjects come out as reliably as possible. An outstanding example of an area where norm-referenced measurements are needed is testing for the selection of job applicants. In such applications the test must be maximally differentiating in order to enable the employer to select the best applicants. Thus, a norm-referenced test is a type of test, assessment, or evaluation which yields an estimate of the position of the tested individual in a predefined population, with respect to the trait being measured. This estimate is derived from the analysis of test scores and possibly other relevant data from a sample drawn from the population. This type of test identifies whether the test taker performed better or worse than other test takers, but not whether the test taker knows more or less material than is necessary for a given purpose.
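To make the idea of a percentile score concrete, the following is a minimal sketch in Python. It assumes the mid-rank convention (ties count as half), which is only one of several common definitions of percentile rank; the function name `percentile_rank` and the norm-group scores are hypothetical and do not come from the sources cited above.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_group_scores):
    """Return the percentile rank of `score` within a norm group.

    Assumed convention (mid-rank): the percentage of norm-group scores
    strictly below `score`, plus half of the scores equal to it.
    """
    if not norm_group_scores:
        raise ValueError("the norm group must contain at least one score")
    ordered = sorted(norm_group_scores)
    below = bisect_left(ordered, score)          # scores strictly below
    ties = bisect_right(ordered, score) - below  # scores equal to `score`
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical raw scores of a ten-person norm group.
norm_group = [12, 15, 15, 18, 20, 22, 25, 27, 30, 34]

print(percentile_rank(25, norm_group))  # 65.0
```

The result depends entirely on the composition of the norm group: the same raw score of 25 would receive a different percentile rank in a stronger or weaker group, and the figure says nothing about how much of the tested material the subject has actually mastered.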
The second use is that tests can supply criterion-referenced measurements. In criterion-referenced measurement the interest is not in using test scores for ranking subjects on the continuum measured by the test, but in carefully specifying the behavioural referents (the "criterion") pertaining to scores or points along this continuum. Measurements are norm-referenced when they indicate how much better or worse the performances of individual subjects are compared with those of other subjects in the norm group; they are criterion-referenced when they indicate what a subject with a given score is able to do, and what that subject's behavioural repertory is, without any reference to the scores of other subjects. Thus, a criterion-referenced test is a test that provides a basis for determining a candidate's level of knowledge and skills in relation to a well-defined domain of content. Often one or more performance standards are set on the test score scale to aid in test score interpretation. Criterion-referenced tests are also known as domain-referenced tests, competency tests, basic skills tests, mastery tests, performance tests or assessments, authentic assessments, objective-referenced tests, standards-based tests, credentialing exams, and more (Popham and Husek, 1969). What all of these tests have in common is that they attempt to...