JUSMIATI, S. Pd.
This paper discusses test evaluation in general and then examines test criteria in more detail. Several test criteria are applicable to language testing as well as to other subjects.
Testing is a means of measuring the quality of something. At the end of a learning session, it answers questions such as: how much of a particular material have the students understood? How much of what the teacher taught can the students remember?
The teacher is the one who can measure the students’ capability and capacity throughout the learning process, but that leans towards assessment, whereas evaluation here is more about obtaining information on students’ performance and on appropriate teaching strategies. Evaluation is important because it provides information for the next teaching direction in the classroom and supports better planning of classroom management and learning-task management.
Tests are included in the school curriculum to check the students’ level of knowledge and what they are able to do. They can be given at the beginning and at the end of the study year, and students can also be tested after working on new topics and acquiring new vocabulary. Moreover, students have to take tests in order to enter a foreign university or to find out the level of their English language skills for themselves. For that purpose they take specially designed tests: the Test of English as a Foreign Language (hereafter TOEFL) and the Cambridge First Certificate (hereafter CFC). Although these tests sometimes serve different purposes and are unrelated, they are sometimes quite similar in design and structure. The author of the paper is therefore particularly interested in the present research, for she assumes it to be of great significance not only for herself but also for individuals who are either involved in the field or simply want to learn more about the TOEFL and CFC tests, their structure, design and application. The present research will thus present various aspects of the theory discussed, accompanied by an extensive analysis of the practical part.
Thompson (Forum, 2001) believes that students learn more when they have tests. Here we can both agree and disagree. Certainly, when preparing for a test, the student has to study the material that is to be tested, but this often does not mean that such learning will necessarily lead to acquisition and full understanding of it.
1. THE DEFINITION OF TEST EVALUATION
Evaluation is an intrinsic part of teaching and learning. It is defined as making judgments about students’ learning, which means evaluating the learning process, how the teacher manages the class, and how the teacher delivers the material.
In test evaluation, several test criteria need to be explained; they are set out below.
1. Test Validity
Test validity concerns whether the test really measures what it is intended to measure, in accordance with a certain capability standard. A valid test is one that measures the learning objectives realistically and effectively, while the scores of a test with insufficient validity have no meaning. For instance, if you want to measure the students’ speaking ability, you should give them an oral test instead of a written multiple-choice test. Validity is divided into the following kinds:
Content Validity
This kind of validity depends on an analysis of the language being tested; the test should be constructed from a representative sample of the course.
Face Validity
Face validity of the items is determined by a review and not through formal analysis. Anyone who looks over the test may see that a test item appears valid, even though it sometimes does not measure what it is supposed to measure.
Construct Validity
Construct validity is present when the test is capable of measuring certain characteristics in accordance with a theory of language and learning.
Empirical Validity
Empirical validity is established by a statistical method, using correlation, rather than by a logical one. Once the tests have been scored, the relationship is estimated between the examinees’ known status and their classification as either pass or fail based on the test.
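The correlation described for empirical validity can be illustrated with a short sketch. It assumes that both the known status and the test result are recorded as pass or fail, in which case one common choice of correlation is the phi coefficient. The function name `phi_coefficient`, the cut score of 60, and all the data below are hypothetical and serve only as an illustration, not as a prescribed procedure.

```python
from math import sqrt


def phi_coefficient(known_status, test_result):
    """Correlation between two pass/fail classifications of the same examinees.

    Both arguments are lists of True (pass) / False (fail).
    This helper is a hypothetical illustration, not a standard library call.
    """
    if len(known_status) != len(test_result):
        raise ValueError("both lists must describe the same examinees")

    pairs = list(zip(known_status, test_result))
    # Count the four cells of the 2x2 table.
    a = sum(1 for k, t in pairs if k and t)          # known pass, test pass
    b = sum(1 for k, t in pairs if k and not t)      # known pass, test fail
    c = sum(1 for k, t in pairs if not k and t)      # known fail, test pass
    d = sum(1 for k, t in pairs if not k and not t)  # known fail, test fail

    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("each classification needs both pass and fail cases")
    return (a * d - b * c) / denominator


# Hypothetical data: known status of eight examinees and their test scores.
known_status = [True, True, True, True, False, False, False, False]
scores = [78, 65, 82, 55, 40, 62, 35, 50]
cut_score = 60  # assumed passing mark
test_result = [score >= cut_score for score in scores]

print(f"phi coefficient: {phi_coefficient(known_status, test_result):.2f}")
```

A value close to 1 would suggest that the test classifies examinees in much the same way as their known status, while a value near 0 would suggest little relationship between the two.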
2. Test Reliability
Test reliability is the aspect of test quality concerned with whether or not a test produces consistent results. While there are several methods for estimating test reliability, for objective tests the most useful types are probably test-retest reliability, parallel-forms reliability, and decision consistency. For a test to be valid at all, it must first be reliable as a measuring instrument.
Technically, reliability shows the extent to which test scores are free from errors of measurement. No classroom test is perfectly reliable because random errors operate to cause scores to vary or be inconsistent from time to time and situation to situation. The goal is to try to minimize these inevitable errors of measurement and thus increase reliability.
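As an illustration of test-retest reliability, the sketch below estimates a reliability coefficient as the Pearson correlation between the scores of the same students on two sittings of the same test. This is one common way of estimating it, offered here under that assumption. The function name `pearson_r` and the two score lists are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length, at least 2 scores")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means.
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)

    if variation_x == 0 or variation_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical scores of six students on two sittings of the same test.
first_sitting = [55, 62, 70, 48, 81, 66]
second_sitting = [58, 60, 73, 50, 79, 69]

reliability = pearson_r(first_sitting, second_sitting)
print(f"test-retest reliability: {reliability:.2f}")
```

The closer the coefficient is to 1, the more consistently the test ranks the same students across the two sittings.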
Reliability measures are concerned with determining the degree of inconsistency in scores due to random error. Although it is not possible to obtain perfectly reliable scores in measuring classroom achievement, some teachers are able to construct tests that have reliability coefficients of 0.90 and above. We should strive to write tests that yield reliability coefficients of at least 0.70 (Jacobs, 1991). There are many factors affecting reliability:
Lack of instructions
Every test should have instructions for the whole test or for each section of it.
Clarity
If a test is difficult to read because of spelling mistakes, sloppiness, or hand-written items, the student might answer the question incorrectly through no fault of his or her own.
Ambiguity
If the wording of a question is puzzling or if the directions are unclear, the student may provide an answer that is not the answer you are looking for.
Statistical random error
Guessing gives the student a chance of a lucky guess.
Length and variety of the test
If the test consists of too many questions, it may cause boredom.
3. Test Difficulty
Content difficulty refers to the difficulty of the subject matter assessed. In the assessment of knowledge, the difficulty of a test item resides in the various elements of knowledge such as facts, concepts, principles and procedures. These knowledge elements may be basic, appropriate or advanced. Basic knowledge elements are those that students have learnt at lower levels.
4. Test Applicability
Certain tests in the suite may be considered inapplicable to an implementation depending on the way the implementation treats the implementation-dependent features of the language.
5. Test Relevance
Make sure that the test instruments are sufficiently correlated with the subject matter. In testing relevance, for example, the teacher may test whether the measures are pertinent, inclusive, timely, and understandable.
6. Test Interpretability
Interpretability here means accuracy: the accuracy of the test instrument in measuring what it is intended to measure. It also covers the suitability of imagery for interpretation, that is, whether it adequately meets the requirements for a given type of target in terms of quality and scale.
Evaluation is a part of the teaching and learning process. It is important because it provides information that can be used for the future direction of classroom practice and for the planning and management of tasks. It can also serve as self-evaluation for the teacher. The test itself is judged against several criteria: test validity, test reliability, test difficulty, test applicability, test relevance, and test interpretability.
Fulcher, G. & Davidson, F. 2012. The Routledge Handbook of Language Testing. New York.
Heaton, J. B. 1975. Writing English Language Tests. Longman.
Jabu. 2008. English Language Testing. Makassar.
Jacobs, Lucy C. 1991. Test Reliability. http://www.indiana.edu/~best/bweb3/test-reliability/.
Professional testing.
Rea-Dickins, Pauline & Germaine, Kevin. 1992. Evaluation.
Thompson, M. 2001. Putting students to the test. Forum, Issue Twenty, July.