Reliability Exercise

Published: September 15, 2012
This test of novel problem solving is a measure of fluid intelligence (Doubleday, King, & Papageorgiou, 2002). People’s ability to solve novel problems is a stable characteristic, as it is largely genetically determined (Nairne, 2009). Test-retest reliability is typically appropriate for measures of stable attributes, but this test’s novel nature makes it an inappropriate technique here. In effect, the test’s novelty diminishes after the initial administration, producing difficulties due to practice effects, reactivity, or both. Furthermore, since the test has just 20 questions, it is easier for examinees to remember a significant portion of its items and therefore either to recall the answers during the retest or to seek them out during the interval, resulting in spurious score improvements (Yu, 2005). As it is impossible to discern the precise influence of any one factor, a test-retest coefficient is difficult to interpret, and because more appropriate reliability measures are available, temporal stability should not be used for this test.

Alternate-forms reliability eliminates some of the reactivity associated with test-retest, but it is nonetheless an inappropriate reliability measure for this test because of the possible carryover effects of strategy. Even when each item’s content is novel or unfamiliar, examinees may accustom themselves to the test’s style and subsequently apply the principle used to solve one problem to another (Groth-Marnat, 2009). Truly equivalent forms are already difficult to develop; combined with the increasing difficulty of items in this test, and assuming that no two items are the same, this makes generating a reliable alternate form unfeasible.

This test’s dichotomous scoring protocol is designed to assess problem-solving ability objectively, with each question answered either correctly or incorrectly. Such a standardised procedure in itself considerably reduces subjective influence, and assessing inter-rater...
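
To illustrate why a test-retest coefficient is hard to interpret for a test like this, the following minimal sketch computes the coefficient as a Pearson correlation between two administrations. All scores are hypothetical (invented for illustration, not data from the test), and the function name `pearson_r` is an assumption of this sketch. It shows that a uniform practice gain leaves the coefficient unchanged, whereas gains that differ between examinees lower it, so the coefficient alone cannot separate instability of the trait from practice effects.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores out of 20 for six examinees (illustrative only)
first = [8, 11, 13, 15, 17, 10]

# Retest A: every examinee gains 2 points through practice
uniform_gain = [score + 2 for score in first]

# Retest B: some examinees remember or look up answers, others do not
uneven_gain = [12, 11, 18, 15, 17, 16]

r_uniform = pearson_r(first, uniform_gain)
r_uneven = pearson_r(first, uneven_gain)

print(f"uniform practice gain: r = {r_uniform:.2f}")
print(f"uneven practice gain:  r = {r_uneven:.2f}")
```

In this sketch the underlying ability is assumed not to change between administrations, yet the second coefficient is noticeably lower than the first, purely because the gains are unevenly distributed.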