The overall problem with psychological tests concerns their ability to measure what they are supposed to measure.
The accuracy, or usefulness, of a test is known as its validity. For example, suppose you wanted to develop a test to determine which of several job applicants would work well in a bank. Would an arithmetic test be a valid test of job success? Well, not if the job required other skills, such as manual dexterity or social skills.
Construct Validity refers to the ability of a test to measure the psychological construct, such as depression, that it was designed to measure. One way to assess this is through the test’s convergent and divergent (also called discriminant) validity: whether the test gives results similar to those of other tests of the same construct and different from those of tests of different constructs.
Content Validity refers to the ability of a test to sample adequately the broad range of elements that compose a particular construct.
Criterion-related Validity refers to the ability of a test to predict a person’s performance on some external measure (the criterion). For example, before actually using a test to predict whether someone will be successful at a particular job, you would first want to determine whether persons already doing well at that job (the criterion measure) also tend to score high on your proposed test. If so, then you know that the test scores are related to the criterion.
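One common way to put a number on criterion-related validity is to correlate test scores with the criterion measure. The following is only a minimal sketch under stated assumptions: the scores and supervisor ratings are invented for illustration, and the names pearson_r, test_scores, and job_ratings are hypothetical, not part of any particular test's procedure.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical data: proposed-test scores and job-performance ratings
# (the criterion) for six people already working in the job.
test_scores = [52, 61, 70, 74, 83, 90]
job_ratings = [2.1, 2.8, 3.0, 3.6, 4.1, 4.5]

validity_coefficient = pearson_r(test_scores, job_ratings)
print(round(validity_coefficient, 2))
```

A coefficient near 1 would suggest that people who do well on the job also tend to score high on the test; a coefficient near 0 would suggest the test scores tell you little about the criterion.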
The ability of a test to give consistent results is known as its reliability. For example, a mathematics test that asks you to solve problems of progressive difficulty might be very reliable because if you couldn’t do calculus yesterday you probably won’t be able to do it tomorrow or the next day. But a personality test that asks ambiguous questions which you answer just according to how you feel in the moment may say one thing about you today and another thing about you next month.
Internal Consistency Reliability refers to how well all the test items relate to each other.
Test-retest Reliability refers to how well results from one administration of the test relate to results from another administration of the same test at a later time.
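Both kinds of reliability are usually expressed as coefficients. As a rough sketch, internal consistency is often summarized with Cronbach's alpha, and test-retest reliability with the correlation between total scores from two administrations. The response data below are invented, and the function and variable names are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row per person, one column per item)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    columns = list(zip(*responses))  # regroup the scores item by item
    sum_item_variances = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical data: five people answering a four-item scale (1-5 ratings).
first_administration = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
# Hypothetical total scores for the same five people a month later.
second_totals = [17, 10, 12, 19, 7]

alpha = cronbach_alpha(first_administration)
first_totals = [sum(row) for row in first_administration]
test_retest_r = pearson_r(first_totals, second_totals)
print(round(alpha, 2), round(test_retest_r, 2))
```

In both cases a value closer to 1 indicates more consistent results: the items hang together (alpha), or people keep roughly the same standing from one administration to the next (test-retest correlation).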
Note that without reliability, there can be no validity. A thermometer, for example, may be a valid way to measure temperature, but if the electronic thermometer you are using has bad batteries and it gives erratic (that is, unreliable) results, then its reading is invalid until the batteries are changed.
Note also that no psychological test is ever completely valid or reliable, because the human psyche is too complicated for anything about it to be known with full confidence. That’s why there can be such uncertainty about a case even after extensive testing.
“Stretching” Validity: The MMPI and Occupational Screening
A classic problem with validity arises when someone uses a test for a purpose for which it was not designed. The MMPI, for example, was designed to measure pathological personality traits, yet it (or the MMPI-2) is often used as a screening tool for law enforcement officers, seminary students, firefighters/paramedics, airline pilots, medical/psychology students, and nuclear power facility workers. Many persons therefore wonder whether this is an appropriate use of the test.
Common sense can tell us what personality characteristics make good police officers, for example. They should have good self-esteem, yet not overvalue themselves. They should be energetic, yet not so involved in so many activities as to be ineffective. They should have good impulse control and be able to tolerate insult without becoming irritable. They shouldn’t hold personal grudges but should be fair, and kind, and objective. They should be obedient to authority and yet be able to make good judgments independently. They should have stamina when under threat. And so on.
But these are complex qualities. How do you measure them?
For example, you could ask a person if he is honest,...