Validity: An Exploration

Validity is the degree to which the inference (from a test score) is defensible. We should not ask whether or not 'Test X is valid', but rather: 'is this particular use of scores from Test X valid'? A test result can be used properly or improperly. This means that test consequence is part of validity, whereas in the past it was downplayed. A good historical overview of these matters is presented by Chong Ho Yu.

Our belief about validity takes the matter a bit further, and in our book we advocate “effect-driven testing”, which we define as keeping impact and consequence in mind every step of the way during test development. To illustrate effect-driven testing, here is an excerpt of an exercise from Unit C1:

  • Select a test with which you are familiar, preferably one that is at present still in development. Imagine: it is some two years hence. The test is fully operational. It yields results on which actual decisions are made.

    Imagine two test takers walking out of your testing room in that future, engaged in a conversation:

    A: Well, that was not as rough as I thought it would be.
    B: Not too bad. I agree.
    A: About what I expected.
    B: Just about. I agree.
    A: Then why do I feel I have not learned anything? Why do I feel that all I did was to take a test?
    B: I don't know. I feel the same way.

Our activity in the book then asks what you did in test development to lead to this situation. The test-takers are not complaining, but at the same time, they feel vague disquiet about the educational impact of your test. You probably intended your test to be a challenging assessment of some important material, and you probably hoped that students' preparation would help them to learn. You did not get it quite right, and despite your best efforts, you overhear a conversation such as this. What logical steps could you have taken during test development to avoid this outcome?

Taken to an extreme, test consequence can lead to court challenges of testing. Validity has legal implications, many of which are effect-driven, and (we think) many of which could have been avoided if the test developers had thought more about test effect as the test was developed. For example, Rudner and Farris present an interesting description of a validity challenge in a home-schooling situation. Another famous class-action case was argued in Florida in the 1980s: Debra P. v. Turlington. That case was also a matter of test validity because it dealt with the matter of test effect (or, as was argued, effect that transcended the test's original purpose).

Our book is influenced by the philosophy of pragmatism and the thinking of Charles Sanders Peirce, whose writings are being edited into a major new chronological edition. Our endorsement of Peirce's pragmatism echoes both the argumentative clarity of his work and the broader social canvas on which modern pragmatists have painted, most notably Richard Rorty. In his later career, Peirce strove to distance himself from William James in particular, who was among the first to extend pragmatism to matters of society. Peirce even renamed his view 'pragmaticism', for he viewed effect as a matter of logical clarity and not as vast social change and politically progressive thought.

The “interesting thing about the thing” (to quote Elwood P. Dowd, a close friend of Harvey) is that close, focused logical work (pragmaticism) is probably what makes social change and progress (pragmatism) feasible at all. Consider again our two speakers in the C1 excerpt above. These two people (at least) do not seem to be planning a court action. At the same time, there was little positive washback on their learning. Working through the logical imaginary history of this test will help you to understand better both its desirable impact and the effects that you may wish to improve to avoid such conversations. Test development is a matter of logical step-by-step reasoning and close, focused work, true to Peirce's later vision. If you had gotten it right, and the test-takers were both satisfied with the test and felt that they had learned through their study, then that would be the most desirable effect, and certainly one that reaches beyond the mere logical features of test development. It reaches into their lives.

Copyright © 2006 Taylor & Francis Group, an informa business