Related Weblinks
Validity: An Exploration
Validity is the degree to which an inference drawn from a test score
is defensible. We should not ask whether 'Test X is
valid', but rather: 'Is this particular use of scores
from Test X valid?' A test result can be used properly or
improperly. This means that test consequence is part of validity,
whereas in the past it was downplayed. A good historical overview
of these matters is presented by Chong
Ho Yu.
Our belief about validity takes the matter a bit further, and
in our book we advocate “effect-driven testing”, which
we define as keeping impact and consequence in mind every step of
the way during test development. To illustrate effect-driven testing,
here is an excerpt of an exercise from Unit C1:
- Select a test with which you are familiar, preferably one that
is at present still in development. Imagine: it is some two years
hence. The test is fully operational. It yields results on which
actual decisions are made.
Imagine two test takers walking out of your testing room in that
future, engaged in a conversation:
A: Well, that was not as rough as I thought it would be.
B: Not too bad. I agree.
A: About what I expected.
B: Just about. I agree.
A: Then why do I feel I have not learned anything? Why do I feel
that all I did was to take a test?
B: I don't know. I feel the same way.
Our activity in the book then asks what you did in test development
to lead to this situation. The test-takers are not complaining,
but at the same time, they feel vague disquiet about the educational
impact of your test. You probably intended your test to be a challenging
assessment of some important material, and you probably hoped that
students' preparation would help them to learn. You did not
get it quite right, and despite your best efforts, you overhear
a conversation such as this. What logical steps could you have taken
during test development to avoid this outcome?
Taken to an extreme, test consequence can lead to court challenges
to testing. Validity has legal implications, many of which are effect-driven,
and (we think) many of which could have been avoided if the test
developers had thought more about test effect as the test was developed.
For example, Rudner
and Farris present an interesting description of a validity
challenge in a home-schooling situation. Another famous class-action
case was argued in Florida in the 1980s: Debra
P. v. Turlington; that case was also a matter of test validity
because it dealt with the matter of test effect (or, as was argued,
effect that transcended the test's original purpose).
Our book is influenced by the philosophy
of pragmatism and the thinking of Charles
Sanders Peirce, whose writings are being edited into a major
new chronological
edition. Our endorsement of Peirce's pragmatism echoes both
the argumentative clarity of his work and the broader social
canvas on which modern pragmatists have painted – most notably,
Richard
Rorty. In his later career, Peirce strove to distance himself
from (in particular) William
James, who was among the first to extend pragmatism to matters
of society. Peirce even renamed his view 'pragmaticism',
for he viewed effect as a matter of logical clarity and not as vast
social change and politically progressive thought.
The “interesting thing about the thing” (to quote Elwood
P. Dowd, a close friend of Harvey)
is that close, focused logical work (pragmaticism) is probably
what makes social change and progress (pragmatism) feasible
at all. Consider again our two speakers in the C1 excerpt above.
These two people (at least) do not seem to be planning a court action.
At the same time, there was little positive washback
on their learning. Working through the logical imaginary history
of this test will help you to understand better both its desirable
impact and the effects that you may wish to improve to avoid
such conversations. Test development is a matter of logical step-by-step
reasoning and close, focused work, true to Peirce's later vision.
If you had gotten it right, and if the test-takers were both satisfied
with the test and felt that they'd learned through their study,
then that would be the most desirable effect, and certainly it is one that
reaches beyond the mere logical features of test development. It
reaches into their lives.