Related Weblinks

To See a Test in a Grain of Sand...

In this part of the book we discuss 'evidence-centered design', or ECD. This is a formalized approach to crafting test tasks that emphasizes evidence: logically defensible data and argument that support the inferences drawn from test results. ECD is in line with newer approaches to test validation, in which logical argument is the basis for establishing validity. More accurately, validity is never 'established'; it is 'argued', and like all arguments, it needs evidence.

ECD was first proposed by Robert Mislevy and colleagues at Educational Testing Service. They not only proposed the concept but also developed a software package (called 'PORTAL') to help develop evidence-centered tests. Please take a moment to review these online write-ups about ECD and/or its applications:

ECD is essentially a highly articulated form of test specifications. A specification (or 'spec') is a generative document from which many equivalent items or tasks can be created. Specs have two basic elements: sample(s) of the items/tasks they are intended to produce, and extensive 'guiding language', which is everything else: descriptions of where to find materials, advice on how to train raters (if it is a rated task), claims about the theory underlying the test, tips on administering the test, and so on.
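For readers who find it helpful to see the two-part structure of a spec laid out concretely, here is a minimal sketch in Python. It is an illustration only, not a model taken from Mislevy and colleagues or from PORTAL: the class name `Spec`, the helper `missing_guidance`, the four guidance category names, and the sample content are all hypothetical, chosen to mirror the kinds of guiding language listed above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical category names, mirroring the kinds of guiding language
# mentioned in the text. A real spec may use different or additional ones.
GUIDANCE_CATEGORIES = ["materials", "rater_training", "theory", "administration"]


@dataclass
class Spec:
    """A test specification: sample items/tasks plus guiding language."""
    title: str
    sample_items: List[str] = field(default_factory=list)
    guiding_language: Dict[str, str] = field(default_factory=dict)

    def has_both_elements(self) -> bool:
        # A spec needs at least one sample item and some guiding language.
        return bool(self.sample_items) and bool(self.guiding_language)


def missing_guidance(spec: Spec) -> List[str]:
    """Return the illustrative guidance categories the spec does not yet cover."""
    return [c for c in GUIDANCE_CATEGORIES if not spec.guiding_language.get(c)]


# An invented example spec, for illustration only.
example = Spec(
    title="Short opinion essay (invented example)",
    sample_items=["Write 150 words agreeing or disagreeing with the statement."],
    guiding_language={
        "materials": "Draw statements from current student newspapers.",
        "rater_training": "Raters score ten anchor essays before live rating.",
        "theory": "The task is claimed to sample argumentative writing ability.",
    },
)

print(example.has_both_elements())
print(missing_guidance(example))
```

In this sketch the invented spec has both basic elements but still lacks administration guidance, which is the kind of gap a spec writer would want to notice before items are generated from it.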

Test specs are a well-established part of test development, and many testing systems make use of them. Some even go so far as to publish their specs in public venues, for example:

Alternatively, you can put the phrase 'test specification' into any web search engine, and you will get many further results.

  • Note the variation in test specification design, from the highly elaborated model proposed by Mislevy and colleagues to the natural variety seen in the Oregon, California, and QCA examples. What do you think is the best model for a test spec? How detailed should it be? Should specs follow a common model, so that the evidentiary claims they produce can also be standardized?

Copyright © 2006 Taylor & Francis Group, an informa business