Test fairness is a moral imperative for both the makers and the users of tests. This book focuses on methods for detecting test items that function differently for different groups of examinees and on using this information to improve tests. Of interest to all testing and measurement specialists, it examines modern techniques used routinely to ensure test fairness. Three of these techniques, all relevant to the book's contents, are:
* detailed reviews of test items by subject matter experts and members of the major subgroups in society (gender, ethnic, and linguistic) that will be represented in the examinee population
* comparisons of the predictive validity of the test, carried out separately for each of the major subgroups of examinees
* extensive statistical analyses of the relative performance of major subgroups of examinees on individual test items.
"…this is a fine book. The breadth of coverage will appeal to a wide audience in the testing community, and beyond…. It will be the standard reference on DIF for some time."
—Applied Psychological Measurement
Contents: M.J. Ree, Foreword -- DIF: A Perspective From the Air Force Human Resources Laboratory. P.W. Holland, H. Wainer, Preface.
Part I: Introduction and Background. W.H. Angoff, Perspectives on Differential Item Functioning Methodology. N.S. Cole, History and Development of DIF.
Part II: Statistical Methodology. N.J. Dorans, P.W. Holland, DIF Detection and Description: Mantel-Haenszel and Standardization. D. Thissen, L. Steinberg, H. Wainer, Detection of Differential Item Functioning Using the Parameters of Item Response Models. R.D. Bock, Different DIFs: Comment on the Papers Read by Neil Dorans and David Thissen. H. Wainer, Model-Based Standardized Measurement of an Item's Differential Impact. J.R. Donoghue, P.W. Holland, D.T. Thayer, A Monte Carlo Study of Factors That Affect the Mantel-Haenszel and Standardization Measures of Differential Item Functioning. J.O. Ramsay, Comments on the Monte Carlo Study of Donoghue, Holland, and Thayer. N.T. Longford, P.W. Holland, D.T. Thayer, Stability of the MH D-DIF Statistics Across Populations. R.T. Shealy, W.F. Stout, An Item Response Theory Model for Test Bias and Differential Test Functioning. N.L. Allen, P.W. Holland, A Model for Missing Information About the Group Membership of Examinees in DIF Studies.
Part III: Practical Questions and Empirical Investigations. K.A. O'Neill, W.M. McPeek, Item and Test Characteristics That are Associated with Differential Item Functioning. L. Bond, Comments on the O'Neill and McPeek Paper. A.P. Schmitt, P.W. Holland, N.J. Dorans, Evaluating Hypotheses About Differential Item Functioning. C. Lewis, A Note on the Value of Including the Studied Item in the Test Score When Analyzing Test Items for DIF. E. Burton, N.W. Burton, The Effect of Item Screening on Test Scores and Test Characteristics. M. Zieky, Practical Questions in the Use of DIF Statistics in Test Development. R.L. Linn, The Use of Differential Item Functioning Statistics: A Discussion of Current Practice and Future Implications.
Part IV: Ancillary Issues. P.A. Ramsey, Sensitivity Review: The ETS Experience as a Case Study. P.H. McAllister, Testing, DIF, and Public Policy. G. Camilli, The Case Against Item Bias Detection Techniques Based on Internal Criteria: Do Item Bias Procedures Obscure Test Fairness Issues?
Part V: Concluding Remarks and Suggestions.