All data are the result of human actions, whether by experimentation, observation, or declaration. As such, the presumption of knowing what data are about is subject to imperfections that can affect the validity of research efforts. With calls for data-based research comes the need to assure the reliability of generated data. The reliability of converting texts into analyzable data in particular has become a burning issue in several areas. However, this issue has been met by only a few limited and sometimes misleading measures of the extent to which data can be trusted as surrogates of the phenomena of analytical interest. The statistic proposed by the author, "Krippendorff’s Alpha," is widely used in the social sciences, not only where human judgements are involved but also where measurements are compared.
The Reliability of Generating Data expands on the author’s seminal work in content analysis and develops methods for assessing the reliability of kinds of data that previously defied such evaluation. It opens with a discussion of the epistemology of reliable data, then presents the most basic alpha coefficient for the single-valued coding of predefined units. This largely familiar way of measuring reliability provides the platform for the succeeding chapters, which start with an overview of alternative coefficients and then extend alpha one property at a time to cope with the reliability of multi-valued coding, the segmenting of texts into meaningful units, big data, and information retrieval. The book also includes a chapter on how to diagnose and remedy imperfections and one on applicable standards, all converging on the statistical issues of the reliability of generating data.
- Provides an overview of methods for assessing the reliability of generating data
- Expands a statistic proposed by the author, already widely used in the social sciences
- Includes many easy-to-follow numerical examples to illustrate the measures
- Written to be useful to beginning and advanced researchers from many disciplines, notably linguistics, sociology, psychometric and educational research, and medical science
Table of Contents
How I became interested in reliability issues
1. On the epistemology of reliable data
2. Simplest kinds: The replicability of categorizing predefined units
3. Some properties of the Alpha
4. Alpha compared with primarily nominal agreement measures
5. Metric differences between single-valued units
6. The quadrilogy for single-valued predefined units and big data
7. Multi-valued coding of predefined units
8. Partitioning continua and coding relevant segments
9. Preserving the coherency of identified segments in continua
10. Distinctions drawn within continua
11. Text mining and information retrieval
12. Diagnostic devices and remedial actions
13. Some special applications
14. Statistical considerations
15. Reliability standards
16. Toward a general calculus of differences and agreements
Appendix
References
Klaus Krippendorff, PhD, PhD h.c., a graduate of the Ulm School of Design and the University of Illinois, Urbana, is the Gregory Bateson Professor Emeritus for Cybernetics, Language, and Culture at the Annenberg School for Communication, University of Pennsylvania. He wrote his dissertation on content analysis at a time when this method was quite underdeveloped. Content Analysis: An Introduction to Its Methodology became a leading text, now in its 4th edition. The book earned the International Communication Association (ICA)’s 2004 recognition as the most influential work. The issue of reliability followed him into numerous empirical ventures. It taught him that data cannot be taken as the starting point of scientific research without knowing how they came about, what they mean, and for whom besides their analyst. In 2012, the Methodology Division of the Association for Education in Journalism and Mass Communication (AEJMC) recognized his "Agreement and Information in the Reliability of Coding" as its "Article of the Year." He wrote over a hundred frequently cited publications, not only on reliability but also On Communicating, Otherness, Meaning, and Information. In the area of design, he advocated a human interactive approach to understanding technological artifacts as well as the design discourse that guides them: The Semantic Turn: A New Foundation for Design. In the area of cybernetics, he received several awards for his contributions in the form of books and articles in academic journals. For some time, he has written about and taught seminars on the discursive construction of realities, not just of the social world but also of what natural scientists say they describe. He co-edited Discourses in Action: What Language Enables Us to Do. Currently, he critically explores how such constructions impact everyday life, for example, the algorithms we live with.
His focus is on the possibilities of emancipation from oppression due to widely shared but burdensome realities that are mistaken for unalterable facts. This frames the first chapter of the book, which starts by answering the question "When are Data?"