
# Bayesian Statistics for the Social Sciences, Second Edition

250 Pages
by Guilford Press

The second edition of this practical book equips social science researchers to apply the latest Bayesian methodologies to their data analysis problems. It includes new chapters on model uncertainty, Bayesian variable selection and sparsity, and Bayesian workflow for statistical modeling. Clearly explaining frequentist and epistemic probability and prior distributions, the second edition emphasizes use of the open-source RStan software package. The text covers Hamiltonian Monte Carlo, Bayesian linear regression and generalized linear models, model evaluation and comparison, multilevel modeling, models for continuous and categorical latent variables, missing data, and more. Concepts are fully illustrated with worked-through examples from large-scale educational and social science databases, such as the Program for International Student Assessment and the Early Childhood Longitudinal Study. Annotated RStan code appears in screened boxes; the companion website (www.guilford.com/kaplan-materials) provides data sets and code for the book's examples.

New to This Edition
* Utilizes the R interface to Stan--faster and more stable than previously available Bayesian software--for most of the applications discussed.
* Coverage of Hamiltonian MC; Cromwell’s rule; Jeffreys' prior; the LKJ prior for correlation matrices; model evaluation and model comparison, with a critique of the Bayesian information criterion; variational Bayes as an alternative to Markov chain Monte Carlo (MCMC) sampling; and other new topics.
* Chapters on Bayesian variable selection and sparsity, model uncertainty and model averaging, and Bayesian workflow for statistical modeling.

I. Foundations
1. Probability Concepts and Bayes' Theorem
1.1 Relevant Probability Axioms
1.1.1 The Kolmogorov Axioms of Probability
1.1.2 The Rényi Axioms of Probability
1.2 Frequentist Probability
1.3 Epistemic Probability
1.3.1 Coherence and the Dutch Book
1.3.2 Calibrating Epistemic Probability Assessment
1.4 Bayes' Theorem
1.4.1 The Monty Hall Problem
1.5 Summary
2. Statistical Elements of Bayes' Theorem
2.1 Bayes' Theorem Revisited
2.2 Hierarchical Models and Pooling
2.3 The Assumption of Exchangeability
2.4 The Prior Distribution
2.4.1 Non-informative Priors
2.4.2 Jeffreys' Prior
2.4.3 Weakly Informative Priors
2.4.4 Informative Priors
2.4.5 An Aside: Cromwell's Rule
2.5 Likelihood
2.5.1 The Law of Likelihood
2.6 The Posterior Distribution
2.7 The Bayesian Central Limit Theorem and Bayesian Shrinkage
2.8 Summary
3. Common Probability Distributions and Their Priors
3.1 The Gaussian Distribution
3.1.1 Mean Unknown, Variance Known: The Gaussian Prior
3.1.2 The Uniform Distribution as a Non-informative Prior
3.1.3 Mean Known, Variance Unknown: The Inverse-Gamma Prior
3.1.4 Mean Known, Variance Unknown: The Half-Cauchy Prior
3.1.5 Jeffreys' Prior for the Gaussian Distribution
3.2 The Poisson Distribution
3.2.1 The Gamma Prior
3.2.2 Jeffreys' Prior for the Poisson Distribution
3.3 The Binomial Distribution
3.3.1 The Beta Prior
3.3.2 Jeffreys' Prior for the Binomial Distribution
3.4 The Multinomial Distribution
3.4.1 The Dirichlet Prior
3.4.2 Jeffreys' Prior for the Multinomial Distribution
3.5 The Inverse-Wishart Distribution
3.6 The LKJ Prior for Correlation Matrices
3.7 Summary
4. Obtaining and Summarizing the Posterior Distribution
4.1 Basic Ideas of Markov Chain Monte Carlo Sampling
4.2 The Random Walk Metropolis–Hastings Algorithm
4.3 The Gibbs Sampler
4.4 Hamiltonian Monte Carlo
4.4.1 No-U-Turn (NUTS) Sampler
4.5 Convergence Diagnostics
4.5.1 Trace Plots
4.5.2 Posterior Density Plots
4.5.3 Autocorrelation Plots
4.5.4 Effective Sample Size
4.5.5 Potential Scale Reduction Factor
4.5.6 Possible Error Messages When Using HMC/NUTS
4.6 Summarizing the Posterior Distribution
4.6.1 Point Estimates of the Posterior Distribution
4.6.2 Interval Summaries of the Posterior Distribution
4.7 Introduction to Stan and Example
4.8 An Alternative Algorithm: Variational Bayes
4.8.1 Evidence Lower Bound (ELBO)
4.8.2 Variational Bayes Diagnostics
4.9 Summary
II. Bayesian Model Building
5. Bayesian Linear and Generalized Models
5.1 The Bayesian Linear Regression Model
5.1.1 Non-informative Priors in the Linear Regression Model
5.2 Bayesian Generalized Linear Models
5.3 Bayesian Logistic Regression
5.4 Bayesian Multinomial Regression
5.5 Bayesian Poisson Regression
5.6 Bayesian Negative Binomial Regression
5.7 Summary
6. Model Evaluation and Comparison
6.1 The Classical Approach to Hypothesis Testing and Its Limitations
6.2 Model Assessment
6.2.1 Prior Predictive Checking
6.2.2 Posterior Predictive Checking
6.3 Model Comparison
6.3.1 Bayes Factors
6.3.2 The Deviance Information Criterion (DIC)
6.3.3 Widely Applicable Information Criterion (WAIC)
6.3.4 Leave-One-Out Cross-Validation
6.3.5 A Comparison of WAIC and LOO
6.4 Summary
7. Bayesian Multilevel Modeling
7.1 Revisiting Exchangeability
7.2 Bayesian Random Effects Analysis of Variance
7.3 Bayesian Intercepts as Outcomes Model
7.4 Bayesian Intercepts and Slopes as Outcomes Model
7.5 Summary
8. Bayesian Latent Variable Modeling
8.1 Bayesian Estimation for the CFA
8.1.1 Priors for CFA Model Parameters
8.2 Bayesian Latent Class Analysis
8.2.1 The Problem of Label-Switching and a Possible Solution
8.2.2 Comparison of VB to the EM Algorithm
8.3 Summary
9. Missing Data From a Bayesian Perspective
9.1 A Nomenclature for Missing Data
9.2 Ad Hoc Deletion Methods for Handling Missing Data
9.2.1 Listwise Deletion
9.2.2 Pairwise Deletion
9.3 Single Imputation Methods
9.3.1 Mean Imputation
9.3.2 Regression Imputation
9.3.3 Stochastic Regression Imputation
9.3.4 Hot Deck Imputation
9.3.5 Predictive Mean Matching
9.4 Bayesian Methods for Multiple Imputation
9.4.1 Data Augmentation
9.4.2 Chained Equations
9.4.3 EM Bootstrap: A Hybrid Bayesian/Frequentist Method
9.4.4 Bayesian Bootstrap Predictive Mean Matching
9.4.5 Accounting for Imputation Model Uncertainty
9.5 Summary
10. Bayesian Variable Selection and Sparsity
10.1 Introduction
10.2 The Ridge Prior
10.3 The Lasso Prior
10.4 The Horseshoe Prior
10.5 Regularized Horseshoe Prior
10.6 Comparison of Regularization Methods
10.6.1 An Aside: The Spike-and-Slab Prior
10.7 Summary
11. Model Uncertainty
11.1 Introduction
11.2 Elements of Predictive Modeling
11.2.1 Fixing Notation and Concepts
11.2.2 Utility Functions for Evaluating Predictions
11.3 Bayesian Model Averaging
11.3.1 Statistical Specification of BMA
11.3.2 Computational Considerations
11.3.3 Markov Chain Monte Carlo Model Composition
11.3.4 Parameter and Model Priors
11.3.5 Evaluating BMA Results: Revisiting Scoring Rules
11.4 True Models, Belief Models, and M-Frameworks
11.4.1 Model Averaging in the M-Closed Framework
11.4.2 Model Averaging in the M-Complete Framework
11.4.3 Model Averaging in the M-Open Framework
11.5 Bayesian Stacking
11.5.1 Choice of Stacking Weights
11.6 Summary
12. Closing Thoughts
12.1 A Bayesian Workflow for the Social Sciences
12.2.1 Coherence
12.2.2 Conditioning on Observed Data
12.2.3 Quantifying Evidence
12.2.4 Validity
12.2.5 Flexibility in Handling Complex Data Structures
12.2.6 Formally Quantifying Uncertainty
List of Abbreviations and Acronyms
References
Author Index
Subject Index

### Biography

David Kaplan, PhD, is the Patricia Busk Professor of Quantitative Methods in the Department of Educational Psychology at the University of Wisconsin–Madison and holds affiliate appointments in the University of Wisconsin’s Department of Population Health Sciences, the Center for Demography and Ecology, and the Nelson Institute for Environmental Studies. Dr. Kaplan’s research focuses on the development of Bayesian statistical methods for education research. His work on these topics is directed toward applications to large-scale cross-sectional and longitudinal survey designs. He has been actively involved in the OECD Program for International Student Assessment (PISA), serving on its Technical Advisory Group from 2005 to 2009 and its Questionnaire Expert Group from 2004 to the present, and chairing the Questionnaire Expert Group for PISA 2015. He also serves on the Design and Analysis Committee and the Questionnaire Standing Committee for the National Assessment of Educational Progress. Dr. Kaplan is an elected member of the National Academy of Education and former chair of its Research Advisory Committee, president (2023–2024) of the Psychometric Society, and past president of the Society for Multivariate Experimental Psychology. He is a fellow of the American Psychological Association (Division 5), a former visiting fellow at the Luxembourg Institute for Social and Economic Research, a former Jeanne Griffith Fellow at the National Center for Education Statistics, and a current fellow at the Leibniz Institute for Educational Trajectories in Bamberg, Germany. He is a recipient of the Samuel J. Messick Distinguished Scientific Contributions Award from the American Psychological Association (Division 5), the Alexander von Humboldt Research Award, and the Hilldale Award for the Social Sciences from the University of Wisconsin–Madison. Dr. Kaplan was the Johann von Spix International Visiting Professor at the Universität Bamberg and the Max Kade Visiting Professor at the Universität Heidelberg, both in Germany, and is currently International Guest Professor at the Universität Heidelberg.

"This very practical book is well suited to social science students because of the examples used (large-scale surveys) and the coverage of methods that social scientists often need (latent variables, variable selection, and dealing with missing data). The book also covers some topics readers may not know they need--Bayesian model averaging and workflow, for example. Illustrations use RStan, perhaps the most flexible of programs for Bayesian modeling. Full integration of RStan input and output is provided in the text."--David Rindskopf, PhD, Distinguished Professor of Educational Psychology and Psychology, The Graduate Center, The City University of New York

"Kaplan's book is the perfect follow-up for those whose curiosity has been piqued about Bayesian statistics. The many code examples will give users a head start for applying Bayes' theorem to their data. I highly appreciate that the author uses open-source software for all models. The topics are introduced with a rich amount of background information, some equations (but never too many), detailed explanations, and code examples. Empirical results are used to illustrate each topic."--Rens van de Schoot, PhD, Department of Methodology and Statistics, Utrecht University, Netherlands

"An excellent resource for researchers at the graduate level or above with an interest in Bayesian statistics. Readers are skillfully guided through the process of statistical reasoning from a Bayesian perspective. This book is practical and minimally technical while also introducing readers to interesting historical and philosophical issues. What makes the book especially helpful is Kaplan’s careful balance of breadth and depth of coverage of key topics. In this timely second edition, important recent advances in Bayesian statistics are distilled and disseminated for researchers in the social sciences."--Sierra A. Bainter, PhD, Department of Psychology, University of Miami

"This book has all the essential components to help readers, especially quantitative researchers in social sciences, understand and conduct Bayesian modeling. The second edition includes new material on recent Markov chain Monte Carlo (MCMC) methods, such as Hamiltonian MC, in addition to a range of other updates."--Insu Paek, PhD, Senior Scientist, Human Resources Research Organization

"I recommend this book for providing a careful overview of the Bayesian framework, at a level accessible to a wide audience, with examples, code, and key references. Kaplan does a great job of covering so many different aspects of Bayesian modeling in a coherent way and presenting a number of substantive methods for analyzing complex data. I liked the comparisons and analogies to the frequentist approach."--Irini Moustaki, PhD, Department of Statistics, London School of Economics and Political Science, United Kingdom

"A valuable read for researchers, practitioners, teachers, and graduate students in the field of social sciences…. Extremely accessible and incredibly delightful…. The wide breadth of topics covered, along with the author's clear and engaging style of writing and inclusion of numerous examples, should provide an adequate foundation for any psychologist wishing to take a leap into Bayesian thinking. Furthermore, the technical details and analytic aspects provided in all chapters should equip readers with enough knowledge to embark on Bayesian analysis with their own research data." (on the first edition)--Psychometrika, 3/1/2017