With the rise of "big data," there is an increasing demand to learn the skills needed to undertake sound quantitative analysis without requiring students to spend too much time on high-level math and proofs. This book provides an efficient alternative approach, with more time devoted to the practical aspects of regression analysis and how to recognize the most common pitfalls.
By doing so, the book will better prepare readers for conducting, interpreting, and assessing regression analyses, while simultaneously making the material simpler and more enjoyable to learn. Logical and practical in approach, Regression Analysis teaches: (1) the tools for conducting regressions; (2) the concepts needed to design optimal regression models (based on avoiding the pitfalls); and (3) the proper interpretations of regressions. Furthermore, this book emphasizes honesty in research, with a prevalent lesson being that statistical significance is not the goal of research.
This book is an ideal introduction to regression analysis for anyone learning quantitative methods in the social sciences, business, medicine, and data analytics. It will also appeal to researchers and academics looking to better understand what regressions do, what their limitations are, and what they can tell us. This will be the most engaging book on regression analysis (or Econometrics) you will ever read!
A collection of author-created supplementary videos is available at: https://www.youtube.com/channel/UCenm3BWqQyXA2JRKB_QXGyw
Table of Contents
List of figures
List of tables
About the author
List of abbreviations
1. INTRODUCTION 1.1 The problem, 1.2 The purpose of research, 1.3 What causes problems in the research process?, 1.4 About this book, 1.5 The most important sections in this book, 1.6 Quantitative vs. qualitative research, 1.7 Stata and R code, 1.8 Chapter summary
2. REGRESSION ANALYSIS BASICS 2.1 What is a regression?, 2.2 The four main objectives for regression analysis, 2.3 The Simple Regression Model, 2.4 How are regression lines determined?, 2.5 The explanatory power of the regression, 2.6 What contributes to slopes of regression lines?, 2.7 Using residuals to gauge relative performance, 2.8 Correlation vs. causation, 2.9 The Multiple Regression Model, 2.10 Assumptions of regression models, 2.11 Calculating standardized effects to compare estimates, 2.12 Causal effects are "average effects", 2.13 Causal effects can change over time, 2.14 A quick word on terminology for regression equations, 2.15 Definitions and key concepts, 2.16 Chapter summary
3. ESSENTIAL TOOLS FOR REGRESSION ANALYSIS 3.1 Using binary variables (how to make use of dummies), 3.2 Non-linear functional forms using OLS, 3.3 Weighted regression models, 3.4 Chapter summary
4. WHAT DOES "HOLDING OTHER FACTORS CONSTANT" MEAN? 4.1 Case studies to understand "holding other factors constant", 4.2 Using behind-the-curtains scenes to understand "holding other factors constant", 4.3 Using dummy variables to understand "holding other factors constant", 4.4 Using Venn diagrams to understand "holding other factors constant", 4.5 Could controlling for other factors take you further from the true causal effect?, 4.6 Application of "holding other factors constant" to the story of oat bran and cholesterol, 4.7 Chapter summary
5. STANDARD ERRORS, HYPOTHESIS TESTS, P-VALUES, AND ALIENS 5.1 Setting up the problem for hypothesis tests, 5.2 Hypothesis testing in regression analysis, 5.3 The drawbacks of p-values and statistical significance, 5.4 What the research on the hot hand in basketball tells us about the existence of other life in the universe, 5.5 What does an insignificant estimate tell you?, 5.6 Statistical significance is not the goal, 5.7 Chapter summary
6. WHAT COULD GO WRONG WHEN ESTIMATING CAUSAL EFFECTS? 6.1 How to judge a research study, 6.2 Exogenous (good) variation vs. endogenous (bad) variation, 6.3 Setting up the problem for estimating a causal effect, 6.4 The BIG QUESTIONS for what could bias the coefficient estimate, 6.5 How to choose the best set of control variables (model selection), 6.6 What could bias the standard errors and how do you fix it?, 6.7 What could affect the validity of the sample?, 6.8 What model diagnostics should you do?, 6.9 Make sure your regression analyses/interpretations do no harm, 6.10 Applying the BIG QUESTIONS to studies on estimating divorce effects on children, 6.11 Applying the BIG QUESTIONS to nutritional studies, 6.12 Chapter summary: a review of the BIG QUESTIONS
7. STRATEGIES FOR OTHER REGRESSION OBJECTIVES 7.1 Strategies for forecasting/predicting an outcome, 7.2 Strategies for determining predictors of an outcome, 7.3 Strategies for adjusting outcomes for various factors, 7.4 Summary of the strategies for each regression objective
8. METHODS TO ADDRESS BIASES 8.1 Fixed effects, 8.2 A thorough example of fixed effects, 8.3 An alternative to the fixed-effects estimator, 8.4 Random effects, 8.5 First differences, 8.6 Difference in Differences, 8.7 Two-stage least squares (instrumental variables), 8.8 Regression discontinuities, 8.9 Case study: research on how divorce affects children, 8.10 Knowing when to punt, 8.11 Chapter summary
9. OTHER METHODS BESIDES ORDINARY LEAST SQUARES 9.1 Types of outcome variables, 9.2 Dichotomous outcomes, 9.3 Ordinal outcomes – ordered models, 9.4 Categorical outcomes – Multinomial Logit Model, 9.5 Censored outcomes – Tobit models, 9.6 Count variables – Negative Binomial and Poisson models, 9.7 Duration models, 9.8 Chapter summary
10. TIME-SERIES MODELS 10.1 The components of a time-series variable, 10.2 Autocorrelation, 10.3 Autoregressive models, 10.4 Distributed-lag models, 10.5 Consequences of and tests for autocorrelation, 10.6 Stationarity, 10.7 Vector Autoregression, 10.8 Forecasting with time series, 10.9 Chapter summary
11. SOME REALLY INTERESTING RESEARCH 11.1 Can discrimination be a self-fulfilling prophecy?, 11.2 Does Medicaid participation improve health outcomes?, 11.3 Estimating peer effects for academic outcomes, 11.4 How much does a GED improve labor-market outcomes?
12. HOW TO CONDUCT A RESEARCH PROJECT 12.1 Choosing a topic, 12.2 Conducting the empirical part of the study, 12.3 Writing the report
13. SUMMARIZING THOUGHTS 13.1 Be aware of your cognitive biases, 13.2 What betrays trust in published studies, 13.3 How to do a referee report responsibly, 13.4 Summary of the most important points and interpretations, 13.5 Final words of wisdom
APPENDIX OF BACKGROUND STATISTICAL TOOLS A.1 Random variables and probability distributions, A.2 The normal distribution and other important distributions, A.3 Sampling distributions, A.4 Desired properties of estimators
Jeremy Arkes is Associate Professor at the Graduate School of Business and Public Policy, Naval Postgraduate School, U.S.A. He conducts research in a variety of fields, with a focus on military-manpower policy, substance-use policy, determinants of youth outcomes, sports economics, and using sports outcomes to make inferences on human behavior.
- Exercise CSV (ZIP 926.9KB)
- Exercise Stata files (ZIP 919KB)
- Exercises data set descriptions (PDF 182.3KB)
- R CODE for text (PDF 426.2KB)
- STATA CODE for text (PDF 311.3KB)
- Text CSV files (ZIP 119.4KB)
- Text data set descriptions (PDF 188.2KB)
- Text Stata files (ZIP 125.5KB)
- Supplement to Chapter 6 (PDF 214.3KB)
- Supplement to omitted variables (PDF 117.4KB)