1st Edition

R Companion for Sampling: Design and Analysis, Third Edition

By Yan Lu, Sharon L. Lohr. Copyright 2022
    222 Pages 24 B/W Illustrations
    by Chapman & Hall

    The R Companion for Sampling: Design and Analysis, designed to be read alongside Sampling: Design and Analysis, Third Edition by Sharon L. Lohr (SDA; 2022, CRC Press), shows how to use functions in base R and contributed packages to perform calculations for the examples in SDA.

    No prior experience with R is needed. Chapter 1 tells you how to obtain R and RStudio, introduces basic features of the R statistical software environment, and helps you get started with analyzing data.

    Each subsequent chapter provides step-by-step guidance for working through the data examples in the corresponding chapter of SDA, with code, output, and interpretation. Tips and warnings help you develop good programming practices and avoid common survey data analysis errors.

    R features and functions are introduced as they are needed so you can see how each type of sample is selected and analyzed. Each chapter builds on the knowledge developed earlier for simpler designs; after finishing the book, you will know how to use R to select and analyze almost any type of probability sample.

    All R code and data sets used in this book are available online to help you develop your skills analyzing survey data from social and public opinion research, public health, crime, education, business, agriculture, and ecology.

    1. Getting Started

      Obtaining the Software

      Installing R Packages

      R Basics

      Reading Data into R

      Saving Output

      Integrating R Output into LaTeX Documents

      Missing Data

      Summary, Tips, and Warnings

    2. Simple Probability Samples

      Selecting a Simple Random Sample

      Computing Statistics from an SRS

      Additional Code for Exercises

      Summary, Tips, and Warnings

    3. Stratified Sampling

      Allocation Methods

      Selecting a Stratified Random Sample

      Computing Statistics from a Stratified Random Sample

      Estimating Proportions from a Stratified Random Sample

      Additional Code for Exercises

      Summary, Tips, and Warnings

    4. Ratio and Regression Estimation

      Ratio Estimation

      Regression Estimation

      Domain Estimation

      Ratio Estimation with Stratified Sampling

      Model-Based Ratio and Regression Estimation

      Summary, Tips, and Warnings

    5. Cluster Sampling with Equal Probabilities

      Estimates from One-Stage Cluster Samples

      Estimates from Multi-Stage Cluster Samples

      Model-Based Design and Analysis for Cluster Samples

      Additional Code for Exercises

      Summary, Tips, and Warnings

    6. Sampling with Unequal Probabilities

      Selecting a Sample with Unequal Probabilities

      Sampling With Replacement

      Sampling Without Replacement

      Selecting a Two-Stage Cluster Sample

      Computing Estimates from an Unequal-Probability Sample

      Estimates from With-Replacement Samples

      Estimates from Without-Replacement Samples

      Summary, Tips, and Warnings

    7. Complex Surveys

      Selecting a Stratified Two-Stage Sample

      Estimating Quantiles

      Computing Estimates from Stratified Multistage Samples

      Univariate Plots from Complex Surveys

      Scatterplots from Complex Surveys

      Additional Code for Exercises

      Summary, Tips, and Warnings

    8. Nonresponse

      How R Functions Treat Missing Data

      Poststratification and Raking

      Summary, Tips, and Warnings

    9. Variance Estimation in Complex Surveys

      Replicate Samples and Random Groups

      Constructing Replicate Weights

      Balanced Repeated Replication

      Replicate Weights and Nonresponse Adjustments

      Using Replicate Weights from a Survey Data File

      Summary, Tips, and Warnings

    10. Categorical Data Analysis in Complex Surveys

      Contingency Tables and Odds Ratios

      Chi-Square Tests

      Loglinear Models

      Summary, Tips, and Warnings

    11. Regression with Complex Survey Data

      Straight Line Regression in an SRS

      Linear Regression for Complex Survey Data

      Multiple Linear Regression

      Using Regression to Compare Domain Means

      Logistic Regression

      Additional Resources and Code

      Summary, Tips, and Warnings

    12. Additional Topics for Survey Data Analysis

      Two-Phase Sampling

      Estimating the Size of a Population

      Ratio Estimation of Population Size

      Loglinear Models with Multiple Lists

      Small Area Estimation

    Appendix A. Data Set Descriptions

    Yan Lu is Associate Professor of Statistics at the University of New Mexico. Her research interests include survey sampling, mixed models, nonparametric regression, and data mining. Recent publications develop new statistical methods for combining data from multiple surveys, selecting probability samples from massive data streams, and applying nonparametric regression to survey data.

    Sharon L. Lohr, the author of Measuring Crime: Behind the Statistics, has published widely about survey sampling and statistical methods for education, public policy, law, and crime. She is a Fellow of the American Statistical Association and an elected member of the International Statistical Institute, and has received the Gertrude M. Cox, Morris Hansen, and Deming Awards. Formerly Dean’s Distinguished Professor of Statistics at Arizona State University and a Vice President at Westat, she is now a statistical consultant and writer.