1st Edition

# Replication and Evidence Factors in Observational Studies

Outside of randomized experiments, association does not imply causation, and yet there is nothing defective about our knowledge that smoking causes lung cancer, a conclusion reached in the absence of randomized experimentation with humans. How is that possible? If observed associations do not identify causal effects in observational studies, how can a sequence of such associations become decisive?

Two or more associations may each be susceptible to unmeasured biases, yet not susceptible to the same biases. An observational study has two evidence factors if it provides two comparisons susceptible to different biases that may be combined as if from independent studies of different data by different investigators, despite using the same data twice. If the two factors concur, then they may exhibit greater insensitivity to unmeasured biases than either factor exhibits on its own.
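As a rough illustration of what "combined as if from independent studies" can mean, the sketch below applies Fisher's method, a standard way to combine independent p-values. The text above does not say which combination rule the book uses, so the choice of Fisher's method, the function name `fisher_combine_two`, and the two p-values are all assumptions made for illustration only.

```python
import math


def fisher_combine_two(p1: float, p2: float) -> float:
    """Combine two independent p-values with Fisher's method.

    Fisher's statistic is -2 * (ln p1 + ln p2), which is chi-square with
    4 degrees of freedom when both p-values are independent and uniform.
    For 4 degrees of freedom the upper tail has the closed form
    P(X > x) = exp(-x / 2) * (1 + x / 2), so no statistics library is needed.
    """
    for p in (p1, p2):
        if not 0.0 < p <= 1.0:
            raise ValueError("each p-value must lie in (0, 1]")
    log_product = math.log(p1) + math.log(p2)
    # exp(-x / 2) equals p1 * p2 and 1 + x / 2 equals 1 - ln(p1 * p2).
    return math.exp(log_product) * (1.0 - log_product)


# Hypothetical p-values from two comparisons (illustrative numbers only).
p_factor_1 = 0.04
p_factor_2 = 0.03
combined = fisher_combine_two(p_factor_1, p_factor_2)
print(f"combined p-value: {combined:.5f}")
```

With these made-up inputs the combined p-value is smaller than either input, which mirrors the idea that two concurring factors can be stronger together than alone. The sketch ignores sensitivity analysis for unmeasured bias entirely; it only shows the combination step under the independence assumption.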

**Replication and Evidence Factors in Observational Studies** includes four parts:

- A concise introduction to causal inference, making the book self-contained
- Practical examples of evidence factors from the health and social sciences with analyses in R
- The theory of evidence factors
- Study design with evidence factors

A companion R package evident is available from CRAN.

I Background: Aspects of Causal Inference

1. Causal Inference in Randomized Experiments

A Randomized Experiment

Structure and Notation

Covariates and Outcomes

Causal Effects with Two Treatment Groups

Inference with Random Assignment

Randomization Tests for Continuous Outcomes

Confidence Sets for Causal Effects

The General Situation

Summary: Randomization Simplifies Causal Inference

Using R

2. Causal Inference in Observational Studies

How Are Observational Studies Different From Experiments?

Sensitivity Analysis

*Another Example of Sensitivity Analysis

Design Sensitivity

Summary: Biased Treatment Assignment

Using R

Exercises

3. Replication and its Limits

Biases Can Replicate

Some Perspectives

Replications that Disrupt Some Potential Biases

Instruments and Replication

Summary: Replication is Not Repetition

II Evidence Factors in Practice

4. Examples of Studies with Evidence Factors

Smoking and Periodontal Disease

Antineoplastic Drugs and DNA Damage

Lead Absorption in Children

Minimum Wages and Employment

Benzene and Chromosome Aberrations

Summary: Mutually Supporting, Unrelated Comparisons

Using R

Exercises

5. Simple Analyses with Evidence Factors

Structure of the Simple Analyses

Antineoplastic Drugs and DNA Damage

Smoking and Periodontal Disease

Factors that Do Not Concur

Summary: Strengthen Evidence of Cause and Effect

Using R

Exercises

6. Planned Analyses with Evidence Factors

Closed Testing with Three Factors

Confidence Intervals for Magnitudes of Effect

Evidence Factors Plus An Incompatible Comparison

Summary: Planned Analyses Can Accomplish More

Using R

Exercises

III Theory of Evidence Factors

7. Dependent P-Values

Dependent P-values Larger than Uniform

Jointly Larger Than Uniform

Creating Jointly Valid, Possibly Dependent P-values

Combining Jointly Valid, Possibly Dependent P-values

Summary: Dependent P-values Jointly Larger than Uniform

Exercises

8. Treatment Assignments as Permutations

Formalizing Intuition About Unrelated Pieces of Evidence

Individuals, Strata and Treatment Positions

Permutation Matrices

Pick Matrices

Direct Sums of Permutation Matrices

Subpick Matrices

Treatments with Doses

Permuting Strata of the Same Size

Permuting Several Permutation Matrices

Doing Several Things at Once

Summary: A Treatment Assignment is a Permutation

Complement: Split Matrices

Exercises

9. Sets of Treatment Assignments

Sets of Permutation Matrices

Products of Sets

Unique Representation As a Product of Two Factors

*Closure

Summary: Factoring Sets of Treatment Assignments

Exercises

10. Probability Distributions for Treatment Assignments

One Distribution

A Set of Distributions

*Some Technical Remarks and Definitions

Sensitivity Analysis

Summary: Probability on a Set of Treatment Assignments

Complement: Sensitivity Analysis with Doses

Exercises

11. Factors

Marginal and Conditional Distributions

Joint Distribution of Two Sensitivity Analyses

Sets of Marginal and Conditional Distributions

Ignoring a Factor

Conditioning on a Factor

Combining Two Sensitivity Analyses

Summary: Combining Two Sensitivity Analyses

Complement: More Than Two Factors

Exercises

12. *Groups of Permutation Matrices

Why Groups?

Groups

Groups in Evidence Factors: Some Examples

Group Products

Summary: Groups Provide the Needed Factors

IV Aspects of Design

13. Constructing Matched Samples with Evidence Factors

Aspects of Design

Nearly Optimal Complete Blocks

Optimal Incomplete Block Designs

Variation in Treatment Within and Between Institutions

Comparing Study Designs: Which Design is Best?

Summary: Build Evidence Factors into the Design

Using R

14. Design Elements for Evidence Factors

Some Common Design Elements

*Symmetric Sets of Biases

### Biography

**Author**

**Paul R. Rosenbaum** is the Robert G. Putzel Professor of Statistics at the Wharton School of the University of Pennsylvania. For contributions to causal inference, he received the R. A. Fisher Award in 2019 and the George W. Snedecor Award in 2003, both from the Committee of Presidents of Statistical Societies (COPSS). He delivered an IMS Medallion Lecture on the topic of this book in 2020. Dr. Rosenbaum is the author of several other books including *Observational Studies* (Springer 1995, 2002), *Design of Observational Studies* (Springer 2010, 2020), and *Observation and Experiment: An Introduction to Causal Inference* (Harvard University Press 2017).

"In summary, this book provides clear descriptions to explain analysis of evidence factors in observational studies. In addition, the author also provides the rigorous theory to support the validity of methodologies in this book. For the implementation of methodologies, the author develops a package ‘evident’, which makes readers reproduce and implement the methods easily. In general, this book is an amazing reference for those who are interested in causal inference or observational studies."

-Li-Pang Chen in *Journal of the Royal Statistical Society Series A*, March 2022

"(...) the book sets high standards for the analysis of those observational studies that fit within its purview. A wide range of examples and associated data are discussed, all with important public health implications. ... As the author suggests, practically minded readers who skip the detailed mathematics can nevertheless gain important insights by following the motivation, applications and examples. The issues raised and points made are important whenever associations found in observational data are used as a basis for claims of causation..."

-John H. Maindonald in *International Statistical Review*, March 2022

"Overall, I consider the book to be a rich resource for introducing this relatively new yet highly impactful area of research. The book is organized into four Sections. … Section II sets the tone for the rest of the book by collecting carefully chosen examples. … Chapter 4 provides five real studies to elicit aspects of evidence factors from a variety of representative examples. … The chapter is supplemented through R codes for the examples covered and hands-on exercises for the interested reader. Concepts are well elucidated through concrete running examples. … Section III of the book lends a mathematically rigorous lens to the intuitions gathered from the data analyses and numerical examples in Section II. … This logic is beautifully explained.”

-Rajarshi Mukherjee in *Biometrics*, June 2021

"This book is the first to discuss evidence factors and is a valuable contribution. Statisticians working on observational studies would find the book useful. Empirical researchers who conduct observational studies would find Chapters 1-6 useful. I would say the book serves more as a reference than a textbook although the book is as lucidly written as any good textbook…I would strongly recommend publication. The book will be of wide interest to causal inference practitioners."

-Ted Westling, University of Massachusetts, Amherst

"This book will be of wide interest to causal inference practitioners."

-Joel Greenhouse, Carnegie Mellon University

"This book not only brings much of the discussion around the topic of replicability in causal inference in one place, it does it in a way accessible to most. I would absolutely recommend this book for publication. (i) A big strength is that the book is conscious of the balance it needs to keep among motivating the concept, providing technical exposure and demonstrating the application of the method. (ii) This book is self-contained. (iii) The R codes in the footnotes and more references to specific R packages to implement the methods is a huge plus for the book. (iv) Of course, more on this topic exists that is not covered in the book. This book gives necessary references to papers for curious readers. There is a clear distinction of the focus of chapters through 6 and the latter chapters… While the earlier chapters consider Evidence Factors in Practice, the later chapters are about the Theory of Evidence Factors. This distinction is important to illustrate ideas. It is also nice that the book rounds up the discussion at the end in Chapter 13 with many practical tools…This book not only brings much of the discussion around the topic of replicability in causal inference in one place, it does it in a way accessible to most. I would absolutely recommend this book for publication."

-Bikram Karmakar, University of Florida

"Paul Rosenbaum is a gifted expositor of complex statistical concepts and methods. His books on analyzing data from observational studies are not only a pleasure to read and to learn from but are scholarly and erudite in ways that are not typical of writings in statistics…The proposed manuscript is in the same style as Rosenbaum’s earlier books and therefore promises to be popular as a reference for research workers or as a textbook for advanced undergraduate or graduate students, i.e., readers with sufficient statistical maturity. There is a lot of conceptual and technical machinery required to understand and use statistical methods for causal inference. In this book Rosenbaum is taking a step back. His goal is to explicate the informal steps that lead to a consensus about a causal relationship in practice and to provide formal methods for interrogating and weighing evidence from studies to help the scientific community reach consensus about causal relationships. This book will be a valuable addition to the causal inference literature."

-Dylan Small, University of Pennsylvania