Statistical Thinking in Epidemiology

By Yu-Kang Tu, Mark S. Gilthorpe

© 2011 – Chapman and Hall/CRC

231 pages | 52 B/W Illus.

Hardback: 9781420099911 (pub: 2011-07-27)
eBook (VitalSource): 9781420099928 (pub: 2016-04-19)

While biomedical researchers may be able to follow instructions in the manuals accompanying the statistical software packages, they do not always have sufficient knowledge to choose the appropriate statistical methods and correctly interpret their results. Statistical Thinking in Epidemiology examines common methodological and statistical problems in the use of correlation and regression in medical and epidemiological research: mathematical coupling, regression to the mean, collinearity, the reversal paradox, and statistical interaction.

Statistical Thinking in Epidemiology is about thinking statistically when looking at problems in epidemiology. The authors focus on several methods and look at them in detail: specific examples in epidemiology illustrate how different model specifications can imply different causal relationships amongst variables, and model interpretation is undertaken with appropriate consideration of the context of implicit or explicit causal relationships. This book is intended for applied statisticians and epidemiologists, but can also be very useful for clinical and applied health researchers who want to have a better understanding of statistical thinking.

Throughout the book, the statistical software packages R and Stata are used for general statistical modeling, and Amos and Mplus are used for structural equation modeling.


"… this book is enjoyable … it encourages readers to conceptualize statistical thinking in a graphically entertaining way. … one of the impressive works of the book lies in visualization of statistically important concepts.… I would recommend this book to diverse audiences. … the book provides novel insight on how one can develop the core concepts from scratch via graphical concepts, which will definitely be beneficial. Bearing in mind the geometrical concepts from this book, statistical thinking of more complicated models is readily welcomed."

Journal of Agricultural, Biological, and Environmental Statistics, Volume 20, Number 2, 2015

"There are extensive references to the literature, both in statistics and in medicine. This is a demanding text, not mathematically but for the subtlety of the issues canvassed, some of which remain controversial. Should any reader come to this text thinking that the interpretation of regression results is a simple matter, they will be quickly disabused."

International Statistical Review, 2013

"The graphical explanations proposed are quite convincing and these tools should be more exploited in statistical classes."

—Sophie Donnet, Université Paris-Dauphine, CHANCE, 25.4

Table of Contents


Uses of Statistics in Medicine and Epidemiology

Structure and Objectives of This Book

Nomenclature in This Book


Vector Geometry of Linear Models for Epidemiologists


Basic Concepts of Vector Geometry in Statistics

Correlation and Simple Regression in Vector Geometry

Linear Multiple Regression in Vector Geometry

Significance Testing of Correlation and Simple Regression in Vector Geometry

Significance Testing of Multiple Regression in Vector Geometry


Path Diagrams and Directed Acyclic Graphs


Path Diagrams

Directed Acyclic Graphs

Direct and Indirect Effects


Mathematical Coupling and Regression to the Mean in the Relation between Change and Initial Value


Historical Background

Why Should Change Not Be Regressed on Initial Value? A Review of the Problem

Proposed Solutions in the Literature

Comparison between Oldham’s Method and Blomqvist’s Formula

Oldham’s Method and Blomqvist’s Formula Answer Two Different Questions

What Is Galton’s Regression to the Mean?

Testing the Correct Null Hypothesis

Evaluation of the Categorisation Approach

Testing the Relation between Changes and Initial Values When There Are More than Two Occasions


Analysis of Change in Pre-/Post-Test Studies


Analysis of Change in Randomised Controlled Trials

Comparison of Six Methods

Analysis of Change in Non-Experimental Studies: Lord’s Paradox

ANCOVA and t-Test for Change Scores Have Different Assumptions


Collinearity and Multicollinearity

Introduction: Problems of Collinearity in Linear Regression



Mathematical Coupling and Collinearity

Vector Geometry of Collinearity

Geometrical Illustration of Principal Components Analysis as a Solution to Multicollinearity

Example: Mineral Loss in Patients Receiving Parenteral Nutrition

Solutions to Collinearity


Is ‘Reversal Paradox’ a Paradox?

A Plethora of Paradoxes: The Reversal Paradox

Background: The Foetal Origins of Adult Disease Hypothesis (Barker’s Hypothesis)

Vector Geometry of the Foetal Origins Hypothesis

Reversal Paradox and Adjustment for Current Body Size: Empirical Evidence from Meta-Analysis



Testing Statistical Interaction

Introduction: Testing Interactions in Epidemiological Research

Testing Statistical Interaction between Categorical Variables

Testing Statistical Interaction between Continuous Variables

Partial Regression Coefficient for Product Term in Regression Models

Categorization of Continuous Explanatory Variables

The Four-Model Principle in the Foetal Origins Hypothesis

Categorization of Continuous Covariates and Testing Interaction



Finding Growth Trajectories in Lifecourse Research


Current Approaches to Identifying Postnatal Growth Trajectories in Lifecourse Research


Partial Least Squares Regression for Lifecourse Research



OLS Regression

PLS Regression



Concluding Remarks



About the Authors

Dr Yu-Kang Tu is a Senior Clinical Research Fellow in the Division of Biostatistics, School of Medicine, and in the Leeds Dental Institute, University of Leeds, Leeds, UK. He was a visiting Associate Professor at the National Taiwan University, Taipei, Taiwan. First trained as a dentist and then as an epidemiologist, he has published extensively in dental, medical, epidemiological and statistical journals. He is interested in developing statistical methodologies to solve statistical and methodological problems such as mathematical coupling, regression to the mean, collinearity and the reversal paradox. His current research focuses on applying latent variable methods, e.g. structural equation modelling and latent growth curve modelling, and on lifecourse epidemiology. More recently, he has been working on applying partial least squares regression to epidemiological data.

Prof Mark S Gilthorpe is Professor of Statistical Epidemiology, Division of Biostatistics, School of Medicine, University of Leeds, Leeds, UK. Having completed a single honours degree in Mathematical Physics (University of Nottingham), he undertook a PhD in Mathematical Modelling (University of Aston in Birmingham), before initially embarking upon a career as a self-employed Systems and Data Analyst and Computer Programmer, and eventually becoming an academic in biomedicine. His academic posts include systems and data analyst of UK regional routine hospital data in the Department of Public Health and Epidemiology, University of Birmingham; Head of Biostatistics at the Eastman Dental Institute, University College London; and founder and Head of the Division of Biostatistics, School of Medicine, University of Leeds. His research has consistently focused on the development and promotion of robust and sophisticated modelling methodologies for non-experimental (and sometimes large and complex) observational data within biomedicine, leading to extensive publications in dental, medical, epidemiological and statistical journals.

Subject Categories

BISAC Subject Codes/Headings:
MATHEMATICS / Probability & Statistics / General
MEDICAL / Epidemiology