1st Edition

Analysis of Incidence Rates

ISBN 9780367152062
Published April 15, 2019 by Chapman and Hall/CRC
474 Pages 63 B/W Illustrations

USD $149.95



Book Description

Incidence rates are counts divided by person-time; mortality rates are a well-known example. Analysis of Incidence Rates offers a detailed discussion of the practical aspects of analyzing incidence rates. Important pitfalls and areas of controversy are discussed. The text is aimed at graduate students, researchers, and analysts in the disciplines of epidemiology, biostatistics, social sciences, economics, and psychology.
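As a rough illustration of the definition above (a count divided by person-time), the following Python sketch computes two incidence rates and their ratio with a large-sample confidence interval on the log scale. The numbers and the function names `incidence_rate` and `rate_ratio_ci` are hypothetical and are not taken from the book, whose own examples are in Stata.

```python
import math


def incidence_rate(events, person_time):
    """Return the incidence rate: event count divided by person-time."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return events / person_time


def rate_ratio_ci(events_1, pt_1, events_0, pt_0, z=1.96):
    """Rate ratio with a large-sample (Wald) interval computed on the ln scale.

    Assumes both counts are positive and roughly Poisson distributed.
    """
    ratio = incidence_rate(events_1, pt_1) / incidence_rate(events_0, pt_0)
    # Standard error of ln(rate ratio) for two independent Poisson counts.
    se_log = math.sqrt(1 / events_1 + 1 / events_0)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper


# Hypothetical data: 30 deaths in 10,000 person-years among the exposed,
# 15 deaths in 10,000 person-years among the unexposed.
rate_exposed = incidence_rate(30, 10_000)
rate_unexposed = incidence_rate(15, 10_000)
ratio, lower, upper = rate_ratio_ci(30, 10_000, 15, 10_000)

print(f"Rates per 1,000 person-years: {rate_exposed * 1000:.1f} vs {rate_unexposed * 1000:.1f}")
print(f"Rate ratio {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The interval is built on the ln scale because the sampling distribution of a ratio is skewed; this is one common large-sample choice, not necessarily the approach the book recommends for any particular data set.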


  • Compares and contrasts incidence rates with risks, odds, and hazards.
  • Shows stratified methods, including standardization, inverse-variance weighting, and Mantel-Haenszel methods.
  • Describes Poisson regression methods for adjusted rate ratios and rate differences.
  • Examines linear regression for rate differences with an emphasis on common problems.
  • Gives methods for correcting confidence intervals.
  • Illustrates problems related to collapsibility.
  • Explores extensions of count models for rates, including negative binomial regression, methods for clustered data, and the analysis of longitudinal data, and reviews their controversies and limitations.
  • Presents matched cohort methods in detail.
  • Gives marginal methods for converting adjusted rate ratios to rate differences, and vice versa.
  • Demonstrates instrumental variable methods.
  • Compares Poisson regression with the Cox proportional hazards model and introduces Royston-Parmar models.
  • Provides all data and analyses in online Stata files that readers can download.

Peter Cummings is Professor Emeritus, Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA. His research was primarily in the field of injuries. He used matched cohort methods to estimate how the use of seat belts and the presence of airbags were related to death in a traffic crash. He is the author or co-author of over 100 peer-reviewed articles.

Table of Contents




1. Do Storks Bring Babies?
Karl Pearson and spurious correlation
Jerzy Neyman, storks, and babies
Is Poisson regression the solution to the stork problem?
Further reading

2. Risks and Rates
What is a rate?
Closed and open populations
Measures of time
Numerators for rates: counts
Numerators that may be mistaken for counts
Prevalence proportions
Denominators for rates: count denominators for incidence proportions (risks)
Denominators for rates: person-time for incidence rates
Rate numerators and denominators for recurrent events
Rate denominators other than person-time
Different incidence rates tell different stories
Potential advantages of incidence rates compared with incidence proportions (risks)
Potential advantages of incidence proportions (risks) compared with incidence rates
Limitations of risks and rates
Radioactive decay: an example of exponential decline
The relevance of exponential decay to human populations
Relationships between rates, risks, and hazards
Further reading

3. Rate Ratios and Differences
Estimated associations and causal effects
Sources of bias in estimates of causal effect
Estimation versus prediction
Ratios and differences for risks and rates
Relationships between measures of association in a closed population
The hypothetical TEXCO study
Breaking the rules: Army data for Companies A and B
Relationships between odds ratios, risk ratios, and rate ratios in case-control studies
Symmetry of measures of association
Convergence problems for estimating associations
Some history regarding the choice between ratios and differences
Other influences on the choice between use of ratios or differences
The data may sometimes be used to choose between a ratio or a difference

4. The Poisson Distribution
Alpha particle radiation
The Poisson distribution
Prussian soldiers kicked to death by horses
Variances, standard deviations, and standard errors for counts and rates
An example: mortality from Alzheimer’s disease
Large sample P-values for counts, rates, and their differences using the Wald statistic
Comparisons of rates as differences versus ratios
Large sample P-values for counts, rates, and their differences using the score statistic
Large sample confidence intervals for counts, rates, and their differences
Large sample P-values for counts, rates, and their ratios
Large sample confidence intervals for ratios of counts and rates
A constant rate based on more person-time is more precise
Exact methods
What is a Poisson process?
Simulated examples
What if the data are not from a Poisson process? Part 1, overdispersion
What if the data are not from a Poisson process? Part 2, underdispersion
Must anything be rare?
Bicyclist deaths in 2010 and 2011

5. Criticism of Incidence Rates
Florence Nightingale, William Farr, and hospital mortality rates: debate in 1864
Florence Nightingale, William Farr, and hospital mortality rates: debate in 1996-97
Criticism of rates in the British Medical Journal in 1995
Criticism of incidence rates in 2009

6. Stratified Analysis: Standardized Rates
Why standardize?
External weights from a standard population: direct standardization
Comparing directly standardized rates
Choice of the standard influences the comparison of standardized rates
Standardized comparisons versus adjusted comparisons from variance-minimizing methods
Stratified analyses
Variations on directly standardized rates
Internal weights from a population: indirect standardization
The standardized mortality ratio (SMR)
Advantages of SMRs compared with SRRs (ratios of directly standardized rates)
Disadvantages of SMRs compared with SRRs (ratios of directly standardized rates)
The terminology of direct and indirect standardization
P-values for directly standardized rates
Confidence intervals for directly standardized rates
P-values and confidence intervals for SRRs (ratios of directly standardized rates)
Large sample P-values and confidence intervals for SMRs
Small sample P-values and confidence intervals for SMRs
Standardized rates should not be used as regression outcomes
Standardization is not always the best choice

7. Stratified Analysis: Inverse-variance and Mantel-Haenszel Methods
Inverse-variance methods
Inverse-variance analysis of rate ratios
Inverse-variance analysis of rate differences
Choosing between rate ratios and differences
Mantel-Haenszel methods
Mantel-Haenszel analysis of rate ratios
Mantel-Haenszel analysis of rate differences
P-values for stratified rate ratios or differences
Analysis of sparse data
Maximum-likelihood stratified methods
Stratified methods versus regression

8. Collapsibility and Confounding
What is collapsibility?
The British X-Trial: introducing variation in risk
Rate ratios and differences are noncollapsible because exposure influences person-time
Which estimate of the rate ratio should we prefer?
Behavior of risk ratios and differences
Hazard ratios and odds ratios
Comparing risks with other outcome measures
The Italian X-Trial: -levels of risk under no exposure
The American X-Cohort study: -levels of risk in a cohort study
The Swedish X-Cohort study: a collapsible risk ratio in confounded data
A summary of findings
A different view of collapsibility
Practical implications: avoid common outcomes
Practical implications: use risks or survival functions
Practical implications: case-control studies
Practical implications: uniform risk
Practical implications: use all events

9. Poisson Regression for Rate Ratios
The Poisson regression model for rate ratios
A short comparison with ordinary linear regression
A Poisson model without variables
A Poisson regression model with one explanatory variable
The iteration log
The header information above the table of estimates
Using a generalized linear model to estimate rate ratios
An alternative parameterization for Poisson models: a regression trick
Further comments about person-time
A short summary

10. Poisson Regression for Rate Differences
A regression model for rate differences
Florida and Alaska cancer mortality: regression models that fail
Florida and Alaska cancer mortality: regression models that succeed
A generalized linear model with a power link
A caution

11. Linear Regression
Limitations of ordinary least squares linear regression
Florida and Alaska cancer mortality rates
Weighted least squares linear regression
Importance weights for weighted least squares linear regression
Comparison of Poisson, weighted least squares, and ordinary least squares regression
Exposure to a carcinogen: ordinary linear regression ignores the precision of each rate
Differences in homicide rates: simple averages versus population-weighted averages
The place of ordinary least squares linear regression for the analysis of incidence rates
Variance weighted least squares regression
Cautions regarding inverse-variance weights
Why use variance weighted least squares?
A short comparison of weighted Poisson regression, variance weighted least squares, and weighted linear regression
Problems when age-standardized rates are used as outcomes
Ratios and spurious correlation
Linear regression with ln(rate) as the outcome
Predicting negative rates

12. Model Fit
Tabular and graphic displays
Goodness of fit tests: deviance and Pearson statistics
A conditional moment chi-squared test of fit
Limitations of goodness-of-fit statistics
Measures of dispersion
Robust variance estimator as a test of fit
Comparing models using the deviance
Comparing models using Akaike and Bayesian information criterion
Example 1: using Stata’s generalized linear model command to decide between a rate ratio or a rate difference model for the randomized controlled trial of exercise and falls
Example 2: a rate ratio or a rate difference model for hypothetical data regarding the association between fall rates and age
A test of the model link
Residuals, influence analysis, and other measures
Adding model terms to improve fit
A caution

13. Adjusting Standard Errors and Confidence Intervals
Estimating the variance without regression
Poisson regression
Rescaling the variance using the Pearson dispersion statistic
Robust variance
Generalized Estimating Equations
Using the robust variance to study length of hospital stay
Computer intensive methods
The bootstrap idea
The bootstrap normal method
The bootstrap percentile method
The bootstrap bias-corrected percentile method
The bootstrap bias-corrected and accelerated method
The bootstrap-t method
Which bootstrap CI is best?
Permutation and Randomization
Randomization to nearly equal groups
Better randomization using the randomized block design of the original study
A summary

14. Storks and Babies, Revisited
Neyman’s approach to his data
Using methods for incidence rates
A model that uses the stork/women ratio

15. Flexible Treatment of Continuous Variables
The problem
Quadratic splines
Fractional polynomials
Flexible adjustment for time
Which method is best?

16. Judging Variation in Size of an Association
An example: shoes and falls
Problem 1: Using subgroup P-values for interpretation
Problem 2: Failure to include main effect terms when interaction terms are used
Problem 3: Incorrectly concluding that there is no variation in association
Problem 4: Interaction may be present on a ratio scale but not on a difference scale, and vice versa
Problem 5: Failure to report all subgroup estimates in an evenhanded manner

17. Negative Binomial Regression
Negative binomial regression is a random effects or mixed model
An example: accidents among workers in a munitions factory
Introducing equal person-time in the homicide data
Letting person-time vary in the homicide data
Estimating a rate ratio for the homicide data
Another example using hypothetical data for five regions
Unobserved heterogeneity
Observing heterogeneity in the shoe data
A rate difference negative binomial regression model

18. Clustered Data
Data from fictitious nursing homes
Results from , data simulations for the nursing homes
A single random set of data for the nursing homes
Variance adjustment methods
Generalized estimating equations (GEE)
Mixed model methods
What do mixed models estimate?
Mixed model estimates for the nursing home intervention
Simulation results for some mixed models
Mixed models weight observations differently than Poisson regression
Which should we prefer for clustered data, variance-adjusted or mixed models?
Additional model commands for clustered data
Further reading

19. Longitudinal Data
Just use rates
Using rates to evaluate governmental policies
Study designs for governmental policies
A fictitious water treatment and US mortality 1999-2013
Poisson regression
Population-averaged estimates (GEE)
Conditional Poisson regression, a fixed-effects approach
Negative binomial regression
Which method is best?
Water treatment in only 10 states
Conditional Poisson regression for the 10-state water-treatment data
A published study

20. Matched Data
Matching in case-control studies
Matching in randomized controlled trials
Matching in cohort studies
Matching to control confounding in some randomized trials and cohort studies
A benefit of matching; only matched sets with at least one outcome are needed
Study designs that match a person to themselves
A matched analysis can account for matching ratios that are not constant
Choosing between risks and rates for the crash data in Tables 20.1 and 20.2
Stratified methods for estimating risk ratios for matched data
Odds ratios, risk ratios, cell A, and matched data
Regression analysis of matched data for the odds ratio
Regression analysis of matched data for the risk ratio
Matched analysis of rates with one outcome event
Matched analysis of rates for recurrent events
The randomized trial of exercise and falls; some problems revealed
Final words

21. Marginal Methods
What are margins?
Converting logistic regression results into risk ratios or risk differences: marginal standardization
Estimating a rate difference from a rate ratio model
Death by age and sex: a short example
Skunk bite data: a long example
Obtaining the rate difference: crude rates
Using the robust variance
Adjusting for age
Full adjustment for age and sex
Marginal commands for interactions
Marginal methods for a continuous variable
Using a rate difference model to estimate a rate ratio: use the ln scale

22. Bayesian Methods
Cancer mortality rate in Alaska
The rate ratio for falling in a trial of exercise

23. Exact Poisson Regression
A simple example
A perfectly predicted outcome
Memory problems
A caveat

24. Instrumental Variables
The problem: what does a randomized controlled trial estimate?
Analysis by treatment received may yield biased estimates of treatment effect
Using an instrumental variable
Two-stage linear regression for instrumental variables
Generalized method of moments
Generalized method of moments for rates
What does an instrumental variable analysis estimate?
There is no free lunch
Final comments

25. Hazards
Data for a hypothetical treatment with exponential survival times
Poisson regression and exponential proportional hazards regression
Poisson and Cox proportional hazards regression
Hypothetical data for a rate that changes over time
A piecewise Poisson model
A more flexible Poisson model: quadratic splines
Another flexible Poisson model: restricted cubic splines
Flexibility with fractional polynomials
When should a Poisson model be used? Randomized trial of a terrible treatment
A real randomized trial, the PLCO screening trial
What if events are common?
Cox model or a flexible parametric model?
Collapsibility and survival functions
Relaxing the assumption of proportional hazards in the Cox model
Relaxing the assumption of proportional hazards for the Poisson model
Relaxing proportional hazards for the Royston-Parmar model
The life expectancy difference or ratio
Recurrent or multiple events
A short summary
