1st Edition

Regression Models as a Tool in Medical Research

ISBN 9781466517486
Published November 27, 2012 by Chapman and Hall/CRC
496 Pages, 158 B/W Illustrations

USD $115.00

Book Description

While regression models have become standard tools in medical research, understanding how to properly apply the models and interpret the results is often challenging for beginners. Regression Models as a Tool in Medical Research presents the fundamental concepts and important aspects of regression models most commonly used in medical research, including the classical regression model for continuous outcomes, the logistic regression model for binary outcomes, and the Cox proportional hazards model for survival data. The text emphasizes adequate use, correct interpretation of results, appropriate presentation of results, and avoidance of potential pitfalls.

After reviewing popular models and basic methods, the book focuses on advanced topics and techniques. It considers the comparison of regression coefficients, the selection of covariates, the modeling of nonlinear and nonadditive effects, and the analysis of clustered and longitudinal data, highlighting the impact of selection mechanisms, measurement error, and incomplete covariate data. The text then covers the use of regression models to construct risk scores and predictors. It also gives an overview of more specific regression models and their applications as well as alternatives to regression modeling. The mathematical details underlying the estimation and inference techniques are provided in the appendices.

Table of Contents

Why Use Regression Models?
Why use simple regression models?
Why use multiple regression models?
Some basic notation

An Introductory Example
A single line model
Fitting a single line model
Taking uncertainty into account
A two lines model
How to perform these steps with Stata
Exercise 5-HIAA and serotonin
Exercise Haemoglobin
Exercise Scaling of variables

The Classical Multiple Regression Model

Adjusted Effects
Adjusting for confounding
Adjusting for imbalances
Exercise Physical activity in school children

Inference for the Classical Multiple Regression Model
The traditional and the modern way of inference
How to perform the modern way of inference with Stata
How valid and good are least squares estimates?
A note on the use and interpretation of p-values in regression analyses

Logistic Regression
The definition of the logistic regression model
Analyzing a dose response experiment by logistic regression
How to fit a dose response model with Stata
Estimating odds ratios and adjusted odds ratios using logistic regression
How to compute (adjusted) odds ratios using logistic regression in Stata
Exercise Allergy in children
More on logit scale and odds scale

Inference for the Logistic Regression Model
The maximum likelihood principle
Properties of the ML estimates for logistic regression
Inference for a single regression parameter
How to perform Wald tests and likelihood ratio tests in Stata

Categorical Covariates
Incorporating categorical covariates in a regression model
Some technicalities in using categorical covariates
Testing the effect of a categorical covariate
The handling of categorical covariates in Stata
Presenting results of a regression analysis involving categorical covariates in a table
Exercise Physical occupation and back pain
Exercise Odds ratios and categorical covariates

Handling Ordered Categories: A First Lesson in Regression Modeling Strategies

The Cox Proportional Hazards Model
Modeling the risk of dying
Modeling the risk of dying in continuous time
Using the Cox proportional hazards model to quantify the difference in survival between groups
How to fit a Cox proportional hazards model with Stata
Exercise Prognostic factors in breast cancer patients – Part 1

Common Pitfalls in Using Regression Models
Association vs. causation
Difference between subjects vs. difference within subjects
Real world models vs. statistical models
Relevance vs. significance
Exercise Prognostic factors in breast cancer patients – Part 2

Some Useful Technicalities
Illustrating models by using model based predictions
How to work with predictions in Stata
Residuals and the standard deviation of the error term
Working with residuals and the RMSE in Stata
Linear and nonlinear functions of regression parameters
Transformations of regression parameters
Centering of covariate values
Exercise Paternal smoking vs. maternal smoking

Comparing Regression Coefficients
Comparing regression coefficients among continuous covariates
Comparing regression coefficients among binary covariates
Measuring the impact of changing covariate values
Translating regression coefficients
How to compare regression coefficients in Stata
Exercise Health in young people

Power and Sample Size
The power of a regression analysis
Determinants of power in regression models with a single covariate
Determinants of power in regression models with several covariates
Power and sample size calculations when a sample from the covariate distribution is given
Power and sample size calculations given a sample from the covariate distribution with Stata
The choice of the values of the regression parameters in a simulation study
Simulating a covariate distribution
Simulating a covariate distribution with Stata
Choosing the parameters to simulate a covariate distribution
Necessary sample sizes to justify asymptotic methods
Exercise Power considerations for a study on neck pain
Exercise Choosing between two outcomes

The Selection of the Sample
Selection in dependence on the covariates
Selection in dependence on the outcome
Sampling in dependence on covariate values

The Selection of Covariates
Fitting regression models with correlated covariates
The "Adjustment vs. power" dilemma
The "Adjustment makes effects small" dilemma
Adjusting for mediators
Adjusting for confounding - A useful academic game
Adjusting for correlated confounders
Including predictive covariates
Automatic variable selection
How to choose relevant sets of covariates
Preparing the selection of covariates: Analyzing the association among covariates
Preparing the selection of covariates: Univariate analyses?
Exercise Vocabulary size in young children – Part 1
Preprocessing of the covariate space
How to preprocess the covariate space with Stata
Exercise Vocabulary size in young children – Part 2
What is a confounder?

Modeling Nonlinear Effects
Quadratic regression
Polynomial regression
Fractional Polynomials
Gain in power by modeling nonlinear effects?
Demonstrating the effect of a covariate
Demonstrating a nonlinear effect
Describing the shape of a nonlinear effect
Detecting nonlinearity by analysis of residuals
Judging of nonlinearity may require adjustment
How to model nonlinear effects in Stata
The impact of ignoring nonlinearity
Modeling the nonlinear effect of confounders
Nonlinear models
Exercise Serum markers for AMI

Transformation of Covariates
Transformations to obtain a linear relationship
Transformation of skewed covariates
To categorize or not to categorize

Effect Modification and Interactions
Modeling effect modification
Adjusted effect modifications
Modeling effect modifications in several covariates
The effect of a covariate in the presence of interactions
Interactions as deviations from additivity
Scales and interactions
Ceiling effects and interactions
Hunting for interactions
How to analyze effect modification and interactions with Stata
Exercise Treatment interactions in a randomized clinical trial for the treatment of malignant glioma

Applying Regression Models to Clustered Data
Why clustered data can invalidate inference
Robust standard errors
Improving the efficiency
Within and between cluster effects
Some unusual but useful usages of robust standard errors in clustered data
How to take clustering into account in Stata

Applying Regression Models to Longitudinal Data
Analyzing time trends in the outcome
Analyzing time trends in the effect of covariates
Analyzing the effect of covariates
Analyzing individual variation in time trends
Analyzing summary measures
Analyzing the effect of change
How to perform regression modeling of longitudinal data in Stata
Exercise Increase of body fat in adolescents

The Impact of Measurement Error
The impact of systematic and random measurement error
The impact of misclassification
The impact of measurement error in confounders
The impact of differential misclassification and measurement error
Studying the measurement error
Exercise Measurement error and interactions

The Impact of Incomplete Covariate Data
Missing value mechanisms
Properties of a complete case analysis
Bias due to using ad hoc methods
Advanced techniques to handle incomplete covariate data
Handling of partially defined covariates

Risk Scores
What is a risk score?
Judging the usefulness of a risk score
The precision of risk score values
The overall precision of a risk score
Using Stata’s predict command to compute risk scores
Categorization of risk scores
Exercise Computing risk scores for breast cancer patients

Construction of Predictors
From risk scores to predictors
Predictions and prediction intervals for a continuous outcome
Predictions for a binary outcome
Construction of predictions for time to event data
How to construct predictions with Stata
The overall precision of a predictor

Evaluating the Predictive Performance
The predictive performance of an existing predictor
How to assess the predictive performance of an existing predictor in Stata
Estimating the predictive performance of a new predictor
How to assess the predictive performance via cross validation in Stata
Exercise Assessing the predictive performance of a prognostic score in breast cancer patients

Outlook: Construction of Parsimonious Predictors

Alternatives to Regression Modeling
Measures of association: Correlation coefficients
Measures of association: The odds ratio
Propensity scores
Classification and regression trees

Specific Regression Models
Probit regression for binary outcomes
Generalized linear models
Regression models for count data
Regression models for ordinal outcome data
Quantile regression and robust regression
ANOVA and regression

Specific Usages of Regression Models
Logistic regression for the analysis of case control studies
Logistic regression for the analysis of matched case control studies
Adjusting for baseline values in randomized clinical trials
Assessing predictive factors
Incorporating time varying covariates in a Cox model
Time dependent effects in a Cox model
Using the Cox model in the presence of competing risks
Using the Cox model to analyze multi state models

What Is a Good Model?
Does the model fit the data?
How good are predictions?
Explained variation
Goodness of fit
Model stability
The usefulness of a model

Final Remarks on the Role of Prespecified Models and Model Development

Mathematics behind the Classical Linear Regression Model
Computing regression parameters in simple linear regression
Computing regression parameters in the classical multiple regression model
Estimation of the standard error
Construction of confidence intervals and p-values

Mathematics behind the Logistic Regression Model
The least squares principle as a maximum likelihood principle
Maximizing the likelihood of a logistic regression model
Estimating the standard error of the ML estimates
Testing composite hypotheses

The Modern Way of Inference
Robust estimation of standard errors
Robust estimation of standard errors in the presence of clustering

Mathematics for Risk Scores and Predictors
Computing individual survival probabilities after fitting a Cox model
Standard errors for risk scores
The delta rule




Werner Vach is a professor of medical informatics and clinical epidemiology at the University of Freiburg. Dr. Vach has co-authored more than 150 publications in medical journals. His research encompasses biostatistics methodology in the areas of incomplete covariate data, prognostic studies, diagnostic studies, and agreement studies.


"The book can be recommended as a useful overview of practical aspects of regression modeling, very suitable for medical researchers who want to apply statistical methods or do apply them already now. It is also very suitable for students of statistics and their teachers."
ISCB News, 59, June 2015

"With its focus on conceptual understanding and practical applications, this book is highly recommended to medical and other health science researchers who desire to improve their understanding of regression analysis for a better understanding of medical literature, for the adequate presentation of their own regression outcomes, or for improved interpretation of their results for publications and presentations. … Additionally, this book can serve as supplemental reading for an applied graduate level course on general regression models."
—Journal of Agricultural, Biological, and Environmental Statistics

"The book can be a very helpful contribution especially for researchers in medical sciences when performing their statistical analyses and trying to interpret the results obtained. … This book provides plenty of practical knowledge about these basic models and also some of their extensions that is often not easy to find from statistical textbooks or from software manuals. The basic methods are well explained and illustrated by numerous practical examples, mainly using simulated datasets."
—Tapio Nummi, International Statistical Review