1st Edition

Medical Risk Prediction Models With Ties to Machine Learning

By Thomas A. Gerds, Michael W. Kattan

Copyright 2022

312 Pages

by Chapman & Hall

Also available as eBook on:

Taylor & Francis eBooks (Institutional Purchase)

Description

Medical Risk Prediction Models: With Ties to Machine Learning is a hands-on book for clinicians, epidemiologists, and professional statisticians who need to make or evaluate a statistical prediction model based on data. The subject of the book is the patient’s individualized probability of a medical event within a given time horizon. Gerds and Kattan describe the mathematical details of making...

Table of Contents

Software

Why should I care about statistical prediction models?

The many uses of prediction models in medicine

The unique messages of this book

Prognostic factor modeling philosophy

The rest of this book

I am going to make a prediction model. What do I need to know?

Prediction model framework

Target population

The time origin

The event of interest

The prediction time horizon and follow-up

Landmarking

Risks and risk predictions

Classification of risk

Predictor variables

Checklist

Prediction performance

Proper scoring rules

Calibration

Discrimination

Explained variation

Variability and uncertainty

The interpretation is relative

Utility

Average versus subgroups

Study design

Study design and sources of information

Cohort

Multi-center study

Randomized clinical trial

Case-control

Given treatment and treatment options

Sample size calculation

Data

Purpose dataset

Data dictionary

Measurement error

Missing values

Censored data

Competing risks

Modeling

Risk prediction model

Risk classifier

How is prediction modeling different from statistical inference?

Regression model

Linear predictor

Expert selects the candidate predictors

How to select variables for inclusion in the final model

All possible interactions

Checklist

Machine learning

Validation

The conventional model

Internal and external validation

Conditional versus expected performance

Cross-validation

Data splitting

Bootstrap

Model checking and goodness of fit

Reproducibility

Pitfalls

Age as time scale

Odds ratios and hazard ratios are not predictions of risks

Do not blame the metric

Censored data versus competing risks

Disease-specific survival

Overfitting

Data-dependent decisions

Balancing data

Independent predictor

Automated variable selection

How should I prepare for modeling?

Definition of subjects

Choice of time scale

Pre-selection of predictor variables

Preparation of predictor variables

Categorical variables

Continuous variables

Derived predictor variables

Repeated measurements

Measurement error

Missing values

Preparation of event time outcome

Illustration without competing risks

Illustration with competing risks

Artificial censoring at the prediction time horizon

I am ready to build a prediction model

Specifying the model type

Uncensored binary outcome

Right-censored time-to-event outcome (no competing risks)

Right-censored time-to-event outcome with competing risks

Benchmark model

Uncensored binary outcome

Right-censored time-to-event outcome (without competing risks)

Right-censored time-to-event with competing risks

Including predictor variables

Categorical predictor variables

Continuous predictor variables

Interaction effects

Modeling strategy

Variable selection

Conventional model strategy

Whether to use a standard regression model or something else

Advanced topics

How to prevent overfitting the data

How to deal with missing values

How to deal with non-converging models

What you should put in your manuscript

Baseline tables

Follow-up tables

Regression tables

Risk plots

Nomograms

Deployment

Risk charts

Internet calculator

Cost-benefit analysis (waiting lists)

Does my model predict accurately?

Model assessment roadmap

Visualization of the predictions

Calculation of model performance

Visualization of model performance

Uncensored binary outcome

Distribution of the predicted risks

Brier score

AUC

Calibration curves

Right-censored time-to-event outcome (without competing risks)

Distribution of the predicted risks

Brier score with censored data

Time-dependent AUC for censored data

Calibration curve for censored data

Competing risks

Distribution of the predicted risks

Brier score with competing risks

Time-dependent AUC for competing risks

Calibration curve for competing risks

The Index of Prediction Accuracy (IPA)

Choice of prediction time horizon

Time-dependent prediction performance

How do I decide between rival models?

Model comparison roadmap

Analysis of rival prediction models

Uncensored binary outcome

Right-censored time-to-event outcome (without competing risks)

Competing risks

Clinically relevant change of prediction

Does a new marker improve prediction?

Many new predictors

Updating a subject's prediction

What would make me an expert?

Multiple cohorts / Multi-center studies

The role of treatment for making a prediction model

Modeling treatment

Comparative effectiveness tables

Learning curve paradigm

Internal validation (data splitting)

Single split

Calendar split

Multiple splits (cross-validation)

Dilemma of internal validation

The apparent and the .632+ estimator

Tips and tricks

Missing values

Missing values in the learning data

Missing values in the validation data

Time-varying coefficient models

Time-varying predictor variables

Can't the computer just take care of all of this?

Zero layers of cross-validation

What may happen if you do not look at the data

Unsupervised modeling steps

Final model

One layer of cross-validation

Penalized regression

Supervised spline selection

Machine learning (two levels of cross-validation)

Random forest

Deep learning and artificial neural networks

The super learner

Things you might have expected in our book

Threshold selection for decision making

Number of events per variable

Confidence intervals for predicted probabilities

Models developed from case-control data

Hosmer-Lemeshow test

Backward elimination and stepwise selection

Rank correlation (c-index) for survival outcome

Integrated Brier score

Net reclassification index and the integrated discrimination improvement

Re-classification tables

Boxplots of rival models conditional on the outcome

Author(s)

Biography

Thomas A. Gerds is a professor at the biostatistics unit at the University of Copenhagen. He is affiliated with the Danish Heart Foundation. He is the author of several R packages on CRAN and has taught statistics courses to non-statisticians for many years.

Michael Kattan is a highly cited author and Chair of the Department of Quantitative Health Sciences at Cleveland Clinic. He is a Fellow of the American Statistical Association and has received two awards from the Society for Medical Decision Making: the Eugene L. Saenger Award for Distinguished Service, and the John M. Eisenberg Award for Practical Application of Medical Decision Making Research.

Critics' Reviews

"Two of the top researchers in the field of clinical prediction models have produced a highly innovative book that brings a very technical topic to public grasp by throwing out the formulas and just talking straight from the heart of practical experience. While clinicians and medical residents can now learn how to build, diagnose and validate risk models themselves, all public health researchers, old and new, will reap the benefits and enjoyment from reading this book."
~Donna Ankerst, Technical University of Munich
