
An Introduction to Survival Analysis Using Stata, Revised Third Edition

ISBN 9781597181747
Published May 10, 2016 by Stata Press
428 Pages

USD $89.95



Book Description

An Introduction to Survival Analysis Using Stata, Revised Third Edition is the ideal tutorial for professional data analysts who want to learn survival analysis for the first time or who are well versed in survival analysis but are not as dexterous in using Stata to analyze survival data. This text also serves as a valuable reference to those readers who already have experience using Stata’s survival analysis routines.

The revised third edition has been updated for Stata 14, and it includes a new section on predictive margins and marginal effects, which demonstrates how to obtain and visualize marginal predictions and marginal effects using the margins and marginsplot commands after survival regression models.
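
As a rough illustration of that kind of workflow, the do-file sketch below fits a parametric survival model and then requests predictive margins. The drugtr example dataset (variables studytime, died, drug, age), the Weibull distribution, and the choice of covariates are assumptions made here for illustration; they are not taken from the book.

```stata
* Illustrative sketch only: dataset, model, and covariates are assumptions
webuse drugtr, clear
stset studytime, failure(died)            // declare survival-time data
streg i.drug age, distribution(weibull)   // parametric survival regression
margins drug, predict(median time)        // predicted median survival time by treatment group
marginsplot                               // graph the margins just computed
```

Here margins averages the predicted median survival time over the sample for each level of drug, and marginsplot graphs those results.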

Survival analysis is a field of its own that requires specialized data management and analysis procedures. To meet this requirement, Stata provides the st family of commands for organizing and summarizing survival data.
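
A minimal do-file sketch of what working with the st commands can look like is given below. It assumes the drugtr example dataset (variables studytime, died, and drug), which is an illustrative choice and not necessarily the data used in the book.

```stata
* Illustrative sketch only: the dataset is an assumption
webuse drugtr, clear
stset studytime, failure(died)   // declare analysis time and the failure indicator
stdescribe                       // describe the structure of the survival data
stsum                            // summarize time at risk, incidence rate, survival-time percentiles
sts list                         // list the Kaplan-Meier survivor function
sts graph, by(drug)              // plot survivor curves by treatment group
```

Once stset has been run, the other st commands pick up the time and failure variables automatically, so they do not need to be specified again.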

This book provides statistical theory, step-by-step procedures for analyzing survival data, an in-depth usage guide for Stata's most widely used st commands, and a collection of tips for using Stata to analyze survival data and to present the results. This book develops from first principles the statistical concepts unique to survival data and assumes only a knowledge of basic probability and statistics and a working knowledge of Stata.

The first three chapters of the text cover basic theoretical concepts: hazard functions, cumulative hazard functions, and their interpretations; survivor functions; hazard models; and a comparison of nonparametric, semiparametric, and parametric methodologies. Chapter 4 deals with censoring and truncation. The next three chapters cover the formatting, manipulation, stsetting, and error checking involved in preparing survival data for analysis using Stata's st analysis commands. Chapter 8 covers nonparametric methods, including the Kaplan–Meier and Nelson–Aalen estimators and the various nonparametric tests for the equality of survival experience.

Chapters 9–11 discuss Cox regression and include various examples of fitting a Cox model, obtaining predictions, interpreting results, building models, model diagnostics, and regression with survey data. The next four chapters cover parametric models, which are fit using Stata's streg command. These chapters include detailed derivations of all six parametric models currently supported in Stata and methods for determining which model is appropriate, as well as information on stratification, obtaining predictions, and advanced topics such as frailty models. Chapter 16 is devoted to power and sample-size calculations for survival studies. The final chapter covers survival analysis in the presence of competing risks.

Table of Contents

The problem of survival analysis

Parametric modeling 
Semiparametric modeling
Nonparametric analysis 
Linking the three approaches

Describing the distribution of failure times

The survivor and hazard functions
The quantile function
Interpreting the cumulative hazard and hazard rate

Means and medians

Hazard models

Parametric models
Semiparametric models
Analysis time (time at risk)

Censoring and truncation



Recording survival data

The desired format 
Other formats
Example: Wide-form snapshot data

Using stset

A short lesson on dates
Purposes of the stset command
Syntax of the stset command

After stset

Look at stset’s output
List some of your data 
Use stdescribe
Use stvary 
Perhaps use stfill 
Example: Hip-fracture data

Nonparametric analysis

Inadequacies of standard univariate methods 
The Kaplan–Meier estimator

The Nelson–Aalen estimator
Estimating the hazard function
Estimating mean and median survival times
Tests of hypothesis

The Cox proportional hazards model

Using stcox

Likelihood calculations

Stratified analysis

Cox models with shared frailty

Cox models with survey data

Cox model with missing data—multiple imputation

Model building using stcox

Indicator variables
Categorical variables
Continuous variables

Time-varying variables

Modeling group effects: fixed-effects, random-effects, stratification, and clustering

The Cox model: Diagnostics

Testing the proportional-hazards assumption

Residuals and diagnostic measures
Reye’s syndrome data

Parametric models

Classes of parametric models

A survey of parametric regression models in Stata

The exponential model

Weibull regression

Gompertz regression (PH metric)
Lognormal regression (AFT metric)
Loglogistic regression (AFT metric)
Generalized gamma regression (AFT metric)
Choosing among parametric models

Postestimation commands for parametric models

Use of predict after streg

Using stcurve
Predictive margins and marginal effects

Generalizing the parametric regression model

Frailty models

Power and sample-size determination for survival analysis

Estimating sample size

Accounting for withdrawal and accrual of subjects 

Estimating power and effect size 
Tabulating or graphing results

Competing risks

Cause-specific hazards
Cumulative incidence functions
Nonparametric analysis

Semiparametric analysis

Parametric analysis




Mario Cleves is Professor and the Biostatistics Section Chief in the Department of Pediatrics at the University of Arkansas for Medical Sciences.

William Gould is the president and head of development at StataCorp.

Yulia Marchenko is a senior statistician at StataCorp.

All are authors of Stata statistical software, in particular, Stata’s widely used survival analysis suite.


"This is an application-oriented introduction to survival analysis using Stata. The authors have focused on intuitions without getting into technical details. For example … the rather mysterious partial likelihood was elegantly illustrated with a small dataset and simple derivations for conditional probabilities. The book provides an excellent coverage of commonly used nonparametric, semiparametric, and parametric analyses of survival data, with ample application examples. The implementation of each survival approach has been carefully laid out in Stata syntax and real data analyses. Moreover, the material covered in the book is surprisingly comprehensive, including Cox models with time-varying covariates, shared frailty models, multiple imputations, and competing risk regression. Those topics are often encountered in practice but usually missing from an introductory book of survival analysis. The revised third edition has been updated to reflect the welcome additions in Stata 14 relative to previous versions. … The revised third edition provides not only an excellent tutorial to anyone who is interested in learning survival models with examples, but also an extremely handy reference to researchers who would like to perform survival analyses in Stata."
—Yu Cheng, University of Pittsburgh, in The American Statistician, April 2018