An Introduction to Survival Analysis Using Stata, Third Edition

By Mario Cleves, William Gould, Roberto Gutierrez, Yulia Marchenko

© 2010 – Stata Press

412 pages

Purchasing Options:
Paperback: 9781597180740
pub: 2010-09-09
US Dollars: $79.95

About the Book

An Introduction to Survival Analysis Using Stata, Third Edition provides the foundation to understand various approaches for analyzing time-to-event data. It is not only a tutorial for learning survival analysis but also a valuable reference for using Stata to analyze survival data. Although the book assumes knowledge of statistical principles, simple probability, and basic Stata, it takes a practical, rather than mathematical, approach to the subject.

This updated third edition highlights new features of Stata 11, including competing-risks analysis and the treatment of missing values via multiple imputation. Other additions include new diagnostic measures after Cox regression, Stata’s new treatment of categorical variables and interactions, and a new syntax for obtaining prediction and diagnostics after Cox regression.

After reading this book, you will understand the formulas and gain intuition about how various survival analysis estimators work and what information they exploit. You will also acquire deeper, more comprehensive knowledge of the syntax, features, and underpinnings of Stata’s survival analysis routines.


Praise for the Second Edition

Unlike some glorified manuals available in the market, this book is a genuine text for an introductory course in survival analysis using Stata. This book is also an excellent supplement for a graduate-level survival analysis course as well as a reference book for a data analyst dealing with survival data. The book presents the essential models, formulas, background, and relevant references in a compact and adequate manner, and then continues to present the relevant tools, their implementation, and explanation of outputs. …

The American Statistician, November 2010, Vol. 64, No. 4

Table of Contents

The Problem of Survival Analysis

Parametric modeling

Semiparametric modeling

Nonparametric analysis

Linking the three approaches

Describing the Distribution of Failure Times

The survivor and hazard functions

The quantile function

Interpreting the cumulative hazard and hazard rate

Means and medians

Hazard Models

Parametric models

Semiparametric models

Analysis time (time at risk)

Censoring and Truncation



Recording Survival Data

The desired format

Other formats

Example: Wide-form snapshot data

Using stset

A short lesson on dates

Purposes of the stset command

Syntax of the stset command

After stset

Look at stset’s output

List some of your data

Use stdescribe

Use stvary

Perhaps use stfill

Example: Hip fracture data

Nonparametric Analysis

Inadequacies of standard univariate methods

The Kaplan–Meier estimator

The Nelson–Aalen estimator

Estimating the hazard function

Estimating mean and median survival times

Tests of hypothesis

The Cox Proportional Hazards Model

Using stcox

Likelihood calculations

Stratified analysis

Cox models with shared frailty

Cox models with survey data

Cox model with missing data–multiple imputation

Model Building Using stcox

Indicator variables

Categorical variables

Continuous variables


Time-varying variables

Modeling group effects: fixed-effects, random-effects, stratification, and clustering

The Cox Model: Diagnostics

Testing the proportional-hazards assumption

Residuals and diagnostic measures

Parametric Models


Classes of parametric models

A Survey of Parametric Regression Models in Stata

The exponential model

Weibull regression

Gompertz regression (PH metric)

Lognormal regression (AFT metric)

Loglogistic regression (AFT metric)

Generalized gamma regression (AFT metric)

Choosing among parametric models

Postestimation Commands for Parametric Models

Use of predict after streg

Using stcurve

Generalizing the Parametric Regression Model

Using the ancillary() option

Stratified models

Frailty models

Power and Sample-Size Determination for Survival Analysis

Estimating sample size

Accounting for withdrawal and accrual of subjects

Estimating power and effect size

Tabulating or graphing results

Competing Risks

Cause-specific hazards

Cumulative incidence functions

Nonparametric analysis

Semiparametric analysis

Parametric analysis


Author Index

Subject Index

About the Authors

Mario Cleves is a professor of pediatrics at the University of Arkansas for Medical Sciences and a senior biostatistician at the Arkansas Center for Birth Defects Research and Prevention.

William Gould is the president and head of development at StataCorp.

Roberto Gutierrez is the director of statistics at StataCorp.

Yulia Marchenko is a senior statistician at StataCorp.

All four are authors of the Stata statistical software and, in particular, of its widely used survival analysis suite.

Subject Categories

BISAC Subject Codes/Headings:
MATHEMATICS / Probability & Statistics / General