4th Edition

An Introduction to Survival Analysis Using Stata, Revised Third Edition

    428 Pages
    by Stata Press

    An Introduction to Survival Analysis Using Stata, Revised Third Edition is the ideal tutorial for professional data analysts who want to learn survival analysis for the first time or who are well versed in survival analysis but are not as dexterous in using Stata to analyze survival data. This text also serves as a valuable reference to those readers who already have experience using Stata’s survival analysis routines.

    The revised third edition has been updated for Stata 14, and it includes a new section on predictive margins and marginal effects, which demonstrates how to obtain and visualize marginal predictions and marginal effects using the margins and marginsplot commands after survival regression models.

    Survival analysis is a field of its own that requires specialized data management and analysis procedures. To meet this requirement, Stata provides the st family of commands for organizing and summarizing survival data.

    This book provides statistical theory, step-by-step procedures for analyzing survival data, an in-depth usage guide for Stata's most widely used st commands, and a collection of tips for using Stata to analyze survival data and to present the results. This book develops from first principles the statistical concepts unique to survival data and assumes only a knowledge of basic probability and statistics and a working knowledge of Stata.

    The first three chapters of the text cover basic theoretical concepts: hazard functions, cumulative hazard functions, and their interpretations; survivor functions; hazard models; and a comparison of nonparametric, semiparametric, and parametric methodologies. Chapter 4 deals with censoring and truncation. The next three chapters cover the formatting, manipulation, stsetting, and error checking involved in preparing survival data for analysis using Stata's st analysis commands. Chapter 8 covers nonparametric methods, including the Kaplan–Meier and Nelson–Aalen estimators and the various nonparametric tests for the equality of survival experience.

    Chapters 9–11 discuss Cox regression and include various examples of fitting a Cox model, obtaining predictions, interpreting results, building models, model diagnostics, and regression with survey data. The next four chapters cover parametric models, which are fit using Stata's streg command. These chapters include detailed derivations of all six parametric models currently supported in Stata and methods for determining which model is appropriate, as well as information on stratification, obtaining predictions, and advanced topics such as frailty models. Chapter 16 is devoted to power and sample-size calculations for survival studies. The final chapter covers survival analysis in the presence of competing risks.

    The problem of survival analysis

    Parametric modeling 
    Semiparametric modeling
    Nonparametric analysis 
    Linking the three approaches

    Describing the distribution of failure times

    The survivor and hazard functions
    The quantile function
    Interpreting the cumulative hazard and hazard rate

    Means and medians

    Hazard models

    Parametric models
    Semiparametric models
    Analysis time (time at risk)

    Censoring and truncation

    Censoring

    Truncation

    Recording survival data

    The desired format 
    Other formats
    Example: Wide-form snapshot data

    Using stset

    A short lesson on dates
    Purposes of the stset command
    Syntax of the stset command

    After stset

    Look at stset’s output
    List some of your data 
    Use stdescribe
    Use stvary 
    Perhaps use stfill 
    Example: Hip-fracture data

    Nonparametric analysis

    Inadequacies of standard univariate methods 
    The Kaplan–Meier estimator

    The Nelson–Aalen estimator
    Estimating the hazard function
    Estimating mean and median survival times
    Tests of hypothesis

    The Cox proportional hazards model

    Using stcox

    Likelihood calculations

    Stratified analysis

    Cox models with shared frailty

    Cox models with survey data

    Cox model with missing data—multiple imputation

    Model building using stcox

    Indicator variables
    Categorical variables
    Continuous variables

    Interactions
    Time-varying variables

    Modeling group effects: fixed-effects, random-effects, stratification, and clustering

    The Cox model: Diagnostics

    Testing the proportional-hazards assumption

    Residuals and diagnostic measures
    Reye’s syndrome data

    Parametric models

    Motivation
    Classes of parametric models

    A survey of parametric regression models in Stata

    The exponential model

    Weibull regression

    Gompertz regression (PH metric)
    Lognormal regression (AFT metric)
    Loglogistic regression (AFT metric)
    Generalized gamma regression (AFT metric)
    Choosing among parametric models

    Postestimation commands for parametric models

    Use of predict after streg

    Using stcurve
    Predictive margins and marginal effects

    Generalizing the parametric regression model

    Frailty models

    Power and sample-size determination for survival analysis

    Estimating sample size

    Accounting for withdrawal and accrual of subjects 

    Estimating power and effect size 
    Tabulating or graphing results

    Competing risks

    Cause-specific hazards
    Cumulative incidence functions
    Nonparametric analysis

    Semiparametric analysis

    Parametric analysis

    Biography

    Mario Cleves is Professor and the Biostatistics Section Chief in the Department of Pediatrics at the University of Arkansas for Medical Sciences.

    William Gould is the president and head of development at StataCorp.

    Yulia Marchenko is a senior statistician at StataCorp.

    All three are authors of Stata statistical software, in particular Stata’s widely used survival analysis suite.

    "This is an application-oriented introduction to survival analysis using Stata. The authors have focused on intuitions without getting into technical details. For example … the rather mysterious partial likelihood was elegantly illustrated with a small dataset and simple derivations for conditional probabilities. The book provides an excellent coverage of commonly used nonparametric, semiparametric, and parametric analyses of survival data, with ample application examples. The implementation of each survival approach has been carefully laid out in Stata syntax and real data analyses. Moreover, the material covered in the book is surprisingly comprehensive, including Cox models with time-varying covariates, shared frailty models, multiple imputations, and competing risk regression. Those topics are often encountered in practice but usually missing from an introductory book of survival analysis. The revised third edition has been updated to reflect the welcome additions in Stata 14 relative to previous versions. … The revised third edition provides not only an excellent tutorial to anyone who is interested in learning survival models with examples, but also an extremely handy reference to researchers who would like to perform survival analyses in Stata."
    —Yu Cheng, University of Pittsburgh, in The American Statistician, April 2018