# Clinical Trial Data Analysis Using R and SAS

## Book Description

Review of the First Edition

*"The goal of this book, as stated by the authors, is to fill the knowledge gap that exists between developed statistical methods and the applications of these methods. Overall, this book achieves the goal successfully and does a nice job. I would highly recommend it … The example-based approach is easy to follow and makes the book a very helpful desktop reference for many biostatistics methods."* ~**Journal of Statistical Software**

**Clinical Trial Data Analysis Using R and SAS, Second Edition** provides a thorough presentation of biostatistical analyses of clinical trial data with step-by-step implementations using R and SAS. The book’s practical, detailed approach draws on the authors’ 30 years’ experience in biostatistical research and clinical development. The authors develop step-by-step analysis code using appropriate R packages and functions and SAS PROCS, which enables readers to gain an understanding of the analysis methods and R and SAS implementation so that they can use these two popular software packages to analyze their own clinical trial data.

What’s New in the Second Edition

- Adds SAS programs along with the R programs for clinical trial data analysis.
- Updates all the statistical analyses to use current R packages.
- Includes correlated data analysis with multivariate analysis of variance.
- Applies R and SAS to clinical trial data from hypertension, duodenal ulcer, beta blockers, familial adenomatous polyposis, and breast cancer trials.
- Covers the biostatistical aspects of various clinical trials, including treatment comparisons, time-to-event endpoints, longitudinal clinical trials, and bioequivalence trials.

## Table of Contents

**Introduction to R**

What is R?

Steps on Installing R and Updating R Packages

First Step: Install R Base System

Second Step: Installing and Updating R Packages

Steps to Get Help and Documentation

R for Clinical Trials

A Simple Simulated Clinical Trial

Data Simulation

R Functions

Data Generation and Manipulation

Basic R Graphics

Data Analysis

Summary and Recommendations for Further Reading

Appendix: SAS Programs

Overview of Clinical Trials

Introduction

Phases of Clinical Trials and Objectives

Phase 0 Trials

Phase I Trials

Phase II Trials

Phase III Trials

Phase IV Trials

The Clinical Development Plan

Biostatistical Aspects of a Protocol

Background or Rationale

Objective

Plan of Study

Study Population

Study Design

Problem Management

Statistical Analysis Section

Study Objectives as Statistical Hypotheses

Endpoints

Statistical Methods

Statistical Monitoring Procedures

Statistical Design Considerations

Subset Analyses

Concluding Remarks

Treatment Comparisons in Clinical Trials

Data from Clinical Trials

Diastolic Blood Pressure

Clinical Trial on Duodenal Ulcer Healing

Statistical Models for Treatment Comparisons

Models for Continuous Endpoints

Student's t-Tests

One-Way Analysis of Variance (ANOVA)

Multi-Way ANOVA: Factorial Design

Multivariate Analysis of Variance (MANOVA)

Models for Categorical Endpoints: Pearson's χ²-test

Data Analysis in R

Analysis of the DBP Trial

Preliminary Data Analysis

t-test

Bootstrapping Method

One-Way ANOVA for Time Changes

Two-Way ANOVA for Interaction

MANOVA for Treatment Difference

Analysis of Duodenal Ulcer Healing Trial

Using Pearson's χ²-test

Using Contingency Table

Summary and Conclusions

Appendix: SAS Programs

Treatment Comparisons in Clinical Trials with Covariates

Data from Clinical Trials

Diastolic Blood Pressure

Clinical Trials for Beta-Blockers

Clinical Trial on Familial Adenomatous Polyposis

Statistical Models Incorporating Covariates

ANCOVA Models for Continuous Endpoints

Logistic Regression for Binary/Binomial Endpoints

Poisson Regression for Clinical Endpoint with Counts

Overdispersion

Data Analysis in R

Analysis of DBP Trial

Analysis of Baseline Data

ANCOVA of DBP Change from Baseline

MANCOVA for DBP Change from Baseline

Analysis of Beta-Blocker Trial

Analysis of Data from Familial Adenomatous Polyposis Trial

Summary and Conclusions

Appendix: SAS Programs

Analysis of Clinical Trials with Time-to-Event Endpoints

Clinical Trials with Time-to-Event Data

Phase II Trial of Patients with Stage-2 Breast Carcinoma

Breast Cancer Trial with Interval-Censored Data

Statistical Models

Primary Functions and Definitions

The Hazard Function

The Survival Function

The Death Density Function

Relationships between These Functions

Parametric Models

The Exponential Model

The Weibull Model

The Rayleigh Model

The Gompertz Model

The Lognormal Model

Statistical Methods for Right-Censored Data

Nonparametric Models: Kaplan-Meier Estimator

Cox Proportional Hazards Regression

Statistical Methods for Interval-Censored Data

Turnbull's Nonparametric Estimator

Parametric Likelihood Estimation with Covariates

Semiparametric Estimation: the IntCox

Step-by-Step Implementations in R

Stage-2 Breast Carcinoma

Fit Kaplan-Meier

Fit Weibull Parametric Model

Fit Cox Regression Model

Breast Cancer with Interval-Censored Data

Fit Turnbull's Nonparametric Estimator

Fit Turnbull's Nonparametric Estimator Using R Package interval

Fitting Parametric Models

Testing Treatment Effect Using Semiparametric Estimation: IntCox

Testing Treatment Effect Using Semiparametric Estimation: ictest

Summary and Discussions

Appendix: SAS Programs

Longitudinal Data Analysis for Clinical Trials

Clinical Trials

Diastolic Blood Pressure

Clinical Trial on Duodenal Ulcer Healing

Statistical Models

Linear Mixed Models

Generalized Linear Mixed Models

Generalized Estimating Equation

Longitudinal Data Analysis for Clinical Trials

Analysis of Diastolic Blood Pressure Data

Data Graphics and Response Feature Analysis

Longitudinal Modeling

Analysis of Cimetidine Duodenal Ulcer Trial

Preliminary Analysis

Fit Logistic Regression to Binomial Data

Fit Generalized Linear Mixed Model

Fit GEE

Summary and Discussion

Appendix: SAS Programs

Sample Size Determination and Power Calculations in Clinical Trials

Prerequisites for Sample Size Determination

Comparison of Two Treatment Groups with Continuous Endpoints

Fundamentals

Basic Formula for Sample Size Calculation

R Function power.t.test

Unequal Variance: samplesize Package

Two Binomial Proportions

R Function power.prop.test

R Library: pwr

R Function nBinomial in gsDesign library

Time-to-Event Endpoint

Design of Group Sequential Trials

Introduction

gsDesign

Longitudinal Trials

Longitudinal Trial with Continuous Endpoint

The Model Setting

Sample Size Calculations

Power Calculation

Example and R Illustration

Longitudinal Binary Endpoint

Approximate Sample Size Calculation

Example and R Implementation

Relative Changes and Coefficient of Variation

Introduction

Sample Size Calculation Formula

Example and R Implementation

Concluding Remarks

Appendix: SAS Programs

**Meta-Analysis of Clinical Trials**

Data from Clinical Trials

Clinical Trials for Beta-Blockers: Binary Data

Data for Cochrane Collaboration Logo: Binary Data

Clinical Trials on Amlodipine: Continuous Data

Statistical Models for Meta-Analysis

Clinical Hypotheses and Effect Size

Fixed-Effects Meta-Analysis Model: The Weighted-Average

Random-Effects Meta-Analysis Model: DerSimonian-Laird

Publication Bias

Data Analysis in R

Analysis of Beta-Blocker Trials

Fitting the Fixed-Effects Model

Fitting the Random-Effects Model

Meta-Analysis for Cochrane Collaboration Logo

Analysis of Amlodipine Trial Data

Load the Library and Data

Fit the Fixed-Effects Model

Fit the Random-Effects Model

Summary and Conclusions

Appendix: SAS Programs

Bayesian Methods in Clinical Trials

Bayesian Models

Bayes' Theorem

Posterior Distributions for Some Standard Distributions

Normal Distribution with Known Variance

Normal Distribution with Unknown Variance

Normal Regression

Binomial Distribution

Multinomial Distribution

Simulation from the Posterior Distribution

Direct Simulation

Importance Sampling

Gibbs Sampling

Metropolis-Hastings Algorithm

R Packages in Bayesian Modeling

Introduction

R Packages using WinBUGS

R2WinBUGS

BRugs

rbugs

Typical Usage

MCMCpack

MCMC Simulations

Normal-Normal Model

Beta-Binomial Model

Bayesian Data Analysis

Blood Pressure Data: Bayesian Linear Regression

Binomial Data: Bayesian Logistic Regression

Count Data: Bayesian Poisson Regression

Comparing Two Treatments

Summary and Discussion

Appendix: SAS

Bioequivalence Clinical Trials

Data from Bioequivalence Clinical Trials

Data from Chow and Liu (2009)

Bioequivalence Trial on Cimetidine Tablets

Bioequivalence Clinical Trial Endpoints

Statistical Methods to Analyze Bioequivalence

Decision CIs for Bioequivalence

The Classical Asymmetric Confidence Interval

Westlake's Symmetric Confidence Interval

Two One-Sided Tests

Bayesian

Individual-Based Bienaymé-Tchebycheff (BT) Inequality CI

Individual-Based Bootstrap CIs

Step-by-Step Implementation in R

Analyze the data from Chow and Liu (2009)

Load the data into R

Tests for Carryover Effect

Test for Direct Formulation Effect

Analysis of Variance

Decision CIs

Classical Shortest 90% CI

The Westlake CI

Two One-sided Tests

Bayesian Approach

Individual-based BT CI

Bootstrap CIs

Analyze the data from Cimetidine Trial

Clinical Trial Endpoints Calculations

ANOVA: Tests for Carryover and Other Effects

Decision CIs

Classical Shortest 90% CI

The Westlake CI

Two One-sided CI

Bayesian Approach

Individual-based BT CI

Bootstrap CIs

Summary and Conclusions

Appendix: SAS Program

Adverse Events in Clinical Trials

Adverse Event Data from a Clinical Trial

Statistical Methods

Confidence Interval (CI) Methods

Comparison using Direct CI Method

Comparison using Indirect CI Methods

Significance Level Methods (SLM)

SLM using normal approximation

SLM using exact binomial distribution

SLM using resampling from pooled samples

SLM using resampling from pooled AE rates

Step-by-Step Implementation in R

Clinical Trial Data Manipulation

R Implementations for CI Methods

R Implementations for Indirect CI Methods

R for Significance Level Methods

R for SLM with normal approximation

R for SLM with exact binomial

R for SLM using Sampling-Resampling

Summary and Discussions

Appendix: SAS Programs

Analysis of DNA Microarrays in Clinical Trials

DNA Microarray

Introduction

DNA, RNA and Genes

Central Dogma of Molecular Biology

Probes, Probesets, Mismatch and Perfectmatch

Microarray and Statistical Analysis

Software: R/Bioconductor

Breast Cancer Data

Data Source

Low-Level Data Analysis

Introduction

Library affy

Quality Control

Background, Normalization and Summarization

High-Level Analysis

Statistical t-test

Model Fitting

Number of Significantly Expressed Genes

Functional Analysis of Gene Lists

Concluding Remarks

Appendix: SAS Programs

Index

## Reviews

" . . . this book provides a very useful overview of the statistical methods used in the analysis of clinical trials, along with their implementations. This will particularly help clinical practitioners to apply these methodologies in their own scientific problems . . . I would really like to thank the authors, D. Chen, K. E. Peace and P. Zhang, for such a nice readymade reference for clinical trial analysis, with very interesting real data illustrations."

~Abhik Ghosh, International Society for Clinical Biostatistics