2nd Edition

SAS and R: Data Management, Statistical Analysis, and Graphics, Second Edition

By Ken Kleinman, Nicholas J. Horton
Copyright 2014
    470 Pages, 48 B/W Illustrations
    Published by Chapman & Hall

    An Up-to-Date, All-in-One Resource for Using SAS and R to Perform Frequent Tasks
    The first edition of this popular guide provided a path between SAS and R using an easy-to-understand, dictionary-like approach. Retaining the same accessible format, SAS and R: Data Management, Statistical Analysis, and Graphics, Second Edition explains how to easily perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferential procedures, regression analysis, and graphics, along with more complex applications.

    New to the Second Edition
    This edition now covers RStudio, a powerful and easy-to-use interface for R. It incorporates a number of additional topics, including using application programming interfaces (APIs), accessing data through database management systems, using reproducible analysis tools, and statistical analysis with Markov chain Monte Carlo (MCMC) methods and finite mixture models. It also includes extended examples of simulations and many new examples.

    Enables Easy Mobility between the Two Systems
    Through the extensive indexing and cross-referencing, users can directly find and implement the material they need. SAS users can look up tasks in the SAS index and then find the associated R code while R users can benefit from the R index in a similar manner. Numerous example analyses demonstrate the code in action and facilitate further exploration. The datasets and code are available for download on the book’s website.

    Data Input and Output

    Data Management
    Structure and Meta-Data
    Derived Variables and Data Manipulation
    Merging, Combining, and Subsetting Datasets
    Date and Time Variables

    Statistical and Mathematical Functions
    Probability Distributions and Random Number Generation
    Mathematical Functions
    Matrix Operations

    Programming and Operating System Interface
    Control Flow, Programming, and Data Generation
    Functions and Macros
    Interactions with the Operating System

    Common Statistical Procedures
    Summary Statistics
    Bivariate Statistics
    Contingency Tables
    Tests for Continuous Variables
    Analytic Power and Sample Size Calculations

    Linear Regression and ANOVA
    Model Fitting
    Tests, Contrasts, and Linear Functions of Parameters
    Model Diagnostics
    Model Parameters and Results

    Regression Generalizations and Modeling
    Generalized Linear Models
    Further Generalizations
    Robust Methods
    Models for Correlated Data
    Survival Analysis
    Multivariate Statistics and Discriminant Procedures
    Complex Survey Design
    Model Selection and Assessment

    A Graphical Compendium
    Univariate Plots
    Univariate Plots by Grouping Variable
    Bivariate Plots
    Multivariate Plots
    Special Purpose Plots

    Graphical Options and Configuration
    Adding Elements
    Options and Parameters
    Saving Graphs

    Generating Data
    Simulation Applications

    Special Topics
    Processing by Group
    Simulation-Based Power Calculations
    Reproducible Analysis and Output
    Advanced Statistical Methods

    Case Studies
    Data Management and Related Tasks
    Read Variable Format Files
    Plotting Maps
    Data Scraping and Visualization
    Manipulating Bigger Datasets
    Constrained Optimization: The Knapsack Problem

    Appendix A: Introduction to SAS
    Running SAS and a Sample Session
    Learning SAS and Getting Help
    Fundamental Elements of SAS Syntax
    Work Process: The Cognitive Style of SAS
    Useful SAS Background
    Output Delivery System
    SAS Macro Variables

    Appendix B: Introduction to R and RStudio
    Running R and Sample Session
    Learning R and Getting Help
    Fundamental Structures and Objects
    Add-ons: Packages
    Support and Bugs

    Appendix C: The HELP Study Dataset
    Background on the HELP Study
    Roadmap to Analyses of the HELP Dataset
    Detailed Description of the Dataset

    Appendix D: References

    Appendix E: Indices
    Subject Index
    SAS Index
    R Index

    Further Resources and Examples appear at the end of most chapters.



    "This book is not only an excellent cross-reference for SAS or R users to find the corresponding code in the opposing language, but also a useful resource for readers to learn statistical programming in both systems. The book is organized into 12 chapters covering a wide range of programming and statistical topics, with both SAS and R code presented for all tasks. … This book is a great resource for users who have a long experience in only one system and need to use the other system. The SAS index at the end of the book is particularly of help for SAS users to look up a task for which they know the SAS code and turn to a page with that SAS code as well as the associated R code. And the R index in the book is used the same way by R users to find the corresponding SAS code for a task."
    —Xulei Liu, Vanderbilt University, in Biometrics, December 2017

    "The second edition of SAS and R: Data Management, Statistical Analysis, and Graphics has several updates from the first, most notably the addition of three new sections, and the inclusion of R-Studio, which is a more user-friendly version of R. The first new section covers simulating data, the second covers several special topics, such as Bayesian methods and bootstrapping, and the third explores some case studies… This book is not intended to be read cover to cover, but rather as a dictionary of how to do things in both SAS and R. It covers a wide range of topics, including data management, numerical and graphical descriptive summaries, common statistical procedures, regression analyses, and regression generalizations… If you know either SAS or R, but not both, and are looking for a quick reference for common statistical tasks to be performed in the language that you are not familiar with, you will find this book helpful."
    —Joshua Landon, The George Washington University, in The American Statistician, August 2016

    Praise for the First Edition:
    "By placing the R and SAS solutions together and by covering a vast array of tasks in one book, Kleinman and Horton have added surprising value and searchability to the information in their book. … a home run, and it is a book I am grateful to have sitting, dust-free, on my shelf."
    —Robert Alan Greevy, Jr, Teaching of Statistics in the Health Sciences, Spring 2013

    "Excellent cross-referencing to other topics and end-of-chapter worked examples on the ‘Health evaluation and linkage to primary care’ data set are given with each topic. … users who are proficient in either of the software packages but with the need to use the other will find this book useful."
    —Frances Denny, Journal of the Royal Statistical Society, Series A, 2012

    "This book provides a very useful bridge between the two packages … . A wide range of procedures are covered and the code, which is generally well explained, is available for download from their website. … this is a very useful book for SAS and R users alike with an excellent overview of a wide range of data management options, statistical analyses and graphics. … full of useful tips and tricks."
    —Robin Turner, Statistics in Medicine, 2012

    "It is clearly written and code is appropriately highlighted to facilitate readability. … it is a potentially useful reference material for experienced users of one of the two systems, who need to quickly find how to perform a familiar task in the alternative system."
    Biometrics, 67, September 2011

    "It is an excellent text that is designed to translate SAS to R. … For statisticians with knowledge of both SAS and R programming, this book provides a useful resource to understand the differences between SAS and R codes and can be used for browsing and for finding particular SAS and R functions to perform common tasks. The book will strengthen the analytical abilities of relatively new users of either system by providing them with a concise reference manual and annotated examples executed in both packages. Professional analysts as well as statisticians, epidemiologists and others who are engaged in research or data analysis will find this book very useful. The book is comprehensive and covers an extensive list of statistical techniques from data management to graphics procedures, cross-referencing, indexing and good worked examples in SAS and R at the end of each chapter."
    Significance, July 2011

    "As the authors point out in the Introduction, the book functions like an English–French dictionary. The material is organized by task. By looking up a particular task you wish to perform, R and SAS code are presented and briefly explained. … It is easy to find the section in the text which gives several ways to do this in both SAS and R. … Because the authors often present alternative ways to do a task, this book can be a great source of diverse and elegant solutions even to experienced users. Each task is cross-referenced to other tasks. … The book has a comprehensive website containing the code, datasets, a FAQ, blog, and errata list with a link to report new errors. … The end of the book is very useful, where there are good introductions to SAS and R, as well as separate subject, SAS, and R indices. These indices are invaluable for finding a topic when you are unsure of exactly how to phrase it. … there is great breadth and scope of the material in this book. … If you use both SAS and R on a regular basis, get this book. If you know one of the packages and are learning the other … get this book, too."
    —Charles E. Heckler, Technometrics, May 2011

    "… a convenient reference text to quickly learn by example how to perform common tasks in both software packages. … the book provides a powerful starting point to a wide variety of statistical techniques available in SAS and R. … it facilitates a translation between SAS and R, without getting overly detailed or technical. It is mainly useful as a starting point for those who already know either R or SAS, and want to learn the other language, without going over extensive manuals or introductory texts."
    Journal of Statistical Software, January 2011, Volume 37