2nd Edition

Using R and RStudio for Data Management, Statistical Analysis, and Graphics

ISBN 9781482237368
Published March 10, 2015 by Chapman and Hall/CRC
313 Pages - 50 B/W Illustrations

USD $82.95


Book Description

Improve Your Analytical Skills

Incorporating the latest R packages as well as new case studies and applications, Using R and RStudio for Data Management, Statistical Analysis, and Graphics, Second Edition covers the aspects of R most often used by statistical analysts. New users of R will find the book’s simple approach easy to understand, while more experienced users will appreciate it as a source of task-oriented information.

New to the Second Edition

  • The use of RStudio, which increases the productivity of R users and helps users avoid error-prone cut-and-paste workflows
  • New chapter of case studies illustrating useful data management tasks, reading complex files, making and annotating maps, "scraping" data from the web, mining text files, and generating dynamic graphics
  • New chapter on special topics that describes key features, such as processing by group, and explores important areas of statistics, including Bayesian methods, propensity scores, and bootstrapping
  • New chapter on simulation that includes examples of data generated from complex models and distributions
  • A detailed discussion of the philosophy and use of the knitr and markdown packages for R
  • New packages that extend the functionality of R and facilitate sophisticated analyses
  • Reorganized and enhanced chapters on data input and output, data management, statistical and mathematical functions, programming, high-level graphics plots, and the customization of plots

Easily Find Your Desired Task

Conveniently organized by short, clear descriptive entries, this edition continues to show users how to easily perform an analytical task in R. Users can quickly find and implement the material they need through the extensive indexing, cross-referencing, and worked examples in the text. Datasets and code are available for download on a supplementary website.

Table of Contents

Data Input and Output
Further resources

Data Management
Structure and metadata
Derived variables and data manipulation
Merging, combining, and subsetting datasets
Date and time variables
Further resources

Statistical and Mathematical Functions
Probability distributions and random number generation
Mathematical functions
Matrix operations

Programming and Operating System Interface
Control flow, programming, and data generation
Interactions with the operating system

Common Statistical Procedures
Summary statistics
Bivariate statistics
Contingency tables
Tests for continuous variables
Analytic power and sample size calculations
Further resources

Linear Regression and ANOVA
Model fitting
Tests, contrasts, and linear functions of parameters
Model results and diagnostics
Model parameters and results
Further resources

Regression Generalizations and Modeling
Generalized linear models
Further generalizations
Robust methods
Models for correlated data
Survival analysis
Multivariate statistics and discriminant procedures
Complex survey design
Model selection and assessment
Further resources

A Graphical Compendium
Univariate plots
Univariate plots by grouping variable
Bivariate plots
Multivariate plots
Special-purpose plots
Further resources

Graphical Options and Configuration
Adding elements
Options and parameters
Saving graphs

Simulation
Generating data
Simulation applications
Further resources

Special Topics
Processing by group
Simulation-based power calculations
Reproducible analysis and output
Advanced statistical methods
Further resources

Case Studies
Data management and related tasks
Read variable format files
Plotting maps
Data scraping
Text mining
Interactive visualization
Manipulating bigger datasets
Constrained optimization: the knapsack problem

Appendix A: Introduction to R and RStudio
Appendix B: The HELP Study Dataset
Appendix C: References
Appendix D: Indices


Nicholas J. Horton is a professor of statistics at Amherst College. His research interests include longitudinal regression models and missing data methods, with applications in psychiatric epidemiology and substance abuse research.

Ken Kleinman is an associate professor in the Department of Population Medicine at Harvard Medical School. His research deals with clustered data analysis, surveillance, and epidemiological applications in projects ranging from vaccine and bioterrorism surveillance to observational epidemiology to individual-, practice-, and community-randomized interventions.

"The second edition of the book preserves the many good points of the first, and makes some improvements to the structure, e.g., on the graphical compendium. It also contains added material on more recent possibilities…is a good buy, if the goal is to have a reference book which allows to quickly find a way of accomplishing a task at hand in R, be it with or without RStudio."
— Ulrike Grömping, Beuth University of Applied Sciences Berlin, Journal of Statistical Software, November 2015

"… the book is easy to use. I have had it on my desk for the past few weeks and it has become invaluable. For those, like me, who find themselves regularly switching between R, MATLAB, and Python—or similar packages—it can save a lot of time."
Significance Magazine, February 2016

Praise for the First Edition:
This book is an excellent reference resource. Used this way, it can be helpful for years to come for both experienced and novice users. The organization of the material makes it easy to find the relevant piece of information either by topic (from the table of contents) or using one of the indexes. The task entries are self-contained. Users with experience in technical computing may use it as a quick starter in R, as well.
—Georgi N. Boshnakov, Journal of Applied Statistics, June 2012

This book provides a concise reference and annotated examples for R … . It is needed because R does not come with a coordinated manual … It is much easier to find information in Horton and Kleinman’s book because of their more detailed indices and table of contents. … Horton and Kleinman have succeeded very well in their goal of providing a concise reference manual and annotated examples. If you know the statistics (or can look them up) and have some experience using R, it is an extremely useful reference, and it has become my most consulted R book. … it would be an excellent reference for those wanting look up the syntax of a command together with an example of how to use it. It is also very useful if you cannot remember the command and want to know how to do it in R.
—Paul H. Geissler, The American Statistician, November 2011

The interesting aspect of the book is that it does not only describe the basic statistics and graphics function of the basic R system but it describes the use of 40 additional [packages] available from the CRAN website. The website contains also the R code to install all the packages that contain the described features. In summary, the book is a useful complement to introductory statistics books and lectures … Those who know R might get additional hints on new features of statistical analyses.
International Statistical Review (2011), 79