Improve Your Analytical Skills
Incorporating the latest R packages as well as new case studies and applications, Using R and RStudio for Data Management, Statistical Analysis, and Graphics, Second Edition covers the aspects of R most often used by statistical analysts. New users of R will find the book’s simple approach easy to understand, while more sophisticated users will appreciate it as an invaluable source of task-oriented information.
New to the Second Edition
- The use of RStudio, which increases the productivity of R users and helps users avoid error-prone cut-and-paste workflows
- New chapter of case studies illustrating examples of useful data management tasks, reading complex files, making and annotating maps, "scraping" data from the web, mining text files, and generating dynamic graphics
- New chapter on special topics that describes key features, such as processing by group, and explores important areas of statistics, including Bayesian methods, propensity scores, and bootstrapping
- New chapter on simulation that includes examples of data generated from complex models and distributions
- A detailed discussion of the philosophy and use of the knitr and markdown packages for R
- New packages that extend the functionality of R and facilitate sophisticated analyses
- Reorganized and enhanced chapters on data input and output, data management, statistical and mathematical functions, programming, high-level graphics plots, and the customization of plots
Easily Find Your Desired Task
Conveniently organized into short, clear, descriptive entries, this edition continues to show users how to easily perform an analytical task in R. Users can quickly find and implement the material they need through the extensive indexing, cross-referencing, and worked examples in the text. Datasets and code are available for download on a supplementary website.
Table of Contents
Data Input and Output
Structure and metadata
Derived variables and data manipulation
Merging, combining, and subsetting datasets
Date and time variables
Statistical and Mathematical Functions
Probability distributions and random number generation
Programming and Operating System Interface
Control flow, programming, and data generation
Interactions with the operating system
Common Statistical Procedures
Tests for continuous variables
Analytic power and sample size calculations
Linear Regression and ANOVA
Tests, contrasts, and linear functions of parameters
Model results and diagnostics
Model parameters and results
Regression Generalizations and Modeling
Generalized linear models
Models for correlated data
Multivariate statistics and discriminant procedures
Complex survey design
Model selection and assessment
A Graphical Compendium
Univariate plots by grouping variable
Graphical Options and Configuration
Options and parameters
Processing by group
Simulation-based power calculations
Reproducible analysis and output
Advanced statistical methods
Data management and related tasks
Read variable format files
Manipulating bigger datasets
Constrained optimization: the knapsack problem
Appendix A: Introduction to R and RStudio
Appendix B: The HELP Study Dataset
Appendix C: References
Appendix D: Indices
Nicholas J. Horton is a professor of statistics at Amherst College. His research interests include longitudinal regression models and missing data methods, with applications in psychiatric epidemiology and substance abuse research.
Ken Kleinman is an associate professor in the Department of Population Medicine at Harvard Medical School. His research deals with clustered data analysis, surveillance, and epidemiological applications in projects ranging from vaccine and bioterrorism surveillance to observational epidemiology to individual-, practice-, and community-randomized interventions.
"The second edition of the book preserves the many good points of the first, and makes some improvements to the structure, e.g., on the graphical compendium. It also contains added material on more recent possibilities…is a good buy, if the goal is to have a reference book which allows to quickly find a way of accomplishing a task at hand in R, be it with or without RStudio."
—Ulrike Grömping, Beuth University of Applied Sciences Berlin, Journal of Statistical Software, November 2015
"… the book is easy to use. I have had it on my desk for the past few weeks and it has become invaluable. For those, like me, who find themselves regularly switching between R, MATLAB, and Python—or similar packages—it can save a lot of time."
—Significance Magazine, February 2016
Praise for the First Edition:
"This book is an excellent reference resource. Used this way, it can be helpful for years to come for both experienced and novice users. The organization of the material makes it easy to find the relevant piece of information either by topic (from the table of contents) or using one of the indexes. The task entries are self-contained. Users with experience in technical computing may use it as a quick starter in R, as well."
—Georgi N. Boshnakov, Journal of Applied Statistics, June 2012
"This book provides a concise reference and annotated examples for R … . It is needed because R does not come with a coordinated manual … It is much easier to find information in Horton and Kleinman’s book because of their more detailed indices and table of contents. … Horton and Kleinman have succeeded very well in their goal of providing a concise reference manual and annotated examples. If you know the statistics (or can look them up) and have some experience using R, it is an extremely useful reference, and it has become my most consulted R book. … it would be an excellent reference for those wanting [to] look up the syntax of a command together with an example of how to use it. It is also very useful if you cannot remember the command and want to know how to do it in R."
—Paul H. Geissler, The American Statistician, November 2011
"The interesting aspect of the book is that it does not only describe the basic statistics and graphics function of the basic R system but it describes the use of 40 additional [packages] available from the CRAN website. The website contains also the R code to install all the packages that contain the described features. In summary, the book is a useful complement to introductory statistics books and lectures … Those who know R might get additional hints on new features of statistical analyses."
—International Statistical Review (2011), 79