See How Graphics Reveal Information
Graphical Data Analysis with R shows you what information you can gain from graphical displays. The book focuses on why you draw graphics to display data and which graphics to draw (and uses R to do so). All the datasets are available in R or one of its packages and the R code is available at rosuda.org/GDA.
Graphical data analysis is useful for data cleaning, exploring data structure, detecting outliers and unusual groups, identifying trends and clusters, spotting local patterns, evaluating modelling output, and presenting results. This book guides you in choosing graphics and understanding what information you can glean from them. It can be used as a primary text in a graphical data analysis course or as a supplement in a statistics course. Colour graphics are used throughout.
Table of Contents
Setting the Scene
Graphics in action
What is graphical data analysis (GDA)?
Using this book, the R code in it, and the book’s webpage
Brief Review of the Literature and Background Materials
Other graphics software
Examining Continuous Variables
What features might continuous variables have?
Looking for features
Comparing distributions by subgroups
What plots are there for individual continuous variables?
Modelling and testing for continuous variables
Displaying Categorical Data
What features might categorical variables have?
Nominal data—no fixed category order
Ordinal data—fixed category order
Discrete data—counts and integers
Formats, factors, estimates, and barcharts
Modelling and testing for categorical variables
Looking for Structure: Dependency Relationships and Associations
What features might be visible in scatterplots?
Looking at pairs of continuous variables
Adding models: lines and smooths
Comparing groups within scatterplots
Scatterplot matrices for looking at many pairs of variables
Modelling and testing for relationships between variables
Investigating Multivariate Continuous Data
What is a parallel coordinate plot (pcp)?
Features you can see with parallel coordinate plots
Interpreting clustering results
Parallel coordinate plots and time series
Parallel coordinate plots for indices
Options for parallel coordinate plots
Modelling and testing for multivariate continuous data
Parallel coordinate plots and comparing model results
Studying Multivariate Categorical Data
Data on the sinking of the Titanic
What is a mosaicplot?
Different mosaicplots for different questions of interest
Which mosaicplot is the right one?
Modelling and testing for multivariate categorical data
Getting an Overview
Many individual displays
Multivariate overviews for categorical variables
Graphics by group
Modelling and testing for overviews
Graphics and Data Quality: How Good Are the Data?
Modelling and testing for data quality
Comparisons, Comparisons, Comparisons
Making visual comparisons
Comparing group effects graphically
Comparing rates visually
Graphics for comparing many subsets
Graphics principles for comparisons
Modelling and testing for comparisons
Graphics for Time Series
Graphics for a single time series
Special features of time series
Alternative graphics for time series
R classes and packages for time series
Modelling and testing time series
Ensemble Graphics and Case Studies
What is an ensemble of graphics?
Combining different views—a case study example
Some Notes on Graphics with R
Graphics systems in R
Loading datasets and packages for graphical analysis
Graphics conventions in statistics
What is a graphic anyway?
Options for all graphics
Some R graphics advice and coding tips
Data analysis and graphics
Key features of GDA
Strengths and weaknesses of GDA
Recommendations for GDA
Antony Unwin is a professor of computer-oriented statistics and data analysis at the University of Augsburg. He is a fellow of the American Statistical Association, co-author of Graphics of Large Datasets, and co-editor of the Handbook of Data Visualization. His research focuses on data visualisation, especially in interactive graphics. His research group has developed several pieces of interactive graphics software and written packages for R.
"… the book follows a learning-by-doing approach. With numerous examples, the author shows how important qualitative aspects of data can be detected by means of simple plots, and how a few simple changes in a graph may uncover relevant information not visible before, setting aside the more technical aspects of plots in R. Still, for each graph, the respective R-code is provided in the book, and complete programme codes for the examples are available on the book’s webpage. Thus, by copy and paste, one can easily rescale all graphs, change the aspect ratio and apply other modifications to the original plot. This blended-learning approach facilitates exploring the data graphically without requiring too much knowledge of R syntax. This book is therefore well suited for students and novice data analysts who want to learn from examples. It could also supplement theoretical statistics courses, and help statistics teachers in finding suitable graphical displays for various purposes."
—Jasmin Wachter, Universität Klagenfurt
"Overall, the book is a very good introduction to the practical side of graphical data analysis using R. The presentation of R code and graphics output is excellent, with colours used when required. The book appears to be free of typographical and other errors, and its index is useful. Also, the book is well written and neatly structured. I enjoyed reading the book and can recommend it to anyone who wants to learn more about their data through graphics using R. It will also be a valuable asset for a library and as part of an undergraduate course in applied statistics."
—Journal of the Royal Statistical Society, Series A
"Throughout, the book follows a learning-by-doing approach. With numerous examples, the author shows how important qualitative aspects of data can be detected by means of simple plots, and how a few simple changes in a graph may uncover relevant information not visible before, setting aside the more technical aspects of plots in R. Still, for each graph, the respective R-code is provided in the book, and complete programme codes for the examples are available on the book’s webpage. … This blended-learning approach facilitates exploring the data graphically without requiring too much knowledge of R syntax. This book is therefore well suited for students and novice data analysts who want to learn from examples. It could also supplement theoretical statistics courses, and help statistics teachers in finding suitable graphical displays for various purposes."
—Statistical Papers, 2017
"… an attractive addition to the current statistical graphics texts as it demonstrates what can be learned through graphs."
—Significance Magazine, February 2016
"… the strength of this book lies in the profound introduction to the topic of graphical data analysis. The comprehensive sectional introductions and overviews along with the ‘how-to’ might well be regarded as the modern update to Tukey’s 1977 landmark book."
—Biometrical Journal, December 2015
"Antony Unwin’s very clever new book … is well written, clearly by a practitioner with wide experience, gives generally good (though sometimes opinionated) advice, and includes R code for nearly all examples, as well as nice collections of additional exercises for each chapter … Beyond the content, Unwin also does an admirable job of conveying enthusiasm for data graphics."
—Journal of Educational and Behavioral Statistics, December 2015
"This text has the potential of bringing sophisticated visualization to a broad audience without resorting to mathematical formalizations or the skills of a graphics artist. It engages the reader with interesting graphics right from the start and overall is clear and unintimidating. Code for all examples is provided in the text and is available on a supporting website. What’s more, the code works as is, rather unusual and refreshing."
—Journal of Statistical Software, November 2015
"For statisticians and experts in data analysis, the book is without doubt the new reference work on the subject."
—Thomas Rahlf, datendesign-r.de
"… would also be an excellent suggested additional reading for a pragmatic graphical data analysis-oriented course."
—Reijo Sund, Centre for Research Methods, University of Helsinki