826 Pages 254 Color & 158 B/W Illustrations
by Chapman & Hall

Unlike the first edition, the new edition has been split into two books, which have been brought together in this set. Thoroughly revised and updated, the first book (Introduction to Data Science: Data Wrangling and Visualization with R) introduces skills that can help the reader tackle real-world data analysis challenges. These include R programming, data wrangling with dplyr, data...

Vol 1

Preface, Acknowledgements, Introduction

Part 1: R 1. Getting started 2. R basics 3. Programming basics 4. The tidyverse 5. data.table 6. Importing data

Part 2: Data Visualization 7. Visualizing data distributions 8. ggplot2 9. Data visualization principles 10. Data visualization in practice

Part 3: Data Wrangling 11. Reshaping data 12. Joining tables 13. Parsing dates and times 14. Locales 15. Extracting data from the web 16. String processing 17. Text analysis

Part 4: Productivity Tools 18. Organizing with Unix 19. Git and GitHub 20. Reproducible projects

Vol 2

Distributions; Numerical Summaries; Comparing Groups; Connecting Data and Probability; Discrete Probability; Continuous Probability; Random Variables; Sampling Models and the Central Limit Theorem; Estimates and Confidence Intervals; Data-Driven Models; Bayesian Statistics; Hierarchical Models; Hypothesis Testing; Bootstrap; Introduction to Regression; The Linear Model Framework; Treatment Effect Models; Generalized Linear Models; Association Is Not Causation; Multivariable Regression; Working with Matrices in R; Applied Linear Algebra; Dimension Reduction; Regularization; Latent Factor Models; Notation and Terminology; Performance Metrics; Conditional Expectations and Smoothing; Resampling and Model Assessment; Supervised Learning Methods; Building Machine Learning Models; Unsupervised Learning: Clustering

Biography

Rafael A. Irizarry is Professor and Chair of the Department of Data Science at Dana-Farber Cancer Institute and Professor of Applied Statistics at Harvard. His research focuses on genomics, and he has taught several data science courses.