1st Edition

Exploring Data Science with R and the Tidyverse: A Concise Introduction

By Jerry Bonnell, Mitsunori Ogihara
Copyright 2024
    492 Pages 158 Color & 42 B/W Illustrations
    by Chapman & Hall

    This book introduces the reader to data science using R and the tidyverse. No prior knowledge of college-level programming or mathematics (e.g., calculus or statistics) is required. The book is self-contained, so readers can begin building data science workflows immediately without consulting extensive external resources for onboarding. The contents are aimed at undergraduate students but are equally applicable to students at the graduate level and beyond. The book develops concepts through many real-world examples to motivate the reader.

    Upon completion of the text, the reader will be able to:

    • Program proficiently in R
    • Load and manipulate data frames, and "tidy" them using tidyverse tools
    • Conduct statistical analyses and draw meaningful inferences from them
    • Perform modeling from numerical and textual data
    • Generate data visualizations (numerical and spatial) using ggplot2 and understand what is being represented

    An accompanying R package, "edsdata", contains the synthetic and real datasets used in the textbook and is intended for further practice. An exercise set is also available; it is designed to be compatible with automated grading tools for instructor use.

    1. Data Types
    2. Data Transformation
    3. Data Visualization
    4. Building Simulations
    5. Sampling
    6. Hypothesis Testing
    7. Quantifying Uncertainty
    8. Towards Normality
    9. Regression
    10. Text Analysis


    Jerry Bonnell is a Ph.D. candidate in Computer Science at the University of Miami, and a University of Miami Fellow. His research areas include Machine Learning and Natural Language Processing. His research targets domain experts in Digital Humanities (DH) and has appeared in Digital Scholarship in the Humanities, Digital Humanities Quarterly, and the ACL Special Interest Group on Language Technologies for the Socio-Economic Sciences and Humanities (LaTeCH-CLfL). He also teaches the undergraduate-level course "Data Science for the World" at the University of Miami.

    Mitsunori Ogihara is a Professor of Computer Science at the University of Miami, Coral Gables, Florida, USA, and is the Director of its Master of Science in Data Science program. He received his Ph.D. in Information Sciences from Tokyo Institute of Technology, Tokyo, Japan. He is an author, co-author, or co-editor of four books and has published more than 200 peer-reviewed journal and conference papers. He serves on the editorial boards of several academic journals, including Theory of Computing Systems (Springer), for which he is the Editor-in-Chief.