1st Edition

Telling Stories with Data: With Applications in R

By Rohan Alexander
Copyright 2023
    622 Pages 78 Color & 76 B/W Illustrations
    by Chapman & Hall

    The book equips students with the end-to-end skills needed to do data science. That means gathering, cleaning, preparing, and sharing data, then using statistical models to analyse data, writing about the results of those models, drawing conclusions from them, and finally, using the cloud to put a model into production, all done in a reproducible way.

    There are many books that teach data science, but most assume that you already have the data. This book fills that gap by detailing how to gather datasets, then clean and prepare them, before analysing them. There are also many books that teach statistical modelling, but few teach how to communicate the results of the models and how they help us learn about the world. Very few data science textbooks cover ethics, and most of those that do have only a token ethics chapter. Finally, reproducibility is not often emphasised in data science books. This book is based around a straightforward workflow conducted in an ethical and reproducible way: gather data, prepare data, analyse data, and communicate the findings. It achieves these goals by working through extensive case studies on gathering and preparing data, and by integrating ethics throughout. It is specifically designed around teaching how to write about data and models, so writing is explicitly covered. Finally, GitHub and the open-source statistical language R are used throughout the book.

    Key Features:

    • Extensive code examples.
    • Ethics integrated throughout.
    • Reproducibility integrated throughout.
    • Focus on data gathering, messy data, and cleaning data.
    • Extensive formative assessment throughout.

    Part 1. Foundations
    1. Telling stories with data
    2. Drinking from a fire hose
    3. Reproducible workflows

    Part 2. Communication
    4. Writing research
    5. Static communication

    Part 3. Acquisition
    6. Farm data
    7. Gather data
    8. Hunt data

    Part 4. Preparation
    9. Clean and prepare
    10. Store and share

    11. Exploratory data analysis
    12. Linear models
    13. Generalized linear models
    14. Causality from observational data
    15. Multilevel regression with post-stratification
    16. Text as data
    17. Concluding remarks

    Biography

    Dr. Rohan Alexander is an assistant professor at the University of Toronto, jointly appointed in the Faculty of Information and the Department of Statistical Sciences. He is also the assistant director of CANSSI Ontario, a senior fellow at Massey College, a faculty affiliate at the Schwartz Reisman Institute for Technology and Society, and a co-lead of the DSI Thematic Program in Reproducibility. He holds a PhD in Economics from the Australian National University with a focus on economic history.

    "This clean and fun book covers a wide range of topics on statistical communication, programming, and modeling in a way that should be a useful supplement to any statistics course or self-learning program. I absolutely love this book!"
    - Andrew Gelman, Columbia University

    "An excellent book. Communication and reproducibility are of increasing concern in statistics, and this book covers these topics and more in a practical, appealing, and truly unique way."
    - Daniela Witten, University of Washington

    "Many data science texts tell you how to perform perfunctory calculations. Instead, Telling Stories with Data tells you how to engage in the mindset and process of analysis. By arming students with the computational, statistical and philosophical skills needed to use data in sense-making and story-telling, this book stands out from the pack as uniquely actionable and empowering."
    - Emily Riederer, Capital One

    "This is not another statistics book. It is much better than that. It is a book about doing quantitative research, about scientific justification, about quality control, about communication and epistemic humility. It's a valuable supplement to any methods curriculum, and useful for self-learners as well."
    - Richard McElreath, Max Planck Institute for Evolutionary Anthropology

    "Telling Stories with Data is a thoughtful guide to using data to learn and affect positive change. The book includes each stage of the process and can serve as a long-lasting companion to many data scientists and future data story tellers."
    - Christopher Peters, Zapier

    “A clever career choice is to pick a field where your skills are complementary with a growing resource. In the coming decades, those who are adept in analysing data will flourish. That means crunching statistics and telling compelling stories. Rohan Alexander’s book will help you do both.”
    - Andrew Leigh, Member of the Australian Parliament and author of Randomistas: How Radical Researchers Are Changing Our World

    "Every data analyst has to tell stories with data, and yet traditional textbooks focus on statistical methods alone.  Telling Stories with Data teaches the entire data science workflow, including data acquisition, communication, and reproducibility.  I highly recommend this unique book!"
    - Kosuke Imai, Harvard University

    "This is an extraordinary, wonderful, book, full of wise advice for anyone starting in data science.  Intermixing concepts and code means the ideas are immediately made concrete, and the emphasis on reproducible workflows brings a welcome dose of rigor to a rapidly developing field."
    - David Spiegelhalter, The University of Cambridge