1st Edition

Data Science
An Introduction

  • Available for pre-order. Item will ship after April 11, 2022
ISBN 9780367524685
April 11, 2022 Forthcoming by Chapman and Hall/CRC
456 Pages 219 Color Illustrations

USD $59.95



Book Description

Data Science: An Introduction focuses on using the R programming language in Jupyter notebooks to perform basic data manipulation and cleaning, create effective visualizations, and extract insights from data using supervised predictive models.

Based on sound educational research and active learning principles, the book takes a modern approach to the R programming language and includes accompanying worksheets for self-directed learning. It will leave students well prepared for data science projects.

Data Science: An Introduction focuses on workflows and communication strategies that are clear, reproducible, and shareable. Aimed at first-year undergraduates with only minimal prior knowledge of mathematics and programming, this book is suitable for students across many disciplines.

All source code is available online in a GitHub repository, which demonstrates clear, reproducible project workflows. The book is also accompanied by autograded Jupyter worksheets that provide the reader with guided, interactive instruction.

Table of Contents



About the Authors

Chapter 1 R and the tidyverse

Chapter 2 Reading in data locally and from the web

Chapter 3 Cleaning and wrangling data

Chapter 4 Effective data visualization

Chapter 5 Classification I: training & predicting

Chapter 6 Classification II: evaluation & tuning

Chapter 7 Regression I: K-nearest neighbors

Chapter 8 Regression II: linear regression

Chapter 9 Clustering

Chapter 10 Statistical inference

Chapter 11 Combining code and text with Jupyter

Chapter 12 Collaboration with version control

Chapter 13 Setting up your computer






Tiffany Timbers is an Assistant Professor of Teaching in the Department of Statistics and Co-Director for the Master of Data Science program (Vancouver Option) at the University of British Columbia.

Trevor Campbell is an Assistant Professor in the Department of Statistics at the University of British Columbia.

Melissa Lee is an Assistant Professor of Teaching in the Department of Statistics at the University of British Columbia.


'Many students leave school with a thorough understanding of core statistical theories and machine learning algorithms but a limited sense for how to put these ideas into practice. Real data science work entails a far broader set of skills including communication, collaboration, technical project management, and rapid iteration. Data Science: a First Introduction targets this gap by previewing this broader set of topics. By including less often discussed concepts like version control and modeling pipelines, that are often neglected at the introductory level, this book will help students build the right 'muscles' from the beginning of their studies and convert their knowledge into practice.'

-Emily Riederer, Capital One