1st Edition

A Tour of Data Science
Learn R and Python in Parallel

ISBN 9780367895860
Published November 12, 2020 by Chapman & Hall
216 Pages 25 B/W Illustrations

USD $59.95



Book Description

A Tour of Data Science: Learn R and Python in Parallel covers the fundamentals of data science, including programming, statistics, optimization, and machine learning, in a single short book. It does not try to cover everything; instead, it teaches the key concepts and topics in data science. It also covers two of the most popular programming languages used in data science, R and Python, in one source.

Key features:

  • Allows you to learn R and Python in parallel
  • Covers statistics, programming, optimization, and predictive modelling, as well as the popular data manipulation tools data.table and pandas
  • Provides a concise and accessible presentation
  • Includes machine learning algorithms implemented from scratch, such as linear regression, lasso, ridge, logistic regression, and gradient boosting trees

The book will appeal to data scientists, statisticians, quantitative analysts, and others who want to learn programming with R and Python from a data science perspective.

Table of Contents

Assumptions about the reader’s background
Book overview 

Introduction to R/Python Programming 

Variable and Type
Control flows
Some built-in data structures 
Revisit of variables 
Object-oriented programming (OOP) in R/Python 

More on R/Python Programming 
Work with R/Python scripts 
Debugging in R/Python 
Embarrassingly parallelism in R/Python 
Evaluation strategy
Speed up with C/C++ in R/Python
A first impression of functional programming
Miscellaneous

data.table and pandas
Get started with data.table and pandas 
Indexing & selecting data 
Group by 

Random Variables, Distributions & Linear Regression 
A refresher on distributions 
Inversion sampling & rejection sampling 
Joint distribution & copula 
Fit a distribution 
Confidence interval
Hypothesis testing 
Basics of linear regression 
Ridge regression 

Optimization in Practice
Gradient descent 
General purpose minimization tools in R/Python 
Linear programming 

Machine Learning - A gentle introduction 
Supervised learning 
Gradient boosting machine 
Unsupervised learning 
Reinforcement learning 
Deep Q-Networks 
Computational differentiation 




Nailong Zhang is lead Data Scientist at Mass Mutual Life Insurance Company.