Parallel Computing for Data Science

With Examples in R, C++ and CUDA

By Norman Matloff

© 2015 – Chapman and Hall/CRC

328 pages | 7 B/W Illus.

Purchasing Options ($ = USD):
Hardback: 9781466587014
pub: 2015-06-04
eBook (VitalSource): 9781466587038
pub: 2015-06-04
from $28.98


Parallel Computing for Data Science: With Examples in R, C++ and CUDA is one of the first parallel computing books to concentrate exclusively on parallel data structures, algorithms, software tools, and applications in data science. It includes examples not only from the classic "n observations, p variables" matrix format but also from time series, network graph models, and numerous other structures common in data science. The examples illustrate the range of issues encountered in parallel programming.

With the main focus on computation, the book shows how to compute on three types of platforms: multicore systems, clusters, and graphics processing units (GPUs). It also discusses software packages that span more than one type of hardware and can be used from more than one type of programming language. Readers will find that the foundation established in this book will generalize well to other languages, such as Python and Julia.


"From my reading of the book, Matloff achieves his goals, and in doing so he has provided a volume that will be immensely useful to a very wide audience. I can see it being used as a reference by data analysts, statisticians, engineers, econometricians, biometricians, etc. This would apply to both established researchers and graduate students. This book provides exactly the sort of information that this audience is looking for, and it is presented in a very accessible and friendly manner."

—Econometrics Beat: Dave Giles’ Blog, July 2015

"The author has correctly recognized that there is a pressing need for a thorough, but readable guide to parallel computing—one that can be used by researchers and students in a wide range of disciplines. In my view, this book will meet that need. … For me and colleagues in my field, I would see this as a ‘must-have’ reference book—one that would be well thumbed!"

—David E. Giles, University of Victoria

"This is a book that I will use, both as a reference and for instruction. The examples are poignant and the presentation moves the reader directly from concept to working code."

—Michael Kane, Yale University

Table of Contents

Introduction to Parallel Processing in R

Recurring Theme: The Principle of Pretty Good Parallelism

A Note on Machines

Recurring Theme: Hedging One's Bets

Extended Example: Mutual Web Outlinks

"Why Is My Program So Slow?": Obstacles to Speed

Obstacles to Speed

Performance and Hardware Structures

Memory Basics

Network Basics

Latency and Bandwidth

Thread Scheduling

How Many Processes/Threads?

Example: Mutual Outlink Problem

"Big O" Notation

Data Serialization

"Embarrassingly Parallel" Applications

Principles of Parallel Loop Scheduling

General Notions of Loop Scheduling

Chunking in Snow

A Note on Code Complexity

Example: All Possible Regressions

The partools Package

Example: All Possible Regressions, Improved Version

Introducing Another Tool: multicore

Issues with Chunk Size

Example: Parallel Distance Computation

The foreach Package


Another Scheduling Approach: Random Task Permutation

Debugging snow and multicore Code

The Shared Memory Paradigm: A Gentle Introduction through R

So, What Is Actually Shared?

Clarity and Conciseness of Shared-Memory Programming

High-Level Introduction to Shared-Memory Programming: Rdsm Package

Example: Matrix Multiplication

Shared Memory Can Bring a Performance Advantage

Locks and Barriers

Example: Finding the Maximal Burst in a Time Series

Example: Transformation of an Adjacency Matrix

Example: k-Means Clustering

The Shared Memory Paradigm: C Level


Example: Finding the Maximal Burst in a Time Series

OpenMP Loop Scheduling Options

Example: Transformation of an Adjacency Matrix

Example: Transforming an Adjacency Matrix, R-Callable Code

Speedup in C

Run Time vs. Development Time

Further Cache/Virtual Memory Issues

Reduction Operations in OpenMP


Intel Thread Building Blocks (TBB)

Lockfree Synchronization

The Shared Memory Paradigm: GPUs


Another Note on Code Complexity

Goal of This Chapter

Introduction to NVIDIA GPUs and CUDA

Example: Mutual Inlinks Problem

Synchronization on GPUs

R and GPUs

The Intel Xeon Phi Chip

Thrust and Rth

Hedging One's Bets

Thrust Overview


Skipping the C++

Example: Finding Quantiles

Introduction to Rth

The Message Passing Paradigm

Message Passing Overview

The Cluster Model

Performance Issues


Example: Pipelined Method for Finding Primes

Memory Allocation Issues

Message-Passing Performance Subtleties

MapReduce Computation

Apache Hadoop

Other MapReduce Systems

R Interfaces to MapReduce Systems

An Alternative: "Snowdoop"

Parallel Sorting and Merging

The Elusive Goal of Optimality

Sorting Algorithms

Example: Bucket Sort in R

Example: Quicksort in OpenMP

Sorting in Rth

Some Timing Comparisons

Sorting on Distributed Data

Parallel Prefix Scan

General Formulation


General Strategies for Parallel Scan Computation

Implementations of Parallel Prefix Scan

Parallel cumsum() with OpenMP

Example: Moving Average

Parallel Matrix Operations

Tiled Matrices

Example: Snowdoop Approach to Matrix Operations

Parallel Matrix Multiplication

BLAS Libraries

Example: A Look at the Performance of OpenBLAS

Example: Graph Connectedness

Solving Systems of Linear Equations

Sparse Matrices

Inherently Statistical Approaches: Subset Methods

Chunk Averaging

Bag of Little Bootstraps

Subsetting Variables

Appendix A: Review of Matrix Algebra

Appendix B: R Quick Start

Appendix C: Introduction to C for R Programmers

About the Author

Dr. Norman Matloff is a professor of computer science at the University of California, Davis, where he was a founding member of the Department of Statistics. He is a statistical consultant and a former database software developer. He has published numerous articles in prestigious journals, such as the ACM Transactions on Database Systems, ACM Transactions on Modeling and Computer Simulation, Annals of Probability, Biometrika, Communications of the ACM, and IEEE Transactions on Data Engineering. He earned a PhD in pure mathematics from UCLA, specializing in probability/functional analysis and statistics.

About the Series

Chapman & Hall/CRC The R Series

Subject Categories

BISAC Subject Codes/Headings:
MATHEMATICS / Number Systems
MATHEMATICS / Probability & Statistics / General