Parallel Computing for Data Science: With Examples in R, C++ and CUDA
Parallel Computing for Data Science: With Examples in R, C++ and CUDA is one of the first parallel computing books to concentrate exclusively on parallel data structures, algorithms, software tools, and applications in data science. It includes examples not only from the classic "n observations, p variables" matrix format but also from time series, network graph models, and numerous other structures common in data science. The examples illustrate the range of issues encountered in parallel programming.
With the main focus on computation, the book shows how to compute on three types of platforms: multicore systems, clusters, and graphics processing units (GPUs). It also discusses software packages that span more than one type of hardware and can be used from more than one type of programming language. Readers will find that the foundation established in this book will generalize well to other languages, such as Python and Julia.
Introduction to Parallel Processing in R. "Why Is My Program So Slow?": Obstacles to Speed. Principles of Parallel Loop Scheduling. The Shared Memory Paradigm: A Gentle Introduction through R. The Shared Memory Paradigm: C Level. The Shared Memory Paradigm: GPUs. Thrust and Rth. The Message Passing Paradigm. MapReduce Computation. Parallel Sorting and Merging. Parallel Prefix Scan. Parallel Matrix Operations. Inherently Statistical Approaches: Subset Methods. Appendices.
"From my reading of the book, Matloff achieves his goals, and in doing so he has provided a volume that will be immensely useful to a very wide audience. I can see it being used as a reference by data analysts, statisticians, engineers, econometricians, biometricians, etc. This would apply to both established researchers and graduate students. This book provides exactly the sort of information that this audience is looking for, and it is presented in a very accessible and friendly manner."
—Econometrics Beat: Dave Giles’ Blog, July 2015
"The author has correctly recognized that there is a pressing need for a thorough, but readable guide to parallel computing—one that can be used by researchers and students in a wide range of disciplines. In my view, this book will meet that need. … For me and colleagues in my field, I would see this as a ‘must-have’ reference book—one that would be well thumbed!"
—David E. Giles, University of Victoria
"This is a book that I will use, both as a reference and for instruction. The examples are poignant and the presentation moves the reader directly from concept to working code."
—Michael Kane, Yale University
"Matloff’s Parallel Computing for Data Science: With Examples in R, C++ and CUDA can be recommended to colleagues and students alike, and the author is to be congratulated for taming a difficult and exhaustive body of topics via a very accessible primer."
—Dirk Eddelbuettel, Debian and R Projects