Designed for introductory parallel computing courses at the advanced undergraduate or beginning graduate level, Elements of Parallel Computing presents the fundamental concepts of parallel computing not from the point of view of hardware, but from a more abstract view of algorithmic and implementation patterns. The aim is to facilitate the teaching of parallel programming by surveying some key algorithmic structures and programming models, together with an abstract representation of the underlying hardware. The presentation is friendly and informal. The content of the book is language neutral, using pseudocode that represents common programming language models.
The first five chapters present core concepts in parallel computing. SIMD, shared memory, and distributed memory machine models are covered, along with a brief discussion of what their execution models look like. The book also discusses decomposition as a fundamental activity in parallel algorithmic design, starting with a naive example, and continuing with a discussion of some key algorithmic structures. Important programming models are presented in depth, as well as important concepts of performance analysis, including work-depth analysis of task graphs, communication analysis of distributed memory algorithms, key performance metrics, and a discussion of barriers to obtaining good performance.
The second part of the book presents three case studies that reinforce the concepts of the earlier chapters. One feature of these chapters is to contrast different solutions to the same problem, using select problems that aren't discussed frequently in parallel computing textbooks. They include the Single Source Shortest Path Problem, the Eikonal equation, and a classical computational geometry problem: computation of the two-dimensional convex hull. After presenting the problem and sequential algorithms, each chapter first discusses the sources of parallelism then surveys parallel algorithms.
Overview of Parallel Computing
INTRODUCTION
TERMINOLOGY
EVOLUTION OF PARALLEL COMPUTERS
EXAMPLE: WORD COUNT
PARALLEL PROGRAMMING MODELS
Implicit Models
Semi-Implicit Models
Explicit Models
Thinking in Parallel
PARALLEL DESIGN PATTERNS
Structural Patterns
Computational Patterns
Patterns in the Lower Layers
WORD COUNT IN PARALLEL
OUTLINE OF THE BOOK
Parallel Machine and Execution Models
PARALLEL MACHINE MODELS
SIMD
Shared Memory and Distributed Memory Computers
Distributed Memory Execution
Shared Memory Execution
Summary
PARALLEL EXECUTION MODEL
Task Graph Model
Examples
Summary
FURTHER READING
EXERCISES
Parallel Algorithmic Structures
HISTOGRAM EXAMPLE
Guidelines for Parallel Algorithm Design
EMBARRASSINGLY PARALLEL
REDUCTION
SCAN
DIVIDE AND CONQUER
PIPELINE
DATA DECOMPOSITION
SUMMARY
FURTHER READING
EXERCISES
Parallel Program Structures
LOAD BALANCE
SIMD: STRICTLY DATA PARALLEL
FORKJOIN
PARALLEL LOOPS AND SYNCHRONIZATION
Shared and Private Variables
Synchronization
Thread Safety
TASKS WITH DEPENDENCIES
SINGLE PROGRAM MULTIPLE DATA
MASTERWORKER
DISTRIBUTED MEMORY PROGRAMMING
Distributed Arrays
Message Passing
Map-Reduce
CONCLUSION
FURTHER READING
EXERCISES
Performance Analysis and Optimization
WORKDEPTH
ANALYSIS
PERFORMANCE ANALYSIS
Performance Metrics
Communication Analysis
BARRIERS TO PERFORMANCE
MEASURING AND REPORTING PERFORMANCE
FURTHER READING
EXERCISES
Single Source Shortest Path
SEQUENTIAL ALGORITHMS
Data Structures
Bellman-Ford Algorithm
Dijkstra's Algorithm
Delta-Stepping Algorithm
PARALLEL DESIGN EXPLORATION
PARALLEL ALGORITHMS
Shared Memory Delta-Stepping
SIMD Bellman-Ford for GPU
Message Passing Algorithm
CONCLUSION
FURTHER READING
EXERCISES
The Eikonal Equation
NUMERICAL SOLUTION
Fast Sweeping Method
Fast Marching Method
PARALLEL DESIGN EXPLORATION
Parallel Fast Sweeping Methods
Parallel Fast Marching Methods
PARALLEL ALGORITHMS
Parallel Fast Sweeping Methods
Parallel Fast Marching Methods
FURTHER READING
EXERCISES
Planar Convex Hull
SEQUENTIAL ALGORITHMS
PARALLEL DESIGN EXPLORATION
Parallel Hull Merge
PARALLEL ALGORITHMS
SIMD QuickHull
Coarse-grained Shared Memory MergeHull
Distributed Memory MergeHull
CONCLUSION
FURTHER READING
EXERCISES
Index
Biography
Eric Aubanel is a Professor in the Faculty of Computer Science at the University of New Brunswick, Fredericton , C anada. His area of research is High Performance Computing. He is part of the IBM Center for Advanced Studies - Atlantic and associated with the Atlantic Computational Excellence Network (ACEnet). His research is funded by NSERC.