Introduction to High Performance Computing for Scientists and Engineers

1st Edition

By Georg Hager, Gerhard Wellein

CRC Press

356 pages | 143 B/W Illus.

Paperback: 9781439811924
pub: 2010-07-02
Hardback: 9781138470897
pub: 2017-11-28
eBook (VitalSource): 9780429190612
pub: 2010-07-02


Written by high performance computing (HPC) experts, Introduction to High Performance Computing for Scientists and Engineers provides a solid introduction to current mainstream computer architecture, dominant parallel programming models, and useful optimization strategies for scientific HPC. From working in a scientific computing center, the authors gained a unique perspective on the requirements and attitudes of users as well as manufacturers of parallel computers.

The text first introduces the architecture of modern cache-based microprocessors and discusses their inherent performance limitations, before describing general optimization strategies for serial code on cache-based architectures. It next covers shared- and distributed-memory parallel computer architectures and the most relevant network topologies. After discussing parallel computing on a theoretical level, the authors show how to avoid or ameliorate typical performance problems connected with OpenMP. They then present cache-coherent nonuniform memory access (ccNUMA) optimization techniques, examine distributed-memory parallel programming with the Message Passing Interface (MPI), and explain how to write efficient MPI code. The final chapter focuses on hybrid programming with MPI and OpenMP.

Users of high performance computers often have no idea what factors limit time to solution and whether it makes sense to think about optimization at all. This book facilitates an intuitive understanding of performance limitations without relying on heavy computer science knowledge. It also prepares readers for studying more advanced literature.

The authors’ recent honor: Informatics Europe Curriculum Best Practices Award for Parallelism and Concurrency


Georg Hager and Gerhard Wellein have developed a very approachable introduction to high performance computing for scientists and engineers. Their style and description is easy to read and follow. … This book presents a balanced treatment of the theory, technology, architecture, and software for modern high performance computers and the use of high performance computing systems. The focus on scientific and engineering problems makes this both educational and unique. I highly recommend this timely book for scientists and engineers. I believe this book will benefit many readers and provide a fine reference.

—From the Foreword by Jack Dongarra, University of Tennessee, Knoxville, USA

Table of Contents

Modern Processors
  Stored-program computer architecture
  General-purpose cache-based microprocessor architecture
  Memory hierarchies
  Multicore processors
  Multithreaded processors
  Vector processors

Basic Optimization Techniques for Serial Code
  Scalar profiling
  Common sense optimizations
  Simple measures, large impact
  The role of compilers
  C++ optimizations

Data Access Optimization
  Balance analysis and lightspeed estimates
  Storage order
  Case study: The Jacobi algorithm
  Case study: Dense matrix transpose
  Algorithm classification and access optimizations
  Case study: Sparse matrix-vector multiply

Parallel Computers
  Taxonomy of parallel computing paradigms
  Shared-memory computers
  Distributed-memory computers
  Hierarchical (hybrid) systems

Basics of Parallelization
  Why parallelize?
  Parallel scalability

Shared-Memory Parallel Programming with OpenMP
  Short introduction to OpenMP
  Case study: OpenMP-parallel Jacobi algorithm
  Advanced OpenMP: Wavefront parallelization

Efficient OpenMP Programming
  Profiling OpenMP programs
  Performance pitfalls
  Case study: Parallel sparse matrix-vector multiply

Locality Optimizations on ccNUMA Architectures
  Locality of access on ccNUMA
  Case study: ccNUMA optimization of sparse MVM
  Placement pitfalls
  ccNUMA issues with C++

Distributed-Memory Parallel Programming with MPI
  Message passing
  A short introduction to MPI
  Example: MPI parallelization of a Jacobi solver

Efficient MPI Programming
  MPI performance tools
  Communication parameters
  Synchronization, serialization, contention
  Reducing communication overhead
  Understanding intranode point-to-point communication

Hybrid Parallelization with MPI and OpenMP
  Basic MPI/OpenMP programming models
  MPI taxonomy of thread interoperability
  Hybrid decomposition and mapping
  Potential benefits and drawbacks of hybrid programming

Appendix A: Topology and Affinity in Multicore Environments

Appendix B: Solutions to the Problems



About the Authors

Georg Hager is a senior research scientist in the high performance computing group of the Erlangen Regional Computing Center at the University of Erlangen-Nuremberg in Germany.

Gerhard Wellein leads the high performance computing group of the Erlangen Regional Computing Center and is a professor in the Department for Computer Science at the University of Erlangen-Nuremberg in Germany.

About the Series

Chapman & Hall/CRC Computational Science


Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Programming / Games
MATHEMATICS / Number Systems