Multicore Computing: Algorithms, Architectures, and Applications (Hardback) book cover

Multicore Computing

Algorithms, Architectures, and Applications

Edited by Sanguthevar Rajasekaran, Lance Fiondella, Mohamed Ahmed, Reda A. Ammar

© 2013 – Chapman and Hall/CRC

452 pages | 231 B/W Illus.

Purchasing Options:$ = USD
Hardback: 9781439854341
pub: 2013-12-12
eBook (VitalSource) : 9781439854358
pub: 2013-12-12
from $75.95

FREE Standard Shipping!
e–Inspection Copy


Every area of science and engineering today has to process voluminous data sets. Using exact, or even approximate, algorithms to solve intractable problems in critical areas, such as computational biology, takes time that is exponential in some of the underlying parameters. Parallel computing addresses this issue and has become affordable with the advent of multicore architectures. However, programming multicore machines is much more difficult due to oddities existing in the architectures.

Offering insights into different facets of this area, Multicore Computing: Algorithms, Architectures, and Applications focuses on the architectures, algorithms, and applications of multicore computing. It will help readers understand the intricacies of these architectures and prepare them to design efficient multicore algorithms.

Contributors at the forefront of the field cover the memory hierarchy for multicore and manycore processors, the caching strategy Flexible Set Balancing, the main features of the latest SPARC architecture specification, the Cilk and Cilk++ programming languages, the numerical software library Parallel Linear Algebra Software for Multicore Architectures (PLASMA), and the exact multipattern string matching algorithm of Aho-Corasick. They also describe the architecture and programming model of the NVIDIA Tesla GPU, discuss scheduling directed acyclic graphs onto multi/manycore processors, and evaluate design trade-offs among Intel and AMD multicore processors, IBM Cell Broadband Engine, and NVIDIA GPUs. In addition, the book explains how to design algorithms for the Cell Broadband Engine and how to use the backprojection algorithm for generating images from synthetic aperture radar data.

Table of Contents

Memory Hierarchy for Multicore and Manycore Processors

Mohamed Zahran and Bushra Ahsan

Design Issues

Physical Memory

Cache Hierarchy Organization

Cache Hierarchy Sharing

Cache Hierarchy Optimization

Cache Coherence

Support for Memory Consistency Models

Cache Hierarchy in Light of New Technologies

Concluding Remarks

FSB: A Flexible Set Balancing Strategy for Last Level Caches

Mohammad Hammoud, Sangyeun Cho, and Rami Melhem


Motivation and Background

Flexible Set Balancing (FSB)

Quantitative Evaluation

Related Work

Conclusions and Future Work

The SPARC Processor Architecture

Simone Secchi, Antonino Tumeo, and Oreste Villa


The SPARC Instruction Set Architecture

Memory Access


The NIAGARA Processor Architecture

Core Micro-Architecture

Core Interconnection

Memory Subsystem

Niagara Evolutions

The Cilk and Cilk++ Programming Languages

Hans Vandierendonck



The Cilk Language


Analyzing Parallelism in Cilk Programs



Multithreading in the PLASMA Library

Jakub Kurzak, Piotr Luszczek, Asim YarKhan, Mathieu Faverge, Julien Langou, Henricus Bouwmeester, and Jack Dongarra


Multithreading in PLASMA

Dynamic Scheduling with QUARK

Parallel Composition

Task Aggregation

Nested Parallelism

Efficient Aho-Corasick String Matching on Emerging Multicore Architectures

Antonino Tumeo, Oreste Villa, Simone Secchi, and Daniel Chavarria-Miranda


Related Work


Algorithm Design

Experimental Results


Sorting on a Graphics Processing Unit (GPU)

Shibdas Bandyopadhyay and Sartaj Sahni

Graphics Processing Units

Sorting Numbers on GPUs

Sorting Records on GPUs

Scheduling DAG Structured Computations

Yinglong Xia and Viktor K. Prasanna



Related Work

Lock-Free Collaborative Scheduling

Hierarchical Scheduling with Dynamic Thread Grouping


Evaluating Multicore Processors and Accelerators for Dense Numerical Computations

Seunghwa Kang, Nitin Arora, Aashay Shringarpure, Richard W. Vuduc, and David A. Bader


Interarchitectural Design Trade-Offs

Descriptions and Qualitative Analysis of Computational Statistics Kernels

Baseline Architecture-Specific Implementations for the Computational Statistics Kernels

Experimental Results for the Computational Statistics Kernels

Descriptions and Qualitative Analysis of Direct N-Body Kernels

Direct N-Body Implementations

Experimental Results and Discussion for the Direct N-Body Implementations


Sorting on the Cell Broadband Engine

Shibdas Bandyopadhyay, Dolly Sharma, Reda A. Ammar, Sanguthevar Rajasekaran, and Sartaj Sahni

The Cell Broadband Engine

High-level Strategies for Sorting

SPU Vector and Memory Operations

Sorting Numbers

Sorting Records

GPU Matrix Multiplication

Junjie Li, Sanjay Ranka, and Sartaj Sahni


GPU Architecture

Programming Model


Single Core Matrix Multiply

Multicore Matrix Multiply

GPU Matrix Multiply

A Comparison

Backprojection Algorithms for Multicore and GPU Architectures

William Chapman, Sanjay Ranka, Sartaj Sahni, Mark Schmalz, Linda Moore, Uttam Majumder, and Bracy Elton

Summary of Backprojection

Partitioning Backprojection for Implementation on a GPU

Single Core Backprojection

GPU Backprojection




About the Editors

Sanguthevar Rajasekaran is the UTC Chair Professor of Computer Science and Engineering and director of the Booth Engineering Center for Advanced Technologies at the University of Connecticut. He received a Ph.D. in computer science from Harvard University. He is a fellow of the IEEE and the AAAS and an elected member of the Connecticut Academy of Science and Engineering. His research interests include bioinformatics, parallel algorithms, data mining, randomized computing, computer simulations, and combinatorial optimization.

Lance Fiondella is an assistant professor in the Department of Electrical and Computer Engineering at the University of Massachusetts Dartmouth. He received a Ph.D. in computer science and engineering from the University of Connecticut. His research interests include algorithms, reliability engineering, and risk analysis.

Mohamed Ahmed is a program manager at Microsoft Windows Azure Mobile. He received a PhD in computer science and engineering from the University of Connecticut. His research interests include multi/many-cores technologies, high-performance computing, parallel programming, cloud computing, and GPU programming.

Reda A. Ammar is a professor and the head of the Department of Computer Science and Engineering at the University of Connecticut. He received a PhD in computer science from the University of Connecticut. He is the president of the International Society of Computers and Their Applications and editor-in-chief of the International Journal on Computers and Their Applications. His primary research interests encompass distributed and high-performance computing and real-time systems.

About the Series

Chapman & Hall/CRC Computer and Information Science Series

Learn more…

Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Programming Languages / General
MATHEMATICS / Arithmetic