1st Edition

Multicore Computing
Algorithms, Architectures, and Applications




ISBN 9781439854341
Published December 12, 2013 by Chapman and Hall/CRC
452 Pages 231 B/W Illustrations

USD $200.00

Prices & shipping based on shipping country


Preview

Book Description

Every area of science and engineering today has to process voluminous data sets. Using exact, or even approximate, algorithms to solve intractable problems in critical areas, such as computational biology, takes time that is exponential in some of the underlying parameters. Parallel computing addresses this issue and has become affordable with the advent of multicore architectures. However, programming multicore machines is much more difficult due to oddities existing in the architectures.

Offering insights into different facets of this area, Multicore Computing: Algorithms, Architectures, and Applications focuses on the architectures, algorithms, and applications of multicore computing. It will help readers understand the intricacies of these architectures and prepare them to design efficient multicore algorithms.

Contributors at the forefront of the field cover the memory hierarchy for multicore and manycore processors, the caching strategy Flexible Set Balancing, the main features of the latest SPARC architecture specification, the Cilk and Cilk++ programming languages, the numerical software library Parallel Linear Algebra Software for Multicore Architectures (PLASMA), and the exact multipattern string matching algorithm of Aho-Corasick. They also describe the architecture and programming model of the NVIDIA Tesla GPU, discuss scheduling directed acyclic graphs onto multi/manycore processors, and evaluate design trade-offs among Intel and AMD multicore processors, IBM Cell Broadband Engine, and NVIDIA GPUs. In addition, the book explains how to design algorithms for the Cell Broadband Engine and how to use the backprojection algorithm for generating images from synthetic aperture radar data.

Table of Contents

Memory Hierarchy for Multicore and Manycore Processors
Mohamed Zahran and Bushra Ahsan
Design Issues
Physical Memory
Cache Hierarchy Organization
Cache Hierarchy Sharing
Cache Hierarchy Optimization
Cache Coherence
Support for Memory Consistency Models
Cache Hierarchy in Light of New Technologies
Concluding Remarks

FSB: A Flexible Set Balancing Strategy for Last Level Caches
Mohammad Hammoud, Sangyeun Cho, and Rami Melhem
Introduction
Motivation and Background
Flexible Set Balancing (FSB)
Quantitative Evaluation
Related Work
Conclusions and Future Work

The SPARC Processor Architecture
Simone Secchi, Antonino Tumeo, and Oreste Villa
Introduction
The SPARC Instruction Set Architecture
Memory Access
Synchronization
The NIAGARA Processor Architecture
Core Micro-Architecture
Core Interconnection
Memory Subsystem
Niagara Evolutions

The Cilk and Cilk++ Programming Languages
Hans Vandierendonck
Abstract
Introduction
The Cilk Language
Implementation
Analyzing Parallelism in Cilk Programs
Hyperobjects
Conclusion

Multithreading in the PLASMA Library
Jakub Kurzak, Piotr Luszczek, Asim YarKhan, Mathieu Faverge, Julien Langou, Henricus Bouwmeester, and Jack Dongarra
Introduction
Multithreading in PLASMA
Dynamic Scheduling with QUARK
Parallel Composition
Task Aggregation
Nested Parallelism

Efficient Aho-Corasick String Matching on Emerging Multicore Architectures
Antonino Tumeo, Oreste Villa, Simone Secchi, and Daniel Chavarria-Miranda
Introduction
Related Work
Preliminaries
Algorithm Design
Experimental Results
Conclusions

Sorting on a Graphics Processing Unit (GPU)
Shibdas Bandyopadhyay and Sartaj Sahni
Graphics Processing Units
Sorting Numbers on GPUs
Sorting Records on GPUs

Scheduling DAG Structured Computations
Yinglong Xia and Viktor K. Prasanna
Introduction
Background
Related Work
Lock-Free Collaborative Scheduling
Hierarchical Scheduling with Dynamic Thread Grouping
Conclusion

Evaluating Multicore Processors and Accelerators for Dense Numerical Computations
Seunghwa Kang, Nitin Arora, Aashay Shringarpure, Richard W. Vuduc, and David A. Bader
Introduction
Interarchitectural Design Trade-Offs
Descriptions and Qualitative Analysis of Computational Statistics Kernels
Baseline Architecture-Specific Implementations for the Computational Statistics Kernels
Experimental Results for the Computational Statistics Kernels
Descriptions and Qualitative Analysis of Direct N-Body Kernels
Direct N-Body Implementations
Experimental Results and Discussion for the Direct N-Body Implementations
Conclusions

Sorting on the Cell Broadband Engine
Shibdas Bandyopadhyay, Dolly Sharma, Reda A. Ammar, Sanguthevar Rajasekaran, and Sartaj Sahni
The Cell Broadband Engine
High-level Strategies for Sorting
SPU Vector and Memory Operations
Sorting Numbers
Sorting Records

GPU Matrix Multiplication
Junjie Li, Sanjay Ranka, and Sartaj Sahni
Introduction
GPU Architecture
Programming Model
Occupancy
Single Core Matrix Multiply
Multicore Matrix Multiply
GPU Matrix Multiply
A Comparison

Backprojection Algorithms for Multicore and GPU Architectures
William Chapman, Sanjay Ranka, Sartaj Sahni, Mark Schmalz, Linda Moore, Uttam Majumder, and Bracy Elton
Summary of Backprojection
Partitioning Backprojection for Implementation on a GPU
Single Core Backprojection
GPU Backprojection
Conclusion
Acknowledgments

Index

...
View More

Editor(s)

Biography

Sanguthevar Rajasekaran is the UTC Chair Professor of Computer Science and Engineering and director of the Booth Engineering Center for Advanced Technologies at the University of Connecticut. He received a Ph.D. in computer science from Harvard University. He is a fellow of the IEEE and the AAAS and an elected member of the Connecticut Academy of Science and Engineering. His research interests include bioinformatics, parallel algorithms, data mining, randomized computing, computer simulations, and combinatorial optimization.

Lance Fiondella is an assistant professor in the Department of Electrical and Computer Engineering at the University of Massachusetts Dartmouth. He received a Ph.D. in computer science and engineering from the University of Connecticut. His research interests include algorithms, reliability engineering, and risk analysis.

Mohamed Ahmed is a program manager at Microsoft Windows Azure Mobile. He received a PhD in computer science and engineering from the University of Connecticut. His research interests include multi/many-cores technologies, high-performance computing, parallel programming, cloud computing, and GPU programming.

Reda A. Ammar is a professor and the head of the Department of Computer Science and Engineering at the University of Connecticut. He received a PhD in computer science from the University of Connecticut. He is the president of the International Society of Computers and Their Applications and editor-in-chief of the International Journal on Computers and Their Applications. His primary research interests encompass distributed and high-performance computing and real-time systems.