Distributed System Design: 1st Edition (Hardback) book cover

Distributed System Design

1st Edition

By Jie Wu

CRC Press

496 pages

Purchasing Options:$ = USD
Hardback: 9780849331787
pub: 1998-08-06
SAVE ~$35.00
eBook (VitalSource) : 9781315141411
pub: 2017-12-14
from $28.98

FREE Standard Shipping!


Future requirements for computing speed, system reliability, and cost-effectiveness entail the development of alternative computers to replace the traditional von Neumann organization. As computing networks come into being, one of the latest dreams is now possible - distributed computing.

Distributed computing brings transparent access to as much computer power and data as the user needs for accomplishing any given task - simultaneously achieving high performance and reliability.

The subject of distributed computing is diverse, and many researchers are investigating various issues concerning the structure of hardware and the design of distributed software. Distributed System Design defines a distributed system as one that looks to its users like an ordinary system, but runs on a set of autonomous processing elements (PEs) where each PE has a separate physical memory space and the message transmission delay is not negligible. With close cooperation among these PEs, the system supports an arbitrary number of processes and dynamic extensions.

Distributed System Design outlines the main motivations for building a distributed system, including:

  • inherently distributed applications

  • performance/cost

  • resource sharing

  • flexibility and extendibility

  • availability and fault tolerance

  • scalability

    Presenting basic concepts, problems, and possible solutions, this reference serves graduate students in distributed system design as well as computer professionals analyzing and designing distributed/open/parallel systems.

    Chapters discuss:

  • the scope of distributed computing systems

  • general distributed programming languages and a CSP-like distributed control description language (DCDL)

  • expressing parallelism, interprocess communication and synchronization, and fault-tolerant design

  • two approaches describing a distributed system: the time-space view and the interleaving view

  • mutual exclusion and related issues, including election, bidding, and self-stabilization

  • prevention and detection of deadlock

  • reliability, safety, and security as well as various methods of handling node, communication, Byzantine, and software faults

  • efficient interprocessor communication mechanisms as well as these mechanisms without specific constraints, such as adaptiveness, deadlock-freedom, and fault-tolerance

  • virtual channels and virtual networks

  • load distribution problems

  • synchronization of access to shared data while supporting a high degree of concurrency
  • Table of Contents



    Basic Computer Organizations

    Definition of a Distributed System

    Our Model

    Interconnection Networks

    Applications and Standards


    Source of References

    Distributed Programming Languages

    Requirement for Distributed Programming Support

    An Overview of Parallel/Distributed Programming Language

    Expressing Parallelism

    Process Communication and Synchronization

    Remote Procedure Calls


    Formal Approaches to Distributed Systems Design

    Introduction to Models

    Causally Related Events

    Global States

    Logical Clocks


    Classification of Distributed Control Algorithms

    The Complexity of Distributed Algorithms

    Mutual Exclusion and Election Algorithms

    Mutual Exclusion

    Non-Token-Based Solutions

    Token-Based Solutions




    Prevention, Avoidance, and Detection of Deadlock

    The Deadlock Problem

    Deadlock Prevention

    A Deadlock Prevention Example: Distributed Database Systems

    Deadlock Avoidance

    A Deadlock Prevention Example: Multi-Robot Flexible Assembly Cells

    Deadlock Detection and Recovery

    Deadlock Detection and Recovery Examples

    Distributed Routing Algorithms


    General-Purpose Shortest Path Routing

    Unicasting in Special-Purpose Networks

    Broadcasting in Special-Purpose Networks

    Multicasting in Special-Purpose Networks

    Adaptive Deadlock-Free, and Fault-Tolerant Routing

    Virtual Channels and Virtual Networks

    Fully Adaptive and Deadlock-Free Routing

    Partially Adaptive and Deadlock-Free Routing

    Fault-Tolerant Unicasting: General Approaches

    Fault-Tolerant Unicasting in 2-D Meshes and Tori

    Fault-Tolerant Unicasting in Hypercubes

    Fault-Tolerant Broadcasting

    Fault-Tolerant Multicasting

    Reliability in Distributed Systems

    Basic Models

    Building Blocks of Fault-Tolerant Design

    Handling of Node Faults

    Issues in Backward Recovery

    Handling of Byzantine Faults

    Handling of Communication Faults

    Handling of Storage Faults

    Static Load Distribution

    Classification of Load Distribution

    Static Load Distribution

    An Overview of Different Scheduling Models

    Task Scheduling Based on Task Precedent Graphs

    Case Study: Two Optimal Scheduling Algorithms

    Task Scheduling Based on Task Interaction Graphs

    Case Study: Domain Partition

    Scheduling Using Other Models and Objectives

    Future Directions

    Dynamic Load Distribution

    Dynamic Load Distribution

    Load Balancing Design Decisions

    Migration Policies: Sender-Initiated and Receiver-Initiated

    Parameters Used for Load Balancing

    Other Relevant Issues

    Sample Load Balancing Algorithms

    Case Study: Load Balancing on Hypercube Multicomputers

    Future Directions

    Distributed Data Management

    Basic Concepts

    Serializability Theory

    Concurrency Control

    Replica and Consistency Management

    Distributed Reliability Protocols

    Distributed System Applications

    Distributed Operating Systems

    Distributed File Systems

    Distributed Shared Memory

    Distributed Database Systems

    Heterogeneous Processing

    Future Directions of Distributed Systems

    Subject Categories

    BISAC Subject Codes/Headings:
    COMPUTERS / General
    COMPUTERS / Software Development & Engineering / General