1st Edition

Parallel Computing Architectures and APIs IoT Big Data Stream Processing

By Vivek Kale Copyright 2020
    406 Pages 65 B/W Illustrations
    by Chapman & Hall

    404 Pages 65 B/W Illustrations
    by Chapman & Hall

    Parallel Computing Architectures and APIs: IoT Big Data Stream Processing commences from the point at which high-performance uniprocessors were becoming increasingly complex, expensive, and power-hungry. A basic trade-off exists between the use of one or a small number of such complex processors, at one extreme, and a moderate to very large number of simpler processors, at the other. When combined with a high-bandwidth interprocessor communication facility, the latter approach leads to significant simplification of the design process. However, two major roadblocks prevent the widespread adoption of such moderately to massively parallel architectures: the interprocessor communication bottleneck, and the difficulty and high cost of algorithm/software development.

    One of the most important reasons for studying parallel computing architectures is to learn how to extract the best performance from parallel systems. Specifically, you must understand these architectures so that you can exploit them during programming via the standardized APIs.

    This book is useful for analysts, designers and developers of high-throughput computing systems, which are essential for processing the big data streams emanating from IoT-driven cyber-physical systems (CPS).

    This pragmatic book:

    • Devolves uniprocessors in terms of a ladder of abstractions to ascertain (say) performance characteristics at a particular level of abstraction
    • Explains the limitations on high uniprocessor performance in relation to Moore’s Law
    • Introduces basics of processors, networks and distributed systems
    • Explains characteristics of parallel systems, parallel computing models and parallel algorithms
    • Explains the three primary categorical representatives of parallel computing architectures, namely, shared memory, message passing and stream processing
    • Introduces the three primary categorical representatives of parallel programming APIs, namely, OpenMP, MPI and CUDA
    • Provides an overview of Internet of Things (IoT), wireless sensor networks (WSN), sensor data processing, Big Data and stream processing
    • Provides an introduction to 5G communications, Edge and Fog computing

    Parallel Computing Architectures and APIs: IoT Big Data Stream Processing discusses stream processing that enables the gathering, processing and analysis of high-volume, heterogeneous, continuous Internet of Things (IoT) big data streams, to extract insights and actionable results in real time. Application domains requiring data stream management include military, homeland security, sensor networks, financial applications, network management, web site performance tracking, real-time credit card fraud detection, etc.

    1 Uniprocessor Computers     
    1.1 Type of Computers
    1.2 Computer System
    1.3 Hardware and software logical equivalence
    1.4 Stack of Abstraction
    1.5 Application Programming Interfaces (APIs)
    1.6 Summary

    2 Processor Physics and Moore’s Law       
    2.1 Speed of processing and Power Problem
    2.2 Area, Delay and Power Consumption
    2.3 Area, Latency and Power tradeoffs
    2.4 Moore’s Law
    2.5 Performance Wall
    2.6 Summary


    Section I Genesis of Parallel Computing    

    3 Processor Basics       
    3.1 Processor
    3.2 Aspects of processor performance
    3.3 Enhancing uniprocessor performance
    3.4 Summary

    4 Networking Basics        
    4.1 Network Principles
    4.2 Types of Networks
    4.3 Network Models
    4.4 Interconnection Networks
    4.4.1 Ethernet
    4.4.2 Switches
    4.5 Summary

    5 Distributed Systems Basics        
    5.1 Distributed Systems
    5.2 Distributed system benefits
    5.3 Distributed Computation Systems
    5.4 Summary


    Section II Road to Parallel Computing    
    6 Parallel Systems        
    6.1 Flynn’s taxonomy for parallel computer architectures
    6.2 Types of parallel computers
    6.3 Characteristics of parallel systems
    6.5 Summary

    7 Parallel Computing Models     
    7.1 Shared Memory Models
    7.2 Interconnection Network Models
    7.3 Dataflow Model
    7.4 Summary

    8 Parallel Algorithms      
    8.1 Classes of Problems solvable through parallelization
    8.2 Types of Parallelization
    8.3 Granularity of Parallelization
    8.4 Assigning computational tasks to processors
    8.5 Illustrating design of a parallel algorithm
    8.6 Parallel Algorithms for Conventional Computations
    8.6.1 Parallel Prefix and Suffix Computations on a Linked List
    8.7 Parallel Algorithms for Unconventional Computations
    8.8 Summary


    Section III Parallel Computing Architectures   

    9 Parallel Computing Architecture Basics   
    9.1 High Performance Distributed Computing
    9.2 Performance evaluation
    9.3 Application and Architecture
    9.4 Maximum performance computing approach
    9.5 Parallel computing basics
    9.6 Parallel computing paradigms
    9.7 Summary

    10 Shared-memory Architecture     
    10.1 Shared memory paradigm
    10.2 Cache
    10.3 Write policy
    10.4 Cache coherency
    10.5 Memory consistency
    10.6 Summary

    11 Message Passing Architecture
    11.1 Message passing paradigm
    11.2 Routing
    11.3 Switching
    11.4 Summary

    12 Stream Processing Architecture    
    12.1 Data Flow Paradigm
    12.2 Parallel Accelerators
    12.3 Stream Processors
    12.4 Summary


    Section IV Parallel Computing APIs     

    13 Parallel Computing Programming Basics   
    13.1 Shared Memory Programming
    13.2 Message Passing Programming
    13.3 Stream Programming
    13.4 Summary
    Appendix 13A Functional Programming   
    Appendix 13B MapReduce     

    14 Shared-memory Parallel Programming with OpenMP 
    14.1 OpenMP
    14.2 Overview of features
    14.3 Additional feature details
    14.4 Summary

    15 Message Passing Parallel Programming with MPI
    15.1 Introduction to MPI
    15.2 Basic point-to-point communication routines
    15.3 Basic MPI collective communication routines
    15.4 Environment management routines
    15.5 Point to point communication routines
    15.6 Collective communication routines
    15.7 Summary

    16 Stream Processing Programming with CUDA, OpenCL and OpenACC
    16.1 CUDA
    16.2 OpenCL
    16.3 OpenACC
    16.4 Summary


    Section V IoT Big Data Stream Processing    

    17 Internet of Things Computing      
    17.1 Introduction to Internet of Things
    17.2 RFID (Radio Frequency Identification)
    17.3 Sensor Networks
    17.4 Summary
    Appendix 17A Internet of Things (IoT) in 5G Mobile Technologies
    Appendix 17B Edge and Fog Computing    

    18 Sensor Data Processing      
    18.1 Sensor Data-Gathering and Data-Dissemination Mechanisms
    18.2 Time Windows
    18.3 Sensor Database
    18.4 Data-Fusion Mechanisms
    18.5 Data Fusion Techniques, Methods, and Algorithms
    18.6 Data Fusion Architectures and Models
    18.7 Summary
    Appendix 18A Wireless Sensor Networks (WSN) Anomalies

    19 Big Data Computing      
    19.1 Introduction to Big Data
    19.2 Tools, Techniques and Technologies of Big Data
    19.3 NoSQL Databases
    19.4 Aadhaar Project
    19.5 Summary
    Appendix 19A Compute-intensive Big Compute versus data-intensive Big Data

    20 Stream Processing      
    20.1 Big Data Stream Processing
    20.2 Stream Processing System Implementations

    1. TelegraphCQ
    2. STREAM
    3. Aurora
    4. Borealis
    5. IBM System S and IBM SPADE

    1. Apache Storm
    2. Yahoo! S4
    3. Apache Samza
    4. Apache Streaming

    20.3 Summary
    Appendix 20A Spark      


    Epilogue: Quantum Computing      

    Bibliography

    Index

    Biography

    Vivek Kale has more than two decades of professional IT experience, during which he has handled and consulted on various aspects of enterprise-wide information modeling, enterprise architectures, business process re-design, and e-business architectures. He has been Group CIO of Essar Group, the steel/oil & gas major of India, as well as Raymond Ltd., the textile & apparel major of India. He is a seasoned practitioner in enhancing business agility through digital transformation of business models, enterprise architecture and business processes, and in enhancing IT-enabled enterprise intelligence (EQ). He has authored books on Cloud Computing and Big Data Computing. He is also the author of Big Data Computing: A Guide for Business and Technology Managers (CRC Press, 2016), Agile Network Businesses: Collaboration, Coordination, and Competitive Advantage (CRC Press, 2017), and Digital Transformation of Enterprise Architecture (CRC Press, 2020).