
Image and Video Compression for Multimedia Engineering

Fundamentals, Algorithms, and Standards, Third Edition

By Yun-Qing Shi, Huifang Sun

CRC Press

634 pages | 280 B/W Illus.

Hardback: 9781138299597
pub: 2019-03-25


The latest edition provides a comprehensive foundation for image and video compression. It covers HEVC/H.265 and future video coding activities, in addition to Internet Video Coding. The book features updated chapters and content, along with several new chapters and sections. It adheres to the current international standards, including the JPEG standard.

Table of Contents


1.1 Practical Needs for Image and Video Compression

1.2 Feasibility of Image and Video Compression

1.2.1 Statistical Redundancy

1.2.2 Psychovisual Redundancy

1.3 Visual Quality Measurement

1.3.1 Subjective Quality Measurement

1.3.2 Objective Quality Measurement

1.4 Information Theory Results

1.4.1 Entropy

1.4.2 Shannon’s Noiseless Source Coding Theorem

1.4.3 Shannon’s Noisy Channel Coding Theorem

1.4.4 Shannon’s Source Coding Theorem

1.4.5 Information Transmission Theorem

1.5 Summary

1.6 Exercises

1.7 References


2.1 Quantization and the Source Encoder

2.2 Uniform Quantization

2.2.1 Basics

2.2.2 Optimum Uniform Quantizer

2.3 Nonuniform Quantization

2.3.1 Optimum (Nonuniform) Quantization

2.3.2 Companding Quantization

2.4 Adaptive Quantization

2.4.1 Forward Adaptive Quantization

2.4.2 Backward Adaptive Quantization

2.4.3 Adaptive Quantization with a One-Word Memory

2.4.4 Switched Quantization

2.5 PCM

2.6 Summary

2.7 Exercises

2.8 References


3.1 Introduction to DPCM

3.1.1 Simple Pixel-to-Pixel DPCM

3.1.2 General DPCM Systems

3.2 Optimum Linear Prediction

3.2.1 Formulation

3.2.2 Orthogonality Condition and Minimum Mean Square Error

3.2.3 Solution to Yule-Walker Equations

3.3 Some Issues in the Implementation of DPCM

3.3.1 Optimum DPCM System

3.3.2 1-D, 2-D and 3-D DPCM

3.3.3 Order of Predictor

3.3.4 Adaptive Prediction

3.3.5 Effect of Transmission Errors

3.4 Delta Modulation

3.5 Interframe Differential Coding

3.5.1 Conditional Replenishment

3.5.2 3-D DPCM

3.5.3 Motion Compensated Predictive Coding

3.6 Information Preserving Differential Coding

3.7 Summary

3.8 Exercises

3.9 References


4.1 Introduction

4.1.1 Hotelling Transform

4.1.2 Statistical Interpretation

4.1.3 Geometrical Interpretation

4.1.4 Basis Vector Interpretation

4.1.5 Procedures of Transform Coding

4.2 Linear Transforms

4.2.1 2-D Image Transformation Kernel

4.2.2 Basis Image Interpretation

4.2.3 Subimage Size Selection

4.3 Transforms of Particular Interest

4.3.1 Discrete Fourier Transform (DFT)

4.3.2 Discrete Walsh Transform (DWT)

4.3.3 Discrete Hadamard Transform (DHT)

4.3.4 Discrete Cosine Transform (DCT)

4.3.5 Performance Comparison

4.4 Bit Allocation

4.4.1 Zonal Coding

4.4.2 Threshold Coding

4.5 Some Issues

4.5.1 Effect of Transmission Error

4.5.2 Reconstruction Error Sources

4.5.3 Comparison Between DPCM and TC

4.5.4 Hybrid Coding

4.6 Summary

4.7 Exercises

4.8 References


5.1 Some Fundamental Results

5.1.1 Coding an Information Source

5.1.2 Some Desired Characteristics

5.1.3 Discrete Memoryless Sources

5.1.4 Extensions of a Discrete Memoryless Source

5.2 Huffman Codes

5.2.1 Required Rules for Optimum Instantaneous Codes

5.2.2 Huffman Coding Algorithm

5.3 Modified Huffman Codes

5.3.1 Motivation

5.3.2 Algorithm

5.3.3 Codebook Memory Requirement

5.3.4 Bounds on Average Codeword Length

5.4 Arithmetic Codes

5.4.1 Limitations of Huffman Coding

5.4.2 The Principle of Arithmetic Coding

5.4.3 Implementation Issues

5.4.4 History

5.4.5 Applications

5.5 Summary

5.6 Exercises

5.7 References


6.1 Markov Source Model

6.1.1 Discrete Markov Source

6.1.2 Extensions of a Discrete Markov Source

6.1.3 Autoregressive (AR) Model

6.2 Run-Length Coding (RLC)

6.2.1 1-D Run-Length Coding

6.2.2 2-D Run-Length Coding

6.2.3 Effect of Transmission Error and Uncompressed Mode

6.3 Digital Facsimile Coding Standards

6.4 Dictionary Coding

6.4.1 Formulation of Dictionary Coding

6.4.2 Categorization of Dictionary-Based Coding Techniques

6.4.3 Parsing Strategy

6.4.4 Sliding Window (LZ77) Algorithms

6.4.5 LZ78 Algorithms

6.5 International Standards for Lossless Still Image Compression

6.5.1 Lossless Bilevel Still Image Compression

6.5.2 Lossless Multilevel Still Image Compression

6.6 Summary

6.7 Exercises

6.8 References

CHAPTER 7 Some Material Related to Multimedia Engineering

7.1 Digital Watermarking

7.1.1 Where to embed digital watermark

7.1.2 Watermark signal with one random binary sequence

7.1.3 Challenge faced by digital watermarking

7.1.4 Watermark embedded into the DC component

7.1.5 Digital watermark with multiple information bits and error correction coding

7.1.6 Conclusion

7.2 Reversible Data Hiding

7.3 Information Forensics



8.1 Introduction

8.2 Sequential DCT-based Encoding Algorithm

8.3 Progressive DCT-based Encoding Algorithm

8.4 Lossless Coding Mode

8.5 Hierarchical Coding Mode

8.6 Summary

8.7 Exercises

8.8 References


9.1 A Review of Wavelet Transform

9.1.1 Definition and Comparison with Short-time Fourier Transform

9.1.2 Discrete Wavelet Transform

9.1.3 Lifting Scheme: Three Steps in Forward Wavelet Transform; Inverse Wavelet Transform; Lifting Version of CDF (2,2); A Numerical Example; (5,3) Integer Wavelet Transform; A Demonstration Example of (5,3) Integer Wavelet Transform; Summary

9.2 Digital Wavelet Transform for Image Compression

9.2.1 Basic Concept of Image Wavelet Transform Coding

9.2.2 Embedded Image Wavelet Transform Coding Algorithms: Early Wavelet Image Coding Algorithms and Their Drawbacks; Modern Wavelet Image Coding; Embedded Zero-Tree Wavelet (EZW) Coding; Set Partitioning In Hierarchical Trees (SPIHT) Coding

9.3 Wavelet transform for JPEG-2000

9.3.1 Introduction of JPEG-2000: Requirements of JPEG-2000; Parts of JPEG-2000

9.3.2 Verification Model of JPEG-2000

9.3.3 An Example of Performance Comparison between JPEG and JPEG-2000

9.4 Summary

9.5 Exercises

9.6 References


10.1 Introduction

10.2 Vector Quantization

10.2.1 Basic Principle of Vector Quantization

10.2.2 Several image coding schemes with vector quantization: Residual VQ; Classified VQ; Transformed domain VQ; Predictive VQ; Block Truncation Coding

10.2.3 Lattice VQ for Image Coding

10.3 Fractal Image Coding

10.3.1 Mathematical Foundation

10.3.2 IFS-based Fractal Image Coding

10.3.3 Other Fractal Image Coding Methods

10.4 Model-based Coding

10.4.1 Basic Concept

10.4.2 Image Modeling

10.5 Summary

10.6 Exercises

10.7 References


11.1 Image Sequences

11.2 Interframe Correlation

11.3 Frame Replenishment

11.4 Motion Compensated Coding

11.5 Motion Analysis

11.5.1 Biological Vision Perspective

11.5.2 Computer Vision Perspective

11.5.3 Signal Processing Perspective

11.6 Motion Compensation for Image Sequence Processing

11.6.1 Motion Compensated Interpolation

11.6.2 Motion Compensated Enhancement

11.6.3 Motion Compensated Restoration

11.6.4 Motion Compensated Down-conversion

11.7 Summary

11.8 Exercises

11.9 References



12.1 Non-overlapped, Equally Spaced, Fixed Size, Small Rectangular Block Matching

12.2 Matching Criteria

12.3 Searching Procedures

12.3.1 Full Search

12.3.2 2-D Logarithm Search

12.3.3 Coarse-fine Three-step Search

12.3.4 Conjugate Direction Search

12.3.5 Subsampling in the Correlation Window

12.3.6 Multiresolution Block Matching

12.3.7 Thresholding Multiresolution Block Matching

12.4 Matching Accuracy

12.5 Limitations with Block Matching Techniques

12.6 New Improvements

12.6.1 Hierarchical Block Matching

12.6.2 Multigrid Block Matching

12.6.3 Predictive Motion Field Segmentation

12.6.4 Overlapped Block Matching

12.7 Summary

12.8 Exercises

12.9 References




13.1 Problem Formulation

13.2 Descent Methods

13.2.1 First Order Necessary Conditions

13.2.2 Second Order Sufficient Conditions

13.2.3 Underlying Strategy

13.2.4 Convergence Speed

13.2.5 Steepest Descent Method

13.2.6 Newton-Raphson’s Method

13.2.7 Other methods

13.3 Netravali-Robbins’ Pel Recursive Algorithm

13.3.1 Inclusion of a neighborhood area

13.3.2 Interpolation

13.3.3 Simplification

13.3.4 Performance

13.4 Other Pel Recursive Algorithms

13.4.1 Bergmann’s algorithm (1982)

13.4.2 Bergmann’s algorithm (1984)

13.4.3 Cafforio and Rocca’s Algorithm

13.4.4 Walker and Rao’s algorithm

13.5 Performance Comparison

13.6 Summary

13.7 Exercises

13.8 References


14.1 Fundamentals

14.1.1 2-D Motion and Optical Flow

14.1.2 Aperture Problem

14.1.3 Ill-posed Problem

14.1.4 Classification of Optical Flow Techniques

14.2 Gradient-based Approach

14.2.1 Horn and Schunck's Method

14.2.2 Modified Horn and Schunck Method

14.2.3 Lucas and Kanade’s Method

14.2.4 Nagel's Method

14.2.5 Uras, Girosi, Verri and Torre’s Method

14.3 Correlation-based Approach

14.3.1 Anandan’s Method

14.3.2 Singh's Method

14.3.3 Pan, Shi and Shu's Method

14.4 Multiple Attributes for Conservation Information

14.4.1 Weng, Ahuja and Huang’s Method

14.4.2 Xia and Shi’s Method

14.5 Summary

14.6 Exercises

14.7 References

CHAPTER 15 Further Discussion and Summary on 2-D Motion Estimation

15.1 General Characterization

15.1.1 Aperture Problem

15.1.2 Ill-posed Inverse Problem

15.1.3 Conservation Information and Neighborhood Information

15.1.4 Occlusion and Disocclusion

15.1.5 Rigid and Nonrigid Motion

15.2 Different Classifications

15.2.1 Deterministic Methods vs. Stochastic Methods

15.2.2 Spatial Domain Methods vs. Frequency Domain Methods

15.2.3 Region-based Approaches vs. Gradient-based Approaches

15.2.4 Forward vs. Backward Motion Estimation

15.3 Performance Comparison Between Three Major Approaches

15.3.1 Three Representatives

15.3.2 Algorithm Parameters

15.3.3 Experimental Results and Observations

15.4 New Trends

15.4.1 DCT-Based Motion Estimation

15.5 Summary

15.6 Exercises

15.7 References


CHAPTER 16 Fundamentals of digital video coding

16.1 Digital Video Representation

16.2 Information Theory Results: Rate Distortion Function of Video Signal

16.3 Digital Video Formats

16.3.1 Digital Video Color Systems

16.3.2 Progressive and Interlaced video signals

16.3.3 Video formats used by video industry

16.4 Current Status of Digital Video/Image Coding Standards

16.5 Summary

16.6 Exercises

16.7 References

CHAPTER 17 Digital video coding standards - MPEG-1/2 Video

17.1 Introduction

17.2 Features of MPEG-1/2 Video Coding

17.2.1 MPEG-1 Features: Introduction; Layered Structure Based on Group of Pictures; Encoder Structure; Structure of the Compressed Bitstream; Decoding Process

17.2.2 MPEG-2 Enhancement: Field/frame Prediction Mode; Field/frame DCT Coding Syntax; Downloadable Quantization Matrix and Alternative Scan Order; Pan and Scan; Concealment Motion Vectors; Scalability

17.3 MPEG-2 Video Encoding

17.3.1 Introduction

17.3.2 Pre-processing

17.3.3 Motion Estimation and Motion Compensation

17.4 Rate Control

17.4.1 Introduction of Rate Control

17.4.2 Rate Control of Test Model 5 (TM5) for MPEG-2

17.5 Optimum Mode Decision

17.5.1 Problem Formation

17.5.2 Procedure for Obtaining the Optimal Mode

17.5.3 Practical Solution with New Criteria for the Selection of Coding Mode

17.6 Statistical Multiplexing Operations on Multiple Program Encoding

17.6.1 Background of Statistical Multiplexing Operation

17.6.2 VBR Encoders in StatMux

17.6.3 Research Topics of StatMux

17.7 Summary

17.8 Exercises

17.9 References

CHAPTER 18 Application issues of MPEG-1/2 video coding

18.1 Introduction

18.2 ATSC DTV Standards

18.2.1 A Brief History

18.2.2 Technical Overview of ATSC Systems: Picture Layer; Compression Layer; Transport Layer

18.3 Transcoding with Bitstream Scaling

18.3.1 Background

18.3.2 Basic Principles of Bitstream Scaling

18.3.3 Architectures of Bitstream Scaling: Architecture 1: Cutting AC Coefficients; Architecture 2: Increasing Quantization Step; Architecture 3: Re-encoding with Old Motion Vectors and Old Decisions; Architecture 4: Re-encoding with Old Motion Vectors and New Decisions; Comparison of Bitstream Scaling Methods

18.3.4 MPEG-2 to MPEG-4 transcoding

18.4 Down Conversion Decoder

18.4.1 Background

18.4.2 Frequency Synthesis Down-conversion

18.4.3 Low-resolution Motion Compensation

18.4.4 Three-layer Scalable Decoder

18.4.5 Summary of Down Conversion Decoder

18.5 Error Concealment

18.5.1 Background

18.5.2 Error Concealment Algorithms

18.5.3 Algorithm Enhancements

18.5.4 Summary of Error Concealment

18.6 Summary

18.7 Exercises

18.8 References


19.1 Introduction

19.2 MPEG-4 Requirements and Functionalities

19.2.1 Content-based Interactivity

19.2.2 Content-based Efficient Compression: Improved Coding Efficiency; Coding of Multiple Concurrent Data Streams

19.2.3 Universal Access: Robustness in Error-prone Environments; Content-based Scalability

19.2.4 Summary of MPEG-4 features

19.3 Technical Description of MPEG-4 Video

19.3.1 Overview of MPEG-4 Video

19.3.2 Motion Estimation and Compensation: Adaptive Selection of 16x16 Block or Four 8x8 Blocks; Overlapped Motion Compensation

19.3.3 Texture Coding: INTRA DC and AC prediction; Motion Estimation/Compensation of Arbitrary Shaped VOP; Texture coding of arbitrary shaped VOP

19.3.4 Shape Coding: Binary shape coding with CAE algorithm; Gray Scale Shape coding

19.3.5 Sprite Coding

19.3.6 Interlaced Video Coding

19.3.7 Wavelet-based Texture Coding: Decomposition of the Texture Information; Quantization of Wavelet Coefficients; Coding of Wavelet Coefficients of Low-Low Band and Other Bands; Adaptive Arithmetic Coder

19.3.8 Generalized Spatial and Temporal Scalability

19.3.9 Error Resilience

19.4 MPEG-4 Visual Bitstream Syntax and Semantics

19.5 MPEG-4 Visual Profiles and Levels

19.6 MPEG-4 Video Verification Model

19.6.1 VOP-based Encoding and Decoding Process

19.6.2 Video Encoder

19.6.3 Video Decoder

19.7 Summary

19.8 Exercises

19.9 References



20.1 Introduction

20.2 H.261 Video Coding Standard

20.2.1 Overview of H.261 Video Coding Standard

20.2.2 Technical Detail of H.261

20.2.3 Syntax Description

20.3 H.263 Video Coding Standard

20.3.1 Overview of H.263 Video Coding

20.3.2 Technical Features of H.263

20.4 H.263 Video Coding Standard Version 2

20.4.1 Overview of H.263 Version 2

20.4.2 New Features of H.263 Version 2

20.5 H.263++ Video Coding and H.26L

20.6 Summary

20.7 Exercises

20.8 References

CHAPTER 21 Video Coding Standard - H.264/AVC

21.1 Introduction

21.2 Overview of H.264/AVC codec structure

21.3 Technical description of H.264/AVC coding tools

21.3.1 Instantaneous Decoding Refresh (IDR) Picture

21.3.2 Switching I (SI)-slices and Switching P (SP)-slices

21.3.3 Transform and quantization

21.3.4 Intra frame coding with directional spatial prediction

21.3.5 Adaptive block size motion compensation

21.3.6 Motion compensation with multiple references

21.3.7 Entropy coding

21.3.8 Loop Filter

21.3.9 Error resilience tools

21.4 Profiles and levels of H.264/AVC

21.4.1 Profiles of H.264/AVC

21.4.2 Levels of H.264/AVC

21.5 Summary

21.6 Exercises

21.7 References

CHAPTER 22 New Video Coding Standard - HEVC/H.265

22.1 Introduction

22.2 Overview of HEVC/H.265 codec structure

22.3 Technical description of HEVC/H.265 coding tools

22.3.1 Video coding block structure

22.3.2 Predictive coding structure

22.3.3 Transform and quantization

22.3.4 Loop filters

22.3.5 Entropy coding

22.3.6 Parallel processing tools

22.4 HEVC/H.265 profiles and range extensions

22.4.1 Version 1 of HEVC/H.265

22.4.2 Version 2 of HEVC/H.265

22.4.3 Version 3 and 4 of HEVC/H.265

22.5 Performance comparison with H.264/AVC

22.5.1 Technical differences between H.264/AVC and HEVC/H.265

22.5.2 Performance comparison between H.264/AVC and HEVC/H.265

22.6 Summary

22.7 Exercises

22.8 References

CHAPTER 23 Internet Video Coding Standard - IVC

23.1 Introduction

23.2 Coding structure of IVC standard

23.2.1 Adaptive transform

23.2.2 Intra prediction

23.2.3 Inter prediction

23.2.4 Motion vector prediction

23.2.5 Sub-pel interpolation

23.2.6 Reference frames

23.2.7 Entropy coding

23.2.8 Loop filtering

23.3 Performance evaluation

23.4 Summary

23.5 Exercises

23.6 References

CHAPTER 24 MPEG System - Video, Audio and Data Multiplexing

24.1 Introduction

24.2 MPEG-2 System

24.2.1 Major Technical Definitions in MPEG-2 System Document

24.2.2 Transport Streams

24.2.3 Transport Streams Splicing

24.2.4 Program Streams

24.2.5 Timing Model and Synchronization

24.3 MPEG-4 System

24.3.1 Overview and Architecture

24.3.2 Systems Decoder Model

24.3.3 Scene Description

24.3.4 Object description framework

24.4 MPEG media transport (MMT)

24.4.1 Overview

24.4.2 MMT Content Model

24.4.3 Encapsulation of MPU

24.4.4 Packetized delivery of package

24.4.5 Cross layer interface

24.4.6 Signaling

24.4.7 Hypothetical receiver buffer model

24.5 Dynamic Adaptive Streaming over HTTP (DASH)

24.5.1 Introduction

24.5.2 Media presentation description (MPD)

24.5.3 Segment format

24.6 Summary

24.7 Exercises

24.8 References


About the Authors/Editor

Huifang Sun received the B.S. degree in Electrical Engineering from Harbin Engineering Institute (now Harbin Engineering University), Harbin, China, in 1967, and the Ph.D. degree in Electrical Engineering from the University of Ottawa, Ottawa, Canada. In 1986 he joined Fairleigh Dickinson University, Teaneck, New Jersey, as an Assistant Professor and was promoted to Associate Professor in 1990. From 1990 to 1995 he was with the David Sarnoff Research Center (Sarnoff Corp), Princeton, New Jersey, as a member of technical staff and was later promoted to Technology Leader of Digital Video Technology. He joined Mitsubishi Electric Research Laboratories (MERL) in 1995 as Senior Principal Technical Staff and was promoted to Deputy Director in 1997 and to Vice President and MERL Fellow in 2003; he is now a MERL Fellow. He holds 66 U.S. patents and has authored or co-authored 2 books as well as more than 150 journal and conference papers. For his contributions to HDTV development he received the 1994 Sarnoff Technical Achievement Award. He also received the best paper award of IEEE Transactions on Consumer Electronics in 1993, the best paper award of the International Conference on Consumer Electronics in 1997, and the best paper award of IEEE Transactions on CSVT in 2003. He was an Associate Editor for IEEE Transactions on Circuits and Systems for Video Technology and the Chair of the Visual Processing Technical Committee of the IEEE Circuits and Systems Society. He is an IEEE Life Fellow. He has also served as a guest professor at Peking University, Tianjin University, Shanghai Jiaotong University (as Guest Researcher), and several other universities in China.

Dr. Yun Qing Shi has been a professor with the Department of Electrical and Computer Engineering at the New Jersey Institute of Technology (NJIT), Newark, NJ, since 1987. He has authored or co-authored more than 300 papers in his research areas, a book on Image and Video Compression, three book chapters on Image Data Hiding, one book chapter on Steganalysis, and one book chapter on Digital Image Processing. He has edited more than 10 proceedings of international workshops and conferences, holds 29 awarded US patents, and has delivered more than 120 invited talks around the world. He is a member of the IEEE Circuits and Systems Society (CASS) Technical Committee of Visual Signal Processing and Communications and the Technical Committee of Multimedia Systems and Applications, and an associate editor of IEEE Transactions on Information Forensics and Security. He has been an IEEE Fellow since 2005 for his contributions to multidimensional signal processing.

About the Series

Image Processing Series


Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Computer Graphics