1st Edition

Next-Generation Sequencing Data Analysis

ISBN 9781482217889
Published February 24, 2016 by CRC Press
258 Pages, 48 B/W Illustrations

Book Description

A Practical Guide to the Highly Dynamic Area of Massively Parallel Sequencing

The development of genome and transcriptome sequencing technologies has led to a paradigm shift in life science research and disease diagnosis and prevention. Scientists are now able to see how human diseases and phenotypic changes are connected to DNA mutation, polymorphism, genome structure, and epigenomic abnormality. Next-Generation Sequencing Data Analysis shows how next-generation sequencing (NGS) technologies are applied to transform nearly all aspects of biological research.

The book walks readers through the multiple stages of NGS data generation and analysis in an easy-to-follow fashion. It covers every step in each stage: the planning stage (experimental design, sample processing, and sequencing strategy formulation); the early stage (base calling, read quality checks, and data preprocessing); the intermediate stage (mapping reads to a reference genome and normalization); and the more advanced stages specific to each application. All major applications of NGS are covered, including:

  • RNA-seq: mRNA-seq and small RNA-seq
  • Genotyping and variant discovery through genome re-sequencing
  • De novo genome assembly
  • ChIP-seq to study DNA–protein interaction
  • Methylated DNA sequencing to study epigenetic regulation
  • Metagenome analysis through community genome shotgun sequencing

Before detailing the analytic steps for each of these applications, the book presents the ins and outs of the most widely used NGS platforms, with side-by-side comparisons of key technical aspects. This helps practitioners decide which platform to use for a particular project. The book also offers a perspective on the development of DNA sequencing technologies, from Sanger to future-generation sequencing technologies.

The book discusses concepts and principles that underlie each analytic step, along with software tools for implementation. It highlights key features of the tools while omitting tedious details to provide an easy-to-follow guide for practitioners in life sciences, bioinformatics, and biostatistics. In addition, references to detailed descriptions of the tools are given for further reading if needed. The accompanying website for the book provides step-by-step, real-world examples of how to apply the tools covered in the text to research projects. All the tools are freely available to academic users.

Table of Contents

Introduction to Cellular and Molecular Biology
The Cellular System and the Code of Life

The Cellular Challenge
How Cells Meet the Challenge
Molecules in Cells
Intracellular Structures or Spaces
The Cell as a System

DNA Sequence: The Genome Base
The DNA Double Helix and Base Sequence
How DNA Molecules Replicate and Maintain Fidelity
How the Genetic Information Stored in DNA Is Transferred to Protein
The Genomic Landscape
DNA Packaging, Sequence Access, and DNA–Protein Interactions
DNA Sequence Mutation and Polymorphism
Genome Evolution
Epigenome and DNA Methylation
Genome Sequencing and Disease Risk

RNA: The Transcribed Sequence
RNA as the Messenger
The Molecular Structure of RNA
Generation, Processing, and Turnover of RNA as a Messenger
RNA Is More Than a Messenger
The Cellular Transcriptional Landscape

Introduction to Next-Generation Sequencing (NGS) and NGS Data Analysis
NGS Technologies: Ins and Outs

How to Sequence DNA: From First Generation to the Next
A Typical NGS Experimental Workflow
Ins and Outs of Different NGS Platforms
Biases and Other Adverse Factors That May Affect NGS Data Accuracy
Major Applications of NGS

Early-Stage NGS Data Analysis: Common Steps
Base Calling, FASTQ File Format, and Base Quality Score
NGS Data Quality Control and Preprocessing
Reads Mapping
Tertiary Analysis

Computing Needs for NGS Data Management and Analysis
NGS Data Storage, Transfer, and Sharing
Computing Power Required for NGS Data Analysis
Software Needs for NGS Data Analysis
Bioinformatics Skills Required for NGS Data Analysis

Application-Specific NGS Data Analysis
Transcriptomics by RNA-Seq

Principle of RNA-Seq
Experimental Design
RNA-Seq Data Analysis
RNA-Seq as a Discovery Tool

Small RNA Sequencing
Small RNA NGS Data Generation and Upstream Processing
Identification of Differentially Expressed Small RNAs
Functional Analysis of Identified Small RNAs

Genotyping and Genomic Variation Discovery by Whole Genome Resequencing
Data Preprocessing, Mapping, Realignment, and Recalibration
Single Nucleotide Variant (SNV) and Indel Calling
Structural Variant (SV) Calling
Annotation of Called Variants
Testing of Variant Association with Diseases or Traits

De novo Genome Assembly from NGS Reads
Genomic Factors and Sequencing Strategies for de novo Assembly
Assembly of Contigs
Assembly Quality Evaluation
Gap Closure
Limitations and Future Development

Mapping Protein–DNA Interactions with ChIP-Seq
Principle of ChIP-Seq
Experimental Design
Read Mapping, Peak Calling, and Peak Visualization
Differential Binding Analysis
Functional Analysis
Motif Analysis
Integrated ChIP-Seq Data Analysis

Epigenomics and DNA Methylation Analysis by NGS
DNA Methylation Sequencing Strategies
DNA Methylation Sequencing Data Analysis
Detection of Differentially Methylated Cytosines or Regions
Data Verification, Validation, and Interpretation

Metagenome Analysis by NGS
Experimental Design and Sample Preparation
Sequencing Approaches
Overview of Whole-Genome Shotgun (WGS) Metagenome Sequencing Data Analysis
Sequencing Data Quality Control and Preprocessing
Taxonomic Characterization of a Microbial Community
Functional Characterization of a Microbial Community
Comparative Metagenomic Analysis
Integrated Metagenomics Data Analysis Pipelines
Metagenomics Data Repositories

The Changing Landscape of NGS Technologies and Data Analysis
What Is Next for NGS?

The Changing Landscape of NGS
Rapid Evolution and Growth of Bioinformatics Tools for High-Throughput Sequencing Data Analysis
Standardization and Streamlining of NGS Analytic Pipelines
Parallel Computing
Cloud Computing

Appendix A: Common File Types Used in NGS Data Analysis
Appendix B: Glossary



Dr. Xinkun "Sequen" Wang is the director of the NUSeq Core Facility and research associate professor in the Department of Biochemistry and Molecular Genetics at Northwestern University. He was previously an associate research professor of neurogenomics in the Higuchi Biosciences Center and Department of Pharmacology and Toxicology at the University of Kansas, where he was also the director of the Genomics Facility and Genome Sequencing Core. Dr. Wang’s research focuses on unraveling genomic changes that underlie neurodegeneration in brain aging and neurodegenerative diseases, such as Alzheimer’s disease.