Exploratory Data Analysis with MATLAB, Second Edition

By Wendy L. Martinez, Angel Martinez, Jeffrey Solka

© 2010 – CRC Press

536 pages | 15 Color Illus. | 133 B/W Illus.

Purchasing Options:
Hardback: 9781439812204
pub: 2010-12-16
US Dollars $99.95

About the Book

Since the publication of the bestselling first edition, many advances have been made in exploratory data analysis (EDA). Covering innovative approaches for dimensionality reduction, clustering, and visualization, Exploratory Data Analysis with MATLAB®, Second Edition uses numerous examples and applications to show how the methods are used in practice.

New to the Second Edition

  • Discussions of nonnegative matrix factorization, linear discriminant analysis, curvilinear component analysis, independent component analysis, and smoothing splines
  • An expanded set of methods for estimating the intrinsic dimensionality of a data set
  • Several clustering methods, including probabilistic latent semantic analysis and spectral-based clustering
  • Additional visualization methods, such as a rangefinder boxplot, scatterplots with marginal histograms, biplots, and a new method called Andrews’ images
  • Instructions on a free MATLAB GUI toolbox for EDA

Like its predecessor, this edition continues to focus on using EDA methods rather than on their theoretical aspects. The MATLAB code for the examples, the EDA toolboxes, the data sets, and color versions of all figures are available for download at http://pi-sigma.info


"This book presents a broad panoply of data-analytical methods implemented in MATLAB. … the amount of material covered is impressive. The explanations are clear, and the fluid style makes reading pleasant. … very useful for the applied statistician. Its material may also be employed as a complement to a more theoretical-oriented course."

—R. Maronna, Statistical Papers, Vol. 55, 2014

"The book is very helpful for applied data analysts as an excellent compact overview of popular available methods supplied with a MATLAB code. … Common features and differences between various methods are carefully explained and the book is well understandable from the perspective of the users. … The book, written by very experienced authors, can be strongly recommended as an excellent manual for MATLAB users who need to extract information from their data."

—Jan Kalina, ISCB Newsletter, June 2013

"The authors present an intuitive and easy-to-read book. … accompanied by many examples, proposed exercises, good references, and comprehensive appendices that initiate the reader unfamiliar with MATLAB. … a great contribution to the field of data analysis, which I am sure will be useful for researchers and practitioners."

—Adolfo Alvarez Pinto, International Statistical Review (2011), 79

"Practitioners of EDA who use MATLAB will want a copy of this book. … The authors discuss many EDA methods, including graphical approaches. With the book comes the EDA Toolbox (downloadable from the text website) for use with MATLAB. It contains code for all of the algorithms discussed in the text.

… the authors strategically inject helpful observations and guidance into the examples throughout the book.

… this book does not merely document routines; it shows how to do EDA. The helpful summaries, intuitive explanations, and comprehensive examples make the text so much more than a software cookbook. … The authors have done a great service by bringing together so many EDA routines, but their main accomplishment in this dynamic text is providing the understanding and tools to do EDA.

This text, along with the EDA Toolbox, is an excellent resource. Even readers with limited background can quickly be analyzing data and plotting it in interesting ways. For practitioners of EDA who use MATLAB, and ideally also the Statistics Toolbox, I highly recommend this book."

—MAA Reviews, April 2011

Praise for the First Edition:

"This book … has a good introduction to EDA, and then illustrates several applications where MATLAB provides the analysis of data to produce unexpected results."


"The audience for the book is a wide one and includes statisticians, computer scientists, and others who may be interested in or use EDA. … I found the book to be engagingly written, and successful in its defined task of teaching the reader to use EDA with MATLAB. I liked the graphics and thought that they fully illustrated the techniques used."

—Brian Jersky, Journal of the American Statistical Association

"The book can also be useful in a classroom setting at the senior undergraduate and graduate level, valuable exercises being included in each chapter."

—Neculai Curteanu, Zentralblatt MATH

Table of Contents


Introduction to Exploratory Data Analysis

What Is Exploratory Data Analysis

Overview of the Text

A Few Words about Notation

Data Sets Used in the Book

Transforming Data


Dimensionality Reduction - Linear Methods


Principal Component Analysis (PCA)

Singular Value Decomposition (SVD)

Nonnegative Matrix Factorization

Factor Analysis

Fisher’s Linear Discriminant

Intrinsic Dimensionality

Dimensionality Reduction - Nonlinear Methods

Multidimensional Scaling (MDS)

Manifold Learning

Artificial Neural Network Approaches

Data Tours

Grand Tour

Interpolation Tours

Projection Pursuit

Projection Pursuit Indexes

Independent Component Analysis

Finding Clusters


Hierarchical Methods

Optimization Methods—k-Means

Spectral Clustering

Document Clustering

Evaluating the Clusters

Model-Based Clustering

Overview of Model-Based Clustering

Finite Mixtures

Expectation-Maximization Algorithm

Hierarchical Agglomerative Model-Based Clustering

Model-Based Clustering

MBC for Density Estimation and Discriminant Analysis

Generating Random Variables from a Mixture Model

Smoothing Scatterplots

Robust Loess

Residuals and Diagnostics with Loess

Smoothing Splines

Choosing the Smoothing Parameter

Bivariate Distribution Smooths

Curve Fitting Toolbox


Visualizing Clusters

Rectangle Plots

ReClus Plots

Data Image

Distribution Shapes

Quantile Plots


Rangefinder Boxplot

Multivariate Visualization

Glyph Plots


Dynamic Graphics


Dot Charts

Plotting Points as Curves

Data Tours Revisited


Appendix A: Proximity Measures

Appendix B: Software Resources for EDA

Appendix C: Description of Data Sets

Appendix D: Introduction to MATLAB

Appendix E: MATLAB Functions

Summary, Further Reading, and Exercises appear at the end of each chapter.

About the Authors

Wendy L. Martinez has been in government service for over 20 years, working with leading researchers from academia, industry, and government labs. During this time, she has conducted and published research in text data mining, probability density estimation, signal processing, scientific visualization, and statistical pattern recognition. A fellow of the American Statistical Association, she earned an M.S. in aerospace engineering from George Washington University and a Ph.D. in computational sciences and informatics from George Mason University.

Angel R. Martinez teaches undergraduate and graduate courses in statistics and mathematics at Strayer University. Before retiring from government service, he worked for the U.S. Navy as an operations research analyst and a computer scientist. He earned an M.S. in systems engineering from the Virginia Polytechnic Institute and State University and a Ph.D. in computational sciences and informatics from George Mason University.

Since 1984, Jeffrey L. Solka has been working in statistical pattern recognition for the Department of the Navy. He has published over 120 journal, conference, and technical papers; has won numerous awards; and holds 4 patents. He earned an M.S. in mathematics from James Madison University, an M.S. in physics from Virginia Polytechnic Institute and State University, and a Ph.D. in computational sciences and informatics from George Mason University.

About the Series

Chapman & Hall/CRC Computer Science & Data Analysis

Subject Categories

BISAC Subject Codes/Headings:
MATHEMATICS / Probability & Statistics / General