Statistical Data Mining Using SAS Applications

2nd Edition

By George Fernandez

CRC Press

477 pages | 151 B/W Illus.

Hardback: 9781439810750
pub: 2010-06-18
eBook (VitalSource) : 9780429131875
pub: 2010-06-18


Statistical Data Mining Using SAS Applications, Second Edition describes statistical data mining concepts and demonstrates the features of user-friendly data mining SAS tools. Integrating the statistical and graphical analysis tools available in SAS systems, the book provides complete statistical data mining solutions without requiring readers to write SAS program code or use the point-and-click approach. Each chapter emphasizes step-by-step instructions for using SAS macros and interpreting the results. Compiled data mining SAS macro files are available for download on the author’s website. By following the step-by-step instructions and downloading the SAS macros, analysts can perform complete data mining analyses quickly and effectively.

New to the Second Edition—General Features

  • Access to SAS macros directly from desktop
  • Compatible with SAS version 9, SAS Enterprise Guide, and SAS Learning Edition
  • Reorganization of all help files to an appendix
  • Ability to create publication quality graphics
  • Macro-call error check

New Features in These SAS-Specific Macro Applications

  • Converting PC data files to SAS data (EXLSAS2 macro)
  • Randomly splitting data (RANSPLIT2)
  • Frequency analysis (FREQ2)
  • Univariate analysis (UNIVAR2)
  • PCA and factor analysis (FACTOR2)
  • Multiple linear regressions (REGDIAG2)
  • Logistic regression (LOGIST2)
  • CHAID analysis (CHAID2)

Requiring no experience with SAS programming, this resource supplies instructions and tools for quickly performing exploratory statistical methods, regression analysis, logistic regression, multivariate methods, and classification analysis. It presents an accessible, SAS macro-oriented approach while offering comprehensive data mining solutions.


Its key features include the provision of case studies throughout the sections, downloadable macros and instructions on how to run them. … The step-by-step instructions and the graphical representations of data make it particularly useful to those wishing to communicate complex and technical data to a largely non-specialist audience.

—Kassim S. Mwitondi, Journal of Applied Statistics, 2012

If I had to recommend a good introduction to data mining, I would choose this one.

— J. A. Pardo, Complutense University of Madrid, Madrid, Spain, in Statistical Papers, 2012

Like the first edition of the book, this new edition provides a high-level introduction to some important concepts and algorithms in data mining. … the author presents broad statistical data mining solutions without writing SAS program codes. One of the nicest features of this book is that it gives access to SAS macros directly from the desktop and offers to create publication quality graphs. … this new edition provides a simple and straightforward introduction to data mining, along with a number of detailed, worked case studies.

Technometrics, February 2011

Praise for the First Edition:

The macros integrate nicely with SAS’s output delivery system … this is a book that could serve as an easy-to-read introduction to some classical statistical techniques that are used in data mining, and, with the associated macros, provide an opportunity to see those techniques in action.

Journal of the American Statistical Association, June 2004, Vol. 99, No. 466

Use of these data mining SAS macros facilitated reliable conversion, examination, and analysis of the data, and selection of best statistical models despite the great size of the data sets. …

—Christopher Ross, US Bureau of Land Management

An excellent treatment of data mining using SAS applications is provided in this book. … This book would be suitable for students (as a textbook), data analysts, and experienced SAS programmers. No SAS programming experience, however, is required to benefit from the book.

Computing Reviews, June 2003

… the book provides a welcome contrast to treatments of data mining that focus on only the most novel aspects of the subject. Dr. Fernandez is quite right in pointing out that a lot of data mining can be carried out by standard statistical methods in familiar packages. The book also has a healthy emphasis on the use of cross validation (a hallmark of data mining). This and other concepts are well illustrated with numerous examples. Finally, the book demonstrates that the fancy (and expensive) user interfaces sported by many data mining work benches are not essential to the data mining enterprise and might even be counterproductive.

Computational Statistics, 2005

Table of Contents

Data Mining: A Gentle Introduction


Data Mining: Why It Is Successful in the IT World

Benefits of Data Mining

Data Mining: Users

Data Mining: Tools

Data Mining: Steps

Problems in the Data Mining Process

SAS Software the Leader in Data Mining

Introduction of User-Friendly SAS Macros for Statistical Data Mining

Preparing Data for Data Mining


Data Requirements in Data Mining

Ideal Structures of Data for Data Mining

Understanding the Measurement Scale of Variables

Entire Database or Representative Sample

Sampling for Data Mining

User-Friendly SAS Applications Used in Data Preparation

Exploratory Data Analysis


Exploring Continuous Variables

Data Exploration: Categorical Variable

SAS Macro Applications Used in Data Exploration

Unsupervised Learning Methods


Applications of Unsupervised Learning Methods

Principal Component Analysis (PCA)

Exploratory Factor Analysis (EFA)

Disjoint Cluster Analysis (DCA)

Biplot Display of PCA, EFA, and DCA Results

PCA and EFA Using SAS Macro FACTOR2

Disjoint Cluster Analysis Using SAS Macro DISJCLS2

Supervised Learning Methods: Prediction


Applications of Supervised Predictive Methods

Multiple Linear Regression Modeling

Binary Logistic Regression Modeling

Ordinal Logistic Regression

Survey Logistic Regression

Multiple Linear Regression Using SAS Macro REGDIAG2

Lift Chart Using SAS Macro LIFT2

Scoring New Regression Data Using the SAS Macro RSCORE2

Logistic Regression Using SAS Macro LOGIST2

Scoring New Logistic Regression Data Using the SAS Macro LSCORE2

Case Study 1: Modeling Multiple Linear Regressions

Case Study 2: If-Then Analysis and Lift Charts

Case Study 3: Modeling Multiple Linear Regression with Categorical Variables

Case Study 4: Modeling Binary Logistic Regression

Case Study 5: Modeling Binary Multiple Logistic Regression

Case Study 6: Modeling Ordinal Multiple Logistic Regression

Supervised Learning Methods: Classification


Discriminant Analysis

Stepwise Discriminant Analysis

Canonical Discriminant Analysis

Discriminant Function Analysis

Applications of Discriminant Analysis

Classification Tree Based on CHAID

Applications of CHAID

Discriminant Analysis Using SAS Macro DISCRIM2

Decision Tree Using SAS Macro CHAID2

Case Study 1: Canonical Discriminant Analysis and Parametric Discriminant Function Analysis

Case Study 2: Nonparametric Discriminant Function Analysis

Case Study 3: Classification Tree Using CHAID

Advanced Analytics and Other SAS Data Mining Resources


Artificial Neural Network Methods

Market Basket Analysis

SAS Software: The Leader in Data Mining

Appendix I: Instruction for Using the SAS Macros

Appendix II: Data Mining SAS Macro Help Files

Appendix III: Instruction for Using the SAS Macros with Enterprise Guide Code Window


A Summary and References appear at the end of each chapter.

About the Author

George Fernandez is a professor of applied statistical methods and the director of the Center for Research Design and Analysis at the University of Nevada in Reno.

About the Series

Chapman & Hall/CRC Data Mining and Knowledge Discovery Series

Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Database Management / Data Mining
MATHEMATICS / Probability & Statistics / General