Data Mining

A Tutorial-Based Primer, Second Edition

By Richard J. Roiger

Chapman and Hall/CRC

487 pages | 295 B/W Illus.

Purchasing Options ($ = USD):
Pack - Book and Ebook: ISBN 9781498763974 | pub: 2016-12-01 | $65.56 (list $81.95, save ~$16.39)
eBook (VitalSource): ISBN 9781315382586 | pub: 2017-01-06 | from $40.98


Description

Data Mining: A Tutorial-Based Primer, Second Edition provides a comprehensive introduction to data mining with a focus on model building and testing, as well as on interpreting and validating results. The text guides students to understand how data mining can be employed to solve real problems and recognize whether a data mining solution is a feasible alternative for a specific problem. Fundamental data mining strategies, techniques, and evaluation methods are presented and implemented with the help of two well-known software tools.

Several new topics have been added to the second edition, including an introduction to Big Data and data analytics, ROC curves, Pareto lift charts, methods for handling large-sized, streaming, and imbalanced data, support vector machines, and extended coverage of textual data mining. The second edition contains tutorials for attribute selection, dealing with imbalanced data, outlier analysis, time series analysis, mining textual data, and more.

The text provides in-depth coverage of RapidMiner Studio and Weka’s Explorer interface. Both tools are used to step students through tutorials that depict the knowledge discovery process, giving readers maximum flexibility in their hands-on data mining experience.

Reviews

"Dr. Roiger does an excellent job of describing in step by step detail formulae involved in various data mining algorithms, along with illustrations. In addition, his tutorials in Weka software provide excellent grounding for students in comprehending the underpinnings of Machine Learning as applied to Data Mining. The inclusion of RapidMiner software tutorials and examples in the book is also a definite plus since it is one of the most popular Data Mining software platforms in use today."

--Robert Hughes, Golden Gate University, San Francisco, CA, USA

Table of Contents

Data Mining Fundamentals

Data Mining: A First View

DATA SCIENCE, ANALYTICS, MINING, AND KNOWLEDGE DISCOVERY IN DATABASES 

WHAT CAN COMPUTERS LEARN? 

IS DATA MINING APPROPRIATE FOR MY PROBLEM? 

DATA MINING OR KNOWLEDGE ENGINEERING? 

A NEAREST NEIGHBOR APPROACH

DATA MINING, BIG DATA, AND CLOUD COMPUTING

DATA MINING ETHICS

INTRINSIC VALUE AND CUSTOMER CHURN

CHAPTER SUMMARY 

KEY TERMS

Data Mining: A Closer Look

DATA MINING STRATEGIES

SUPERVISED DATA MINING TECHNIQUES

ASSOCIATION RULES

CLUSTERING TECHNIQUES

EVALUATING PERFORMANCE

CHAPTER SUMMARY

KEY TERMS

Basic Data Mining Techniques

CHAPTER OBJECTIVES

DECISION TREES

A BASIC COVERING RULE ALGORITHM

GENERATING ASSOCIATION RULES

THE K-MEANS ALGORITHM

GENETIC LEARNING

CHOOSING A DATA MINING TECHNIQUE

CHAPTER SUMMARY

KEY TERMS

Tools for Knowledge Discovery

Weka—An Environment for Knowledge Discovery

GETTING STARTED WITH WEKA

BUILDING DECISION TREES

GENERATING PRODUCTION RULES WITH PART

ATTRIBUTE SELECTION AND NEAREST NEIGHBOR CLASSIFICATION

ASSOCIATION RULES

COST/BENEFIT ANALYSIS

UNSUPERVISED CLUSTERING WITH THE K-MEANS ALGORITHM

CHAPTER SUMMARY

Knowledge Discovery with RapidMiner

GETTING STARTED WITH RAPIDMINER

BUILDING DECISION TREES

GENERATING RULES

ASSOCIATION RULE LEARNING

UNSUPERVISED CLUSTERING WITH K-MEANS

ATTRIBUTE SELECTION AND NEAREST NEIGHBOR CLASSIFICATION

CHAPTER SUMMARY

The Knowledge Discovery Process

A PROCESS MODEL FOR KNOWLEDGE DISCOVERY

GOAL IDENTIFICATION

CREATING A TARGET DATA SET

DATA PREPROCESSING

DATA TRANSFORMATION

DATA MINING

INTERPRETATION AND EVALUATION

TAKING ACTION

THE CRISP-DM PROCESS MODEL

CHAPTER SUMMARY

KEY TERMS

Formal Evaluation Techniques

WHAT SHOULD BE EVALUATED?

TOOLS FOR EVALUATION

COMPUTING TEST SET CONFIDENCE INTERVALS

COMPARING SUPERVISED LEARNER MODELS

UNSUPERVISED EVALUATION TECHNIQUES

EVALUATING SUPERVISED MODELS WITH NUMERIC OUTPUT

COMPARING MODELS WITH RAPIDMINER

ATTRIBUTE EVALUATION FOR MIXED DATA TYPES

PARETO LIFT CHARTS

CHAPTER SUMMARY

KEY TERMS

Building Neural Networks

Neural Networks

FEED-FORWARD NEURAL NETWORKS

NEURAL NETWORK TRAINING: A CONCEPTUAL VIEW

NEURAL NETWORK EXPLANATION

GENERAL CONSIDERATIONS

NEURAL NETWORK TRAINING: A DETAILED VIEW

CHAPTER SUMMARY

KEY TERMS

Building Neural Networks with Weka

DATA SETS FOR BACKPROPAGATION LEARNING

MODELING THE EXCLUSIVE-OR FUNCTION: NUMERIC OUTPUT

MODELING THE EXCLUSIVE-OR FUNCTION: CATEGORICAL OUTPUT

MINING SATELLITE IMAGE DATA

UNSUPERVISED NEURAL NET CLUSTERING

CHAPTER SUMMARY

KEY TERMS

Building Neural Networks with RapidMiner

MODELING THE EXCLUSIVE-OR FUNCTION

MINING SATELLITE IMAGE DATA

PREDICTING CUSTOMER CHURN

RAPIDMINER’S SELF-ORGANIZING MAP OPERATOR

CHAPTER SUMMARY

Advanced Data Mining Techniques

Supervised Statistical Techniques

BAYES CLASSIFIER

SUPPORT VECTOR MACHINES

LINEAR REGRESSION ANALYSIS

REGRESSION TREES

LOGISTIC REGRESSION

CHAPTER SUMMARY

KEY TERMS

Unsupervised Clustering Techniques

AGGLOMERATIVE CLUSTERING

CONCEPTUAL CLUSTERING

EXPECTATION MAXIMIZATION

GENETIC ALGORITHMS AND UNSUPERVISED CLUSTERING

CHAPTER SUMMARY

KEY TERMS

Specialized Techniques

TIME-SERIES ANALYSIS

MINING THE WEB

MINING TEXTUAL DATA

TECHNIQUES FOR LARGE-SIZED, IMBALANCED, AND STREAMING DATA

ENSEMBLE TECHNIQUES FOR IMPROVING PERFORMANCE

CHAPTER SUMMARY

KEY TERMS

The Data Warehouse

OPERATIONAL DATABASES

DATA WAREHOUSE DESIGN

ONLINE ANALYTICAL PROCESSING

EXCEL PIVOT TABLES FOR DATA ANALYTICS

CHAPTER SUMMARY

KEY TERMS

About the Author

Richard J. Roiger is a professor emeritus at Minnesota State University, Mankato, where he taught and performed research in the Computer & Information Science Department for 27 years. Dr. Roiger’s Ph.D. degree is in Computer & Information Sciences from the University of Minnesota. Dr. Roiger continues to serve as a part-time faculty member teaching courses in data mining, artificial intelligence, and research methods. Richard enjoys interacting with his grandchildren, traveling, writing, and pursuing his musical talents.

About the Series

Chapman & Hall/CRC Data Mining and Knowledge Discovery Series

Subject Categories

BISAC Subject Codes/Headings:
BUS061000: BUSINESS & ECONOMICS / Statistics
COM012040: COMPUTERS / Programming / Games
COM021030: COMPUTERS / Database Management / Data Mining
COM037000: COMPUTERS / Machine Theory