1st Edition

Data Mining with R
Learning with Case Studies

ISBN 9781439810187
Published November 9, 2010 by Chapman and Hall/CRC
305 Pages - 42 B/W Illustrations

USD $93.95



Book Description

The versatile capabilities and large set of add-on packages make R an excellent alternative to many existing and often expensive data mining tools. Exploring this area from the perspective of a practitioner, Data Mining with R: Learning with Case Studies uses practical examples to illustrate the power of R and data mining.

Assuming no prior knowledge of R or data mining/statistical techniques, the book covers a diverse set of problems that pose different challenges in terms of size, type of data, goals of analysis, and analytical tools. To present the main data mining processes and techniques, the author takes a hands-on approach that utilizes a series of detailed, real-world case studies:

  1. Predicting algae blooms
  2. Predicting stock market returns
  3. Detecting fraudulent transactions
  4. Classifying microarray samples

With these case studies, the author supplies all necessary steps, code, and data.

Web Resource
A supporting website mirrors the do-it-yourself approach of the text. It offers a collection of freely available R source files that encompass all the code used in the case studies. The site also provides the data sets from the case studies as well as an R package of several functions.

Table of Contents

How to Read This Book
A Short Introduction to R
A Short Introduction to MySQL

Predicting Algae Blooms
Problem Description and Objectives
Data Description
Loading the Data into R
Data Visualization and Summarization
Unknown Values
Obtaining Prediction Models
Model Evaluation and Selection
Predictions for the 7 Algae

Predicting Stock Market Returns
Problem Description and Objectives
The Available Data
Defining the Prediction Tasks
The Prediction Models
From Predictions into Actions
Model Evaluation and Selection
The Trading System

Detecting Fraudulent Transactions
Problem Description and Objectives
The Available Data
Defining the Data Mining Tasks
Obtaining Outlier Rankings

Classifying Microarray Samples
Problem Description and Objectives
The Available Data
Gene (Feature) Selection
Predicting Cytogenetic Abnormalities



Index of Data Mining Topics

Index of R Functions




Luis Torgo is an associate professor in the Department of Computer Science of the Faculty of Sciences at the University of Porto in Portugal. An active researcher in machine learning and data mining for more than 20 years, Dr. Torgo is also a researcher in the Laboratory of Artificial Intelligence and Data Analysis (LIAAD) of INESC Porto LA.


This is certainly one of the best books for a direct implementation of data mining algorithms. Another good point of the book is that for most of the problems there are different ways to solve them. … an invaluable resource for data miners, R programmers, as well as people involved in fields such as fraud detection and stock market prediction. If you’re serious about data mining and want to learn from experiences in the field, don’t hesitate!
—Sandro Saitta, Data Mining Research blog, May 2011

If you want to learn how to analyze your data with a free software package that has been built by expert statisticians and data miners, this is your book. A broad range of real-world case studies highlights the breadth and depth of the R software.
—Bernhard Pfahringer, University of Waikato, New Zealand

Both R novices and experts will find this a great reference for data mining.
—Intelligent Trading blog and R-bloggers, November 2010