Identifying some of the most influential algorithms that are widely used in the data mining community, The Top Ten Algorithms in Data Mining provides a description of each algorithm, discusses its impact, and reviews current and future research. Thoroughly evaluated by independent reviewers, each chapter focuses on a particular algorithm and is written by either the original authors of the algorithm or world-class researchers who have extensively studied the respective algorithm.
The book concentrates on the following important algorithms: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. Examples illustrate how each algorithm works and highlight its overall performance in a real-world application. The text covers topics that are central both to data mining research and development and to courses in data mining, machine learning, and artificial intelligence, including classification, clustering, statistical learning, association analysis, and link mining.
By naming the leading algorithms in this field, this book encourages the use of data mining techniques in a broader realm of real-world applications. It should inspire more data mining researchers to further explore the impact and novel research issues of these algorithms.
Table of Contents
C4.5, Naren Ramakrishnan
K-Means, Joydeep Ghosh and Alexander Liu
SVM: Support Vector Machines, Hui Xue, Qiang Yang, and Songcan Chen
Apriori, Hiroshi Motoda and Kouzou Ohara
EM, Geoffrey J. McLachlan and Shu-Kay Ng
PageRank, Bing Liu and Philip S. Yu
AdaBoost, Zhi-Hua Zhou and Yang Yu
kNN: k-Nearest Neighbors, Michael Steinbach and Pang-Ning Tan
Naïve Bayes, David J. Hand
CART: Classification and Regression Trees, Dan Steinberg
… The text is easy to read as each chapter focuses on a particular algorithm and a consistent presentation style has been adopted throughout the book … Each chapter was reviewed by two independent reviewers and one of the book editors—resulting in a text that will be a useful reference source for years to come.
—International Statistical Review, 2010
If you are a quality professional looking for data analysis techniques beyond multiple regression, and you are comfortable reading high level mathematics, then this book may be for you.
—Journal of Quality Technology, Vol. 41, No. 4, October 2009