1st Edition

The Top Ten Algorithms in Data Mining

Edited by Xindong Wu, Vipin Kumar
    Copyright 2009
    230 Pages, 53 B/W Illustrations
    by Chapman & Hall

    Identifying some of the most influential algorithms that are widely used in the data mining community, The Top Ten Algorithms in Data Mining provides a description of each algorithm, discusses its impact, and reviews current and future research. Thoroughly evaluated by independent reviewers, each chapter focuses on a particular algorithm and is written by either the original authors of the algorithm or world-class researchers who have extensively studied the respective algorithm.

    The book concentrates on the following important algorithms: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. Examples illustrate how each algorithm works and highlight its overall performance in a real-world application. The text covers key topics, including classification, clustering, statistical learning, association analysis, and link mining, that arise in data mining research and development as well as in data mining, machine learning, and artificial intelligence courses.

    By naming the leading algorithms in this field, this book encourages the use of data mining techniques in a broader realm of real-world applications. It should inspire more data mining researchers to further explore the impact and novel research issues of these algorithms.

    C4.5, Naren Ramakrishnan

    K-Means, Joydeep Ghosh and Alexander Liu

    SVM: Support Vector Machines, Hui Xue, Qiang Yang, and Songcan Chen

    Apriori, Hiroshi Motoda and Kouzou Ohara

    EM, Geoffrey J. McLachlan and Shu-Kay Ng

    PageRank, Bing Liu and Philip S. Yu

    AdaBoost, Zhi-Hua Zhou and Yang Yu

    kNN: k-Nearest Neighbors, Michael Steinbach and Pang-Ning Tan

    Naïve Bayes, David J. Hand

    CART: Classification and Regression Trees, Dan Steinberg




    … The text is easy to read as each chapter focuses on a particular algorithm and a consistent presentation style has been adopted throughout the book … Each chapter was reviewed by two independent reviewers and one of the book editors—resulting in a text that will be a useful reference source for years to come.
    International Statistical Review, 2010

    If you are a quality professional looking for data analysis techniques beyond multiple regression, and you are comfortable reading high level mathematics, then this book may be for you.
    Journal of Quality Technology, Vol. 41, No. 4, October 2009