Comprehensive Coverage of the Entire Area of Classification
Research on the problem of classification tends to be fragmented across such areas as pattern recognition, databases, data mining, and machine learning. Addressing the work of these different communities in a unified way, Data Classification: Algorithms and Applications explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data.
This comprehensive book focuses on three primary aspects of data classification:
- Methods: The book first describes common techniques used for classification, including probabilistic methods, decision trees, rule-based methods, instance-based methods, support vector machine methods, and neural networks.
- Domains: The book then examines specific methods used for data domains such as multimedia, text, time-series, network, discrete sequence, and uncertain data. It also covers large data sets and data streams due to the recent importance of the big data paradigm.
- Variations: The book concludes with insights into variations of the classification process. It discusses ensembles, rare-class learning, distance function learning, active learning, visual learning, transfer learning, and semi-supervised learning, as well as the evaluation of classifiers.
Table of Contents
An Introduction to Data Classification (Charu C. Aggarwal)
Feature Selection for Classification: A Review (Jiliang Tang, Salem Alelyani, and Huan Liu)
Probabilistic Models for Classification (Hongbo Deng, Yizhou Sun, Yi Chang, and Jiawei Han)
Decision Trees: Theory and Algorithms (Victor E. Lee, Lin Liu, and Ruoming Jin)
Rule-Based Classification (Xiao-Li Li and Bing Liu)
Instance-Based Learning: A Survey (Charu C. Aggarwal)
Support Vector Machines (Po-Wei Wang and Chih-Jen Lin)
Neural Networks: A Review (Alain Biem)
A Survey of Stream Classification Algorithms (Charu C. Aggarwal)
Big Data Classification (Hanghang Tong)
Text Classification (Charu C. Aggarwal and ChengXiang Zhai)
Multimedia Classification (Shiyu Chang, Wei Han, Xianming Liu, Ning Xu, Pooya Khorrami, and Thomas S. Huang)
Time Series Data Classification (Dimitrios Kotsakos and Dimitrios Gunopulos)
Discrete Sequence Classification (Mohammad Al Hasan)
Collective Classification of Network Data (Ben London and Lise Getoor)
Uncertain Data Classification (Reynold Cheng, Yixiang Fang, and Matthias Renz)
Rare Class Learning (Charu C. Aggarwal)
Distance Metric Learning for Data Classification (Fei Wang)
Ensemble Learning (Yaliang Li, Jing Gao, Qi Li, and Wei Fan)
Semi-Supervised Learning (Kaushik Sinha)
Transfer Learning (Sinno Jialin Pan)
Active Learning: A Survey (Charu C. Aggarwal, Xiangnan Kong, Quanquan Gu, Jiawei Han, and Philip S. Yu)
Visual Classification (Giorgio Maria Di Nunzio)
Evaluation of Classification Methods (Nele Verbiest, Karel Vermeulen, and Ankur Teredesai)
Educational and Software Resources for Data Classification (Charu C. Aggarwal)
Charu C. Aggarwal is a research scientist at the IBM T.J. Watson Research Center. A fellow of the IEEE and the ACM, he is the author/editor of ten books, an associate editor of several journals, and the vice-president of the SIAM Activity Group on Data Mining. Dr. Aggarwal has published over 200 papers, has applied for or been granted over 80 patents, and has received numerous honors, including the IBM Outstanding Technical Achievement Award and the EDBT 2014 Test of Time Award. His research interests include performance analysis, databases, and data mining. He earned a Ph.D. from the Massachusetts Institute of Technology.