In machine learning applications, practitioners must take into account the costs associated with the learning algorithm. These costs include:
- Cost of acquiring training data
- Cost of data annotation/labeling and cleaning
- Computational cost for model fitting, validation, and testing
- Cost of collecting features/attributes for test data
- Cost of user feedback collection
- Cost of incorrect prediction/classification
Cost-Sensitive Machine Learning is one of the first books to provide an overview of the current research efforts and problems in this area. It discusses real-world applications that incorporate the cost of learning into the modeling process.
The first part of the book presents the theoretical underpinnings of cost-sensitive machine learning. It describes well-established machine learning approaches for reducing data acquisition costs during training as well as approaches for reducing costs when systems must make predictions for new samples. The second part covers real-world applications that effectively trade off different types of costs. These applications not only use traditional machine learning approaches, but they also incorporate cutting-edge research that advances beyond the constraining assumptions by analyzing the application needs from first principles.
Spurring further research on several open problems, this volume highlights the often implicit assumptions in machine learning techniques that were not fully understood in the past. The book also illustrates the commercial importance of cost-sensitive machine learning through its coverage of the rapid application developments made by leading companies and academic research labs.
Table of Contents
THEORETICAL UNDERPINNINGS OF COST-SENSITIVE MACHINE LEARNING: Algorithms for Active Learning. Semi-Supervised Learning: Some Recent Advances. Transfer Learning, Multi-Task Learning, and Cost-Sensitive Learning. Cost-Sensitive Cascades. Selective Data Acquisition for Machine Learning.
COST-SENSITIVE MACHINE LEARNING APPLICATIONS: Minimizing Annotation Costs in Visual Category Learning. Reliability and Redundancy: Reducing Error Cost in Medical Imaging. Cost-Sensitive Learning in Computational Advertising. Cost-Sensitive Machine Learning for Information Retrieval.
Index.
Balaji Krishnapuram is a senior R&D manager at Siemens Medical Solutions. He earned a Ph.D. in electrical and computer engineering from Duke University. His research interests include statistical data mining and information retrieval.
Shipeng Yu is a senior staff scientist at Siemens Medical Solutions. He earned a Ph.D. in computer science from the University of Munich. His research interests include statistical machine learning, data mining, Bayesian analysis, information retrieval and extraction, healthcare analytics, and personalized medicine.
R. Bharat Rao is senior director and head of Knowledge Solutions at Siemens Medical Solutions, where he was recognized as one of its Inventors of the Year in 2005. He also received the 2011 ACM SIGKDD Lifetime Service Award for pioneering applications of data mining for healthcare. He earned a Ph.D. in electrical and computer engineering from the University of Illinois at Urbana-Champaign. His research interests include machine learning, healthcare analytics, mining large data, and personalized medicine.