1st Edition
Multi-Label Dimensionality Reduction
Similar to other data mining and machine learning tasks, multi-label learning suffers from the curse of dimensionality. An effective way to mitigate this problem is dimensionality reduction, which extracts a small number of features by removing irrelevant, redundant, and noisy information. The data mining and machine learning literature currently lacks a unified treatment of multi-label dimensionality reduction that incorporates both algorithmic developments and applications.
Addressing this shortfall, Multi-Label Dimensionality Reduction covers the methodological developments, theoretical properties, computational aspects, and applications of many multi-label dimensionality reduction algorithms. It explores numerous research questions, including:
- How to fully exploit label correlations for effective dimensionality reduction
- How to scale dimensionality reduction algorithms to large-scale problems
- How to effectively combine dimensionality reduction with classification
- How to derive sparse dimensionality reduction algorithms to enhance model interpretability
- How to perform multi-label dimensionality reduction effectively in practical applications
The authors emphasize their extensive work on dimensionality reduction for multi-label learning. Using a case study of Drosophila gene expression pattern image annotation, they demonstrate how to apply multi-label dimensionality reduction algorithms to solve real-world problems. A supplementary website provides a MATLAB® package for implementing popular dimensionality reduction algorithms.
Introduction
Introduction to Multi-Label Learning
Applications of Multi-Label Learning
Challenges of Multi-Label Learning
State of the Art
Dimensionality Reduction for Multi-Label Learning
Overview of the Book
Notations
Organization
Partial Least Squares
Basic Models of Partial Least Squares
Partial Least Squares Variants
Partial Least Squares Regression
Partial Least Squares Classification
Canonical Correlation Analysis
Classical Canonical Correlation
Sparse CCA
Relationship between CCA and Partial Least Squares
The Generalized Eigenvalue Problem
Hypergraph Spectral Learning
Hypergraph Basics
Multi-Label Learning with a Hypergraph
A Class of Generalized Eigenvalue Problems
The Generalized Eigenvalue Problem versus the Least Squares Problem
Empirical Evaluation
A Scalable Two-Stage Approach for Dimensionality Reduction
The Two-Stage Approach with Regularization
Empirical Evaluation
A Shared-Subspace Learning Framework
The Framework
An Efficient Implementation
Related Work
Connections with Existing Formulations
A Feature Space Formulation
Empirical Evaluation
Joint Dimensionality Reduction and Classification
Background
Joint Dimensionality Reduction and Multi-Label Classification
Dimensionality Reduction with Different Input Data
Empirical Evaluation
Nonlinear Dimensionality Reduction: Algorithms and Applications
Background on Kernel Methods
Kernel Centering and Projection
Kernel Canonical Correlation Analysis
Kernel Hypergraph Spectral Learning
The Generalized Eigenvalue Problem in the Kernel-Induced Feature Space
Kernel Least Squares Regression
Dimensionality Reduction and Least Squares Regression in the Feature Space
Gene Expression Pattern Image Annotation
Appendix: Proofs
References
Index
Biography
Liang Sun is a scientist in R&D at Opera Solutions, a leading company in big data science and predictive analytics. He received a PhD in computer science from Arizona State University. His research interests lie broadly in the areas of data mining and machine learning. His team won second place in the KDD Cup 2012 Track 2 and fifth place in the Heritage Health Prize. In 2010, he won the ACM SIGKDD best research paper honorable mention for his work on an efficient implementation for a class of dimensionality reduction algorithms.
Shuiwang Ji is an assistant professor of computer science at Old Dominion University. He received a PhD in computer science from Arizona State University. His research interests include machine learning, data mining, computational neuroscience, and bioinformatics. He received the Outstanding PhD Student Award from Arizona State University in 2010 and the Early Career Distinguished Research Award from Old Dominion University’s College of Sciences in 2012.
Jieping Ye is an associate professor of computer science and engineering at Arizona State University, where he is also the associate director for big data informatics in the Center for Evolutionary Medicine and Informatics and a core faculty member of the Biodesign Institute. He received a PhD in computer science from the University of Minnesota, Twin Cities. His research interests include machine learning, data mining, and biomedical informatics. He is an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence. He has won numerous awards from Arizona State University and was a recipient of an NSF CAREER Award. His papers have also been recognized at the International Conference on Machine Learning, KDD, and the SIAM International Conference on Data Mining (SDM).