Spectral Feature Selection for Data Mining
1st Edition

ISBN 9781439862094
Published December 14, 2011 by Chapman and Hall/CRC
220 Pages, 53 B/W Illustrations

Book Description

Spectral Feature Selection for Data Mining introduces a novel feature selection technique that establishes a general platform for studying existing feature selection algorithms and developing new algorithms for emerging problems in real-world applications. This technique represents a unified framework for supervised, unsupervised, and semisupervised feature selection.

The book explores the latest research achievements, sheds light on new research directions, and stimulates readers to make the next creative breakthroughs. It presents the intrinsic ideas behind spectral feature selection, its theoretical foundations, its connections to other algorithms, and its use in handling both large-scale data sets and small sample problems. The authors also cover feature selection and feature extraction, including basic concepts, popular existing algorithms, and applications.

A timely introduction to spectral feature selection, this book illustrates the potential of this powerful dimensionality reduction technique in high-dimensional data processing. Readers learn how to use spectral feature selection to solve challenging problems in real-life applications and discover how general feature selection and extraction are connected to spectral feature selection.
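To give a feel for the kind of computation the description refers to, here is a minimal Python sketch of one well-known spectral criterion, the Laplacian Score. It is an illustration only and is not taken from the book: the function names `class_similarity` and `laplacian_scores`, the label-based similarity matrix, and the toy data are all assumptions made for this example, and the book's own formulations may differ. The idea is to build a similarity graph over the samples and prefer features that vary little between similar samples.

```python
import numpy as np

def class_similarity(y):
    """Hypothetical supervised similarity: S[i, j] = 1/n_c when samples i and j
    share class c (n_c = size of that class), and 0 otherwise."""
    y = np.asarray(y)
    same = (y[:, None] == y[None, :]).astype(float)
    counts = same.sum(axis=1)          # class size for each sample
    return same / counts[:, None]

def laplacian_scores(X, S):
    """Laplacian Score of each column of X with respect to similarity matrix S.
    Lower scores mean the feature is smoother over the graph."""
    X = np.asarray(X, dtype=float)
    d = S.sum(axis=1)                  # node degrees
    L = np.diag(d) - S                 # unnormalized graph Laplacian
    Xc = X - (d @ X) / d.sum()         # remove the degree-weighted mean
    num = np.sum(Xc * (L @ Xc), axis=0)            # f^T L f per feature
    den = np.sum(Xc * (d[:, None] * Xc), axis=0)   # f^T D f per feature
    return num / np.maximum(den, 1e-12)            # guard constant features

# Toy data (an assumption for illustration): feature 0 tracks the class label,
# feature 1 is pure noise.
rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = np.column_stack([5.0 * y + 0.1 * rng.standard_normal(40),
                     rng.standard_normal(40)])

scores = laplacian_scores(X, class_similarity(y))
ranking = np.argsort(scores)           # most relevant feature first
```

In this sketch the feature that tracks the class label receives the lower score and is ranked first. Replacing `class_similarity` with a similarity computed from the data alone (for example, an RBF kernel) would give an unsupervised variant of the same ranking.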

Table of Contents

Data of High Dimensionality and Challenges
Dimensionality Reduction Techniques
Feature Selection for Data Mining
Spectral Feature Selection
Organization of the Book

Univariate Formulations for Spectral Feature Selection
Modeling Target Concept via Similarity Matrix
The Laplacian Matrix of a Graph
Evaluating Features on the Graph
An Extension for Feature Ranking Functions
Spectral Feature Selection via Ranking
Robustness Analysis for SPEC

Multivariate Formulations
The Similarity Preserving Nature of SPEC
A Sparse Multi-Output Regression Formulation
Solving the L2,1-Regularized Regression Problem
Efficient Multivariate Spectral Feature Selection
A Formulation Based on Matrix Comparison
Feature Selection with Proposed Formulations

Connections to Existing Algorithms
Connections to Existing Feature Selection Algorithms
Connections to Other Learning Models
An Experimental Study of the Algorithms

Large-Scale Spectral Feature Selection
Data Partitioning for Parallel Processing
MPI for Distributed Parallel Computing
Parallel Spectral Feature Selection
Computing the Similarity Matrix in Parallel
Parallelization of the Univariate Formulations
Parallel MRSF
Parallel MCSF

Multi-Source Spectral Feature Selection
Categorization of Different Types of Knowledge
A Framework Based on Combining Similarity Matrices
A Framework Based on Rank Aggregation
Experimental Results



Zheng Zhao is a research statistician at the SAS Institute, Inc. His recent research focuses on designing and developing novel analytic approaches for handling large-scale data of extremely high dimensionality. Dr. Zhao is the author of PROC HPREDUCE, which is a SAS High Performance Analytics procedure for large-scale parallel variable selection. He was co-chair of the 2010 PAKDD Workshop on Feature Selection in Data Mining. He earned a Ph.D. in computer science and engineering from Arizona State University.

Huan Liu is a professor of computer science and engineering at Arizona State University. Dr. Liu serves on journal editorial boards and conference program committees and is a founding organizer of the International Conference Series on Social Computing, Behavioral-Cultural Modeling, and Prediction. He earned a Ph.D. in computer science from the University of Southern California. With a focus on data mining, machine learning, social computing, and artificial intelligence, his research investigates problems in real-world applications with high-dimensional data of disparate forms, such as social media, group interaction and modeling, data preprocessing, and text/web mining.