Spectral Feature Selection for Data Mining

1st Edition

By Zheng Alan Zhao, Huan Liu

Chapman and Hall/CRC

224 pages | 53 B/W Illus.

Book Content Available Open Access*

*Open Access content has been made available under a Creative Commons Attribution-NonCommercial-NoDerivatives (CC-BY-NC-ND) license.

Purchasing Options:$ = USD
Paperback: 9781138112629
pub: 2018-04-18
Hardback: 9781439862094
pub: 2011-12-14


Spectral Feature Selection for Data Mining introduces a novel feature selection technique that establishes a general platform for studying existing feature selection algorithms and developing new algorithms for emerging problems in real-world applications. This technique represents a unified framework for supervised, unsupervised, and semisupervised feature selection.

The book explores the latest research achievements, sheds light on new research directions, and stimulates readers to make the next creative breakthroughs. It presents the intrinsic ideas behind spectral feature selection, its theoretical foundations, its connections to other algorithms, and its use in handling both large-scale data sets and small sample problems. The authors also cover feature selection and feature extraction, including basic concepts, popular existing algorithms, and applications.

A timely introduction to spectral feature selection, this book illustrates the potential of this powerful dimensionality reduction technique in high-dimensional data processing. Readers learn how to use spectral feature selection to solve challenging problems in real-life applications and discover how general feature selection and extraction are connected to spectral feature selection.

Table of Contents

Data of High Dimensionality and Challenges

Dimensionality Reduction Techniques

Feature Selection for Data Mining

Spectral Feature Selection

Organization of the Book

Univariate Formulations for Spectral Feature Selection

Modeling Target Concept via Similarity Matrix

The Laplacian Matrix of a Graph

Evaluating Features on the Graph

An Extension for Feature Ranking Functions

Spectral Feature Selection via Ranking

Robustness Analysis for SPEC


Multivariate Formulations

The Similarity Preserving Nature of SPEC

A Sparse Multi-Output Regression Formulation

Solving the L2,1-Regularized Regression Problem

Efficient Multivariate Spectral Feature Selection

A Formulation Based on Matrix Comparison

Feature Selection with Proposed Formulations

Connections to Existing Algorithms

Connections to Existing Feature Selection Algorithms

Connections to Other Learning Models

An Experimental Study of the Algorithms


Large-Scale Spectral Feature Selection

Data Partitioning for Parallel Processing

MPI for Distributed Parallel Computing

Parallel Spectral Feature Selection

Computing the Similarity Matrix in Parallel

Parallelization of the Univariate Formulations

Parallel MRSF

Parallel MCSF


Multi-Source Spectral Feature Selection

Categorization of Different Types of Knowledge

A Framework Based on Combining Similarity Matrices

A Framework Based on Rank Aggregation

Experimental Results




About the Authors

Zheng Zhao is a research statistician at the SAS Institute, Inc. His recent research focuses on designing and developing novel analytic approaches for handling large-scale data of extremely high dimensionality. Dr. Zhao is the author of PROC HPREDUCE, which is a SAS High Performance Analytics procedure for large-scale parallel variable selection. He was co-chair of the 2010 PAKDD Workshop on Feature Selection in Data Mining. He earned a Ph.D. in computer science and engineering from Arizona State University.

Huan Liu is a professor of computer science and engineering at Arizona State University. Dr. Liu serves on journal editorial boards and conference program committees and is a founding organizer of the International Conference Series on Social Computing, Behavioral-Cultural Modeling, and Prediction. He earned a Ph.D. in computer science from the University of Southern California. With a focus on data mining, machine learning, social computing, and artificial intelligence, his research investigates problems in real-world applications with high-dimensional data of disparate forms, such as social media, group interaction and modeling, data preprocessing, and text/web mining.

About the Series

Chapman & Hall/CRC Data Mining and Knowledge Discovery Series

Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Programming / Games
COMPUTERS / Database Management / Data Mining
COMPUTERS / Machine Theory