Computational Methods of Feature Selection
1st Edition

ISBN 9781584888789
Published October 29, 2007 by Chapman and Hall/CRC
440 Pages, 91 B/W Illustrations



Book Description

Driven by the increasing demand for dimensionality reduction, research on feature selection has expanded into many fields, including computational statistics, pattern recognition, machine learning, data mining, and knowledge discovery. Highlighting current research issues, Computational Methods of Feature Selection introduces the basic concepts and principles, state-of-the-art algorithms, and novel applications of this tool.

The book begins by exploring unsupervised, randomized, and causal feature selection. It then reports on some recent results of empowering feature selection, including active feature selection, decision-border estimate, the use of ensembles with independent probes, and incremental feature selection. This is followed by discussions of weighting and local methods, such as the ReliefF family, k-means clustering, local feature relevance, and a new interpretation of Relief. The book subsequently covers text classification, a new feature selection score, and both constraint-guided and aggressive feature selection. The final section examines applications of feature selection in bioinformatics, including feature construction as well as redundancy-, ensemble-, and penalty-based feature selection.
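To give a flavor of the local weighting methods mentioned above, the sketch below implements the core idea behind the basic Relief algorithm, the ancestor of the ReliefF family the book covers: a feature earns weight when it differs on an instance's nearest neighbor from the opposite class (the "miss") and agrees with its nearest neighbor from the same class (the "hit"). This is an illustrative sketch under simplifying assumptions (numeric features, two classes, at least two instances per class), not code from the book.

```python
import numpy as np

def relief_weights(X, y, n_iter=100, rng=None):
    """Basic Relief: reward features that separate classes locally.

    For each sampled instance, find its nearest same-class neighbor (hit)
    and nearest other-class neighbor (miss); a feature's weight grows when
    it differs on the miss and shrinks when it differs on the hit.
    Assumes numeric features and at least two instances per class.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Scale per-feature differences to [0, 1] so features are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        diff = np.abs(X - X[i]) / span          # per-feature distances
        dist = diff.sum(axis=1)                 # Manhattan distance
        dist[i] = np.inf                        # exclude the instance itself
        same = y == y[i]
        same[i] = False
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += diff[miss] - diff[hit]
    return w / n_iter

# Toy data: only the first feature carries the class signal.
X = np.array([[0.0, 0.5], [0.1, 0.9], [1.0, 0.4], [0.9, 0.8]])
y = np.array([0, 0, 1, 1])
w = relief_weights(X, y, n_iter=50, rng=0)
# The informative feature ends up with a clearly higher weight.
```

ReliefF extends this scheme by averaging over k nearest hits and misses, which makes the estimate robust to noise; RReliefF adapts the same idea to regression.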

Through a clear, concise, and coherent presentation of topics, this volume systematically covers the key concepts, underlying principles, and inventive applications of feature selection, illustrating how this powerful tool can efficiently harness massive, high-dimensional data and turn it into valuable, reliable information.

Table of Contents

Introduction and Background
    Less Is More (Huan Liu and Hiroshi Motoda)
        Background and Basics
        Supervised, Unsupervised, and Semi-Supervised Feature Selection
        Key Contributions and Organization of the Book
        Looking Ahead
    Unsupervised Feature Selection (Jennifer G. Dy)
        Feature Selection
        Feature Selection for Unlabeled Data
        Local Approaches
    Randomized Feature Selection (David J. Stracuzzi)
        Types of Randomizations
        Randomized Complexity Classes
        Applying Randomization to Feature Selection
        The Role of Heuristics
        Examples of Randomized Selection Algorithms
        Issues in Randomization
    Causal Feature Selection (Isabelle Guyon, Constantin Aliferis, and André Elisseeff)
        Classical “Non-Causal” Feature Selection
        The Concept of Causality
        Feature Relevance in Bayesian Networks
        Causal Discovery Algorithms
        Examples of Applications
        Summary, Conclusions, and Open Problems

Extending Feature Selection
    Active Learning of Feature Relevance (Emanuele Olivetti, Sriharsha Veeramachaneni, and Paolo Avesani)
        Active Sampling for Feature Relevance Estimation
        Derivation of the Sampling Benefit Function
        Implementation of the Active Sampling Algorithm
        Conclusions and Future Work
    A Study of Feature Extraction Techniques Based on Decision Border Estimate (Claudia Diamantini and Domenico Potena)
        Feature Extraction Based on Decision Boundary
        Generalities about Labeled Vector Quantizers
        Feature Extraction Based on Vector Quantizers
    Ensemble-Based Variable Selection Using Independent Probes (Eugene Tuv, Alexander Borisov, and Kari Torkkola)
        Tree Ensemble Methods in Feature Ranking
        The Algorithm: Ensemble-Based Ranking against Independent Probes
    Efficient Incremental-Ranked Feature Selection in Massive Data (Roberto Ruiz, Jesús S. Aguilar-Ruiz, and José C. Riquelme)
        Related Work
        Preliminary Concepts
        Incremental Performance over Ranking
        Experimental Results

Weighting and Local Methods
    Non-Myopic Feature Quality Evaluation with (R)ReliefF (Igor Kononenko and Marko Robnik-Šikonja)
        From Impurity to Relief
        ReliefF for Classification and RReliefF for Regression
        Implementation Issues
    Weighting Method for Feature Selection in k-Means (Joshua Zhexue Huang, Jun Xu, Michael Ng, and Yunming Ye)
        Feature Weighting in k-Means
        W-k-Means Clustering Algorithm
        Feature Selection
        Subspace Clustering with k-Means
        Text Clustering
        Related Work
    Local Feature Selection for Classification (Carlotta Domeniconi and Dimitrios Gunopulos)
        The Curse of Dimensionality
        Adaptive Metric Techniques
        Large Margin Nearest Neighbor Classifiers
        Experimental Comparisons
    Feature Weighting through Local Learning (Yijun Sun)
        Mathematical Interpretation of Relief
        Iterative Relief Algorithm
        Extension to Multiclass Problems
        Online Learning
        Computational Complexity

Text Classification and Clustering
    Feature Selection for Text Classification (George Forman)
        Text Feature Generators
        Feature Filtering for Classification
        Practical and Scalable Computation
        A Case Study
        Conclusion and Future Work
    A Bayesian Feature Selection Score Based on Naïve Bayes Models (Susana Eyheramendy and David Madigan)
        Feature Selection Scores
        Classification Algorithms
        Experimental Settings and Results
    Pairwise Constraints-Guided Dimensionality Reduction (Wei Tang and Shi Zhong)
        Pairwise Constraints-Guided Feature Projection
        Pairwise Constraints-Guided Co-Clustering
        Experimental Studies
        Conclusion and Future Work
    Aggressive Feature Selection by Feature Ranking (Masoud Makrehchi and Mohamed S. Kamel)
        Feature Selection by Feature Ranking
        Proposed Approach to Reducing Term Redundancy
        Experimental Results

Feature Selection in Bioinformatics
    Feature Selection for Genomic Data Analysis (Lei Yu)
        Redundancy-Based Feature Selection
        Empirical Study
    A Feature Generation Algorithm with Applications to Biological Sequence Classification (Rezarta Islamaj Dogan, Lise Getoor, and W. John Wilbur)
        Splice-Site Prediction
        Feature Generation Algorithm
        Experiments and Discussion
    An Ensemble Method for Identifying Robust Features for Biomarker Discovery (Diana Chan, Susan M. Bridges, and Shane C. Burgess)
        Biomarker Discovery from Proteome Profiles
        Challenges of Biomarker Identification
        Ensemble Method for Feature Selection
        Feature Selection Ensemble
        Results and Discussion
    Model Building and Feature Selection with Genomic Data (Hui Zou and Trevor Hastie)
        Ridge Regression, Lasso, and Bridge
        Drawbacks of the Lasso
        The Elastic Net
        The Elastic-Net Penalized SVM
        Sparse Eigen-Genes

Reviews


This book is a really comprehensive review of the modern techniques designed for feature selection in very large datasets. Dozens of algorithms and their comparisons in experiments with synthetic and real data are presented, which can be very helpful to researchers and students working with large data stores.
—Stan Lipovetsky, Technometrics, November 2010

Overall, we enjoyed reading this book. It presents state-of-the-art guidance and tutorials on methodologies and algorithms in computational methods of feature selection. Enhanced by the editors' insights, and based on previous work by these leading experts in the field, the book forms another milestone of relevant research and development in feature selection.
—Longbing Cao and David Taniar, IEEE Intelligent Informatics Bulletin, 2008, Vol. 99, No. 99