1st Edition

Computational Methods of Feature Selection

Edited by Huan Liu and Hiroshi Motoda
    Copyright 2007
    440 Pages, 91 B/W Illustrations
    Published by Chapman & Hall/CRC

    Due to increasing demands for dimensionality reduction, research on feature selection has expanded deeply and widely into many fields, including computational statistics, pattern recognition, machine learning, data mining, and knowledge discovery. Highlighting current research issues, Computational Methods of Feature Selection introduces the basic concepts and principles, state-of-the-art algorithms, and novel applications of this tool.

    The book begins by exploring unsupervised, randomized, and causal feature selection. It then reports on some recent results of empowering feature selection, including active feature selection, decision-border estimate, the use of ensembles with independent probes, and incremental feature selection. This is followed by discussions of weighting and local methods, such as the ReliefF family, k-means clustering, local feature relevance, and a new interpretation of Relief. The book subsequently covers text classification, a new feature selection score, and both constraint-guided and aggressive feature selection. The final section examines applications of feature selection in bioinformatics, including feature construction as well as redundancy-, ensemble-, and penalty-based feature selection.

    Through a clear, concise, and coherent presentation of topics, this volume systematically covers the key concepts, underlying principles, and inventive applications of feature selection, illustrating how this powerful tool can efficiently harness massive, high-dimensional data and turn it into valuable, reliable information.

    PREFACE
    Introduction and Background
    Less Is More
    Huan Liu and Hiroshi Motoda
    Background and Basics
    Supervised, Unsupervised, and Semi-Supervised Feature Selection
    Key Contributions and Organization of the Book
    Looking Ahead
    Unsupervised Feature Selection
    Jennifer G. Dy
    Introduction
    Clustering
    Feature Selection
    Feature Selection for Unlabeled Data
    Local Approaches
    Summary
    Randomized Feature Selection
    David J. Stracuzzi
    Introduction
    Types of Randomizations
    Randomized Complexity Classes
    Applying Randomization to Feature Selection
    The Role of Heuristics
    Examples of Randomized Selection Algorithms
    Issues in Randomization
    Summary
    Causal Feature Selection
    Isabelle Guyon, Constantin Aliferis, and André Elisseeff
    Introduction
    Classical “Non-Causal” Feature Selection
    The Concept of Causality
    Feature Relevance in Bayesian Networks
    Causal Discovery Algorithms
    Examples of Applications
    Summary, Conclusions, and Open Problems
    Extending Feature Selection
    Active Learning of Feature Relevance
    Emanuele Olivetti, Sriharsha Veeramachaneni, and Paolo Avesani
    Introduction
    Active Sampling for Feature Relevance Estimation
    Derivation of the Sampling Benefit Function
    Implementation of the Active Sampling Algorithm
    Experiments
    Conclusions and Future Work
    A Study of Feature Extraction Techniques Based on Decision Border Estimate
    Claudia Diamantini and Domenico Potena
    Introduction
    Feature Extraction Based on Decision Boundary
    Generalities about Labeled Vector Quantizers
    Feature Extraction Based on Vector Quantizers
    Experiments
    Conclusions
    Ensemble-Based Variable Selection Using Independent Probes
    Eugene Tuv, Alexander Borisov, and Kari Torkkola
    Introduction
    Tree Ensemble Methods in Feature Ranking
    The Algorithm: Ensemble-Based Ranking against Independent Probes
    Experiments
    Discussion
    Efficient Incremental-Ranked Feature Selection in Massive Data
    Roberto Ruiz, Jesús S. Aguilar-Ruiz, and José C. Riquelme
    Introduction
    Related Work
    Preliminary Concepts
    Incremental Performance over Ranking
    Experimental Results
    Conclusions
    Weighting and Local Methods
    Non-Myopic Feature Quality Evaluation with (R)ReliefF
    Igor Kononenko and Marko Robnik-Šikonja
    Introduction
    From Impurity to Relief
    ReliefF for Classification and RReliefF for Regression
    Extensions
    Interpretation
    Implementation Issues
    Applications
    Conclusion
    Weighting Method for Feature Selection in k-Means
    Joshua Zhexue Huang, Jun Xu, Michael Ng, and Yunming Ye
    Introduction
    Feature Weighting in k-Means
    W-k-Means Clustering Algorithm
    Feature Selection
    Subspace Clustering with k-Means
    Text Clustering
    Related Work
    Discussions
    Local Feature Selection for Classification
    Carlotta Domeniconi and Dimitrios Gunopulos
    Introduction
    The Curse of Dimensionality
    Adaptive Metric Techniques
    Large Margin Nearest Neighbor Classifiers
    Experimental Comparisons
    Conclusions
    Feature Weighting through Local Learning
    Yijun Sun
    Introduction
    Mathematical Interpretation of Relief
    Iterative Relief Algorithm
    Extension to Multiclass Problems
    Online Learning
    Computational Complexity
    Experiments
    Conclusion
    Text Classification and Clustering
    Feature Selection for Text Classification
    George Forman
    Introduction
    Text Feature Generators
    Feature Filtering for Classification
    Practical and Scalable Computation
    A Case Study
    Conclusion and Future Work
    A Bayesian Feature Selection Score Based on Naïve Bayes Models
    Susana Eyheramendy and David Madigan
    Introduction
    Feature Selection Scores
    Classification Algorithms
    Experimental Settings and Results
    Conclusion
    Pairwise Constraints-Guided Dimensionality Reduction
    Wei Tang and Shi Zhong
    Introduction
    Pairwise Constraints-Guided Feature Projection
    Pairwise Constraints-Guided Co-Clustering
    Experimental Studies
    Conclusion and Future Work
    Aggressive Feature Selection by Feature Ranking
    Masoud Makrehchi and Mohamed S. Kamel
    Introduction
    Feature Selection by Feature Ranking
    Proposed Approach to Reducing Term Redundancy
    Experimental Results
    Summary
    Feature Selection in Bioinformatics
    Feature Selection for Genomic Data Analysis
    Lei Yu
    Introduction
    Redundancy-Based Feature Selection
    Empirical Study
    Summary
    A Feature Generation Algorithm with Applications to Biological Sequence Classification
    Rezarta Islamaj Dogan, Lise Getoor, and W. John Wilbur
    Introduction
    Splice-Site Prediction
    Feature Generation Algorithm
    Experiments and Discussion
    Conclusions
    An Ensemble Method for Identifying Robust Features for Biomarker Discovery
    Diana Chan, Susan M. Bridges, and Shane C. Burgess
    Introduction
    Biomarker Discovery from Proteome Profiles
    Challenges of Biomarker Identification
    Ensemble Method for Feature Selection
    Feature Selection Ensemble
    Results and Discussion
    Conclusion
    Model Building and Feature Selection with Genomic Data
    Hui Zou and Trevor Hastie
    Introduction
    Ridge Regression, Lasso, and Bridge
    Drawbacks of the Lasso
    The Elastic Net
    The Elastic-Net Penalized SVM
    Sparse Eigen-Genes
    Summary
    INDEX

    Biography

    Huan Liu, Hiroshi Motoda

    Reviews

    This book is a really comprehensive review of the modern techniques designed for feature selection in very large datasets. Dozens of algorithms and their comparisons in experiments with synthetic and real data are presented, which can be very helpful to researchers and students working with large data stores.
    —Stan Lipovetsky, Technometrics, November 2010

    Overall, we enjoyed reading this book. It presents state-of-the-art guidance and tutorials on methodologies and algorithms in computational methods in feature selection. Enhanced by the editors' insights, and based on previous work by these leading experts in the field, the book forms another milestone of relevant research and development in feature selection.
    —Longbing Cao and David Taniar, IEEE Intelligent Informatics Bulletin, 2008, Vol. 99, No. 99