
Computational Methods of Feature Selection

1st Edition

Edited by Huan Liu, Hiroshi Motoda

Chapman and Hall/CRC

440 pages | 91 B/W Illus.

Purchasing Options ($ = USD)
Hardback: 9781584888789
pub: 2007-10-29
$130.00
eBook (VitalSource): 9780429150418
pub: 2007-10-29
from $28.98



Description

Due to increasing demands for dimensionality reduction, research on feature selection has deeply and widely expanded into many fields, including computational statistics, pattern recognition, machine learning, data mining, and knowledge discovery. Highlighting current research issues, Computational Methods of Feature Selection introduces the basic concepts and principles, state-of-the-art algorithms, and novel applications of this tool.

The book begins by exploring unsupervised, randomized, and causal feature selection. It then reports on some recent results of empowering feature selection, including active feature selection, decision-border estimate, the use of ensembles with independent probes, and incremental feature selection. This is followed by discussions of weighting and local methods, such as the ReliefF family, k-means clustering, local feature relevance, and a new interpretation of Relief. The book subsequently covers text classification, a new feature selection score, and both constraint-guided and aggressive feature selection. The final section examines applications of feature selection in bioinformatics, including feature construction as well as redundancy-, ensemble-, and penalty-based feature selection.

Through a clear, concise, and coherent presentation of topics, this volume systematically covers the key concepts, underlying principles, and inventive applications of feature selection, illustrating how this powerful tool can efficiently harness massive, high-dimensional data and turn it into valuable, reliable information.
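To give a flavor of the filter-style feature selection the book surveys, here is a minimal, illustrative sketch: rank features by the absolute Pearson correlation of each feature with the class label and keep the top k. The data, feature names, and helper functions below are hypothetical examples for illustration only, not algorithms or code from the book.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_top_k(features, labels, k):
    """Return the names of the k features most correlated with the labels."""
    scores = {name: abs(pearson(col, labels)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data: f1 tracks the label, f2 is noise, f3 is anti-correlated.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [0.5, 0.4, 0.9, 0.1],
    "f3": [4.0, 3.0, 2.0, 1.0],
}
labels = [1.0, 2.0, 3.0, 4.0]
print(select_top_k(features, labels, 2))  # f1 and f3 both score |r| = 1.0
```

Filter methods like this score each feature independently and cheaply; several chapters in the volume address their main weakness, namely that univariate scores ignore redundancy and interactions among features.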

Reviews

This book is a really comprehensive review of the modern techniques designed for feature selection in very large datasets. Dozens of algorithms and their comparisons in experiments with synthetic and real data are presented, which can be very helpful to researchers and students working with large data stores.

—Stan Lipovetsky, Technometrics, November 2010

Overall, we enjoyed reading this book. It presents state-of-the-art guidance and tutorials on methodologies and algorithms in computational methods in feature selection. Enhanced by the editors' insights, and based on previous work by these leading experts in the field, the book forms another milestone of relevant research and development in feature selection.

—Longbing Cao and David Taniar, IEEE Intelligent Informatics Bulletin, 2008, Vol. 99, No. 99

Table of Contents

PREFACE

Introduction and Background

Less Is More

Huan Liu and Hiroshi Motoda

Background and Basics

Supervised, Unsupervised, and Semi-Supervised Feature Selection

Key Contributions and Organization of the Book

Looking Ahead

Unsupervised Feature Selection

Jennifer G. Dy

Introduction

Clustering

Feature Selection

Feature Selection for Unlabeled Data

Local Approaches

Summary

Randomized Feature Selection

David J. Stracuzzi

Introduction

Types of Randomizations

Randomized Complexity Classes

Applying Randomization to Feature Selection

The Role of Heuristics

Examples of Randomized Selection Algorithms

Issues in Randomization

Summary

Causal Feature Selection

Isabelle Guyon, Constantin Aliferis, and André Elisseeff

Introduction

Classical “Non-Causal” Feature Selection

The Concept of Causality

Feature Relevance in Bayesian Networks

Causal Discovery Algorithms

Examples of Applications

Summary, Conclusions, and Open Problems

Extending Feature Selection

Active Learning of Feature Relevance

Emanuele Olivetti, Sriharsha Veeramachaneni, and Paolo Avesani

Introduction

Active Sampling for Feature Relevance Estimation

Derivation of the Sampling Benefit Function

Implementation of the Active Sampling Algorithm

Experiments

Conclusions and Future Work

A Study of Feature Extraction Techniques Based on Decision Border Estimate

Claudia Diamantini and Domenico Potena

Introduction

Feature Extraction Based on Decision Boundary

Generalities about Labeled Vector Quantizers

Feature Extraction Based on Vector Quantizers

Experiments

Conclusions

Ensemble-Based Variable Selection Using Independent Probes

Eugene Tuv, Alexander Borisov, and Kari Torkkola

Introduction

Tree Ensemble Methods in Feature Ranking

The Algorithm: Ensemble-Based Ranking against Independent Probes

Experiments

Discussion

Efficient Incremental-Ranked Feature Selection in Massive Data

Roberto Ruiz, Jesús S. Aguilar-Ruiz, and José C. Riquelme

Introduction

Related Work

Preliminary Concepts

Incremental Performance over Ranking

Experimental Results

Conclusions

Weighting and Local Methods

Non-Myopic Feature Quality Evaluation with (R)ReliefF

Igor Kononenko and Marko Robnik-Šikonja

Introduction

From Impurity to Relief

ReliefF for Classification and RReliefF for Regression

Extensions

Interpretation

Implementation Issues

Applications

Conclusion

Weighting Method for Feature Selection in k-Means

Joshua Zhexue Huang, Jun Xu, Michael Ng, and Yunming Ye

Introduction

Feature Weighting in k-Means

W-k-Means Clustering Algorithm

Feature Selection

Subspace Clustering with k-Means

Text Clustering

Related Work

Discussions

Local Feature Selection for Classification

Carlotta Domeniconi and Dimitrios Gunopulos

Introduction

The Curse of Dimensionality

Adaptive Metric Techniques

Large Margin Nearest Neighbor Classifiers

Experimental Comparisons

Conclusions

Feature Weighting through Local Learning

Yijun Sun

Introduction

Mathematical Interpretation of Relief

Iterative Relief Algorithm

Extension to Multiclass Problems

Online Learning

Computational Complexity

Experiments

Conclusion

Text Classification and Clustering

Feature Selection for Text Classification

George Forman

Introduction

Text Feature Generators

Feature Filtering for Classification

Practical and Scalable Computation

A Case Study

Conclusion and Future Work

A Bayesian Feature Selection Score Based on Naïve Bayes Models

Susana Eyheramendy and David Madigan

Introduction

Feature Selection Scores

Classification Algorithms

Experimental Settings and Results

Conclusion

Pairwise Constraints-Guided Dimensionality Reduction

Wei Tang and Shi Zhong

Introduction

Pairwise Constraints-Guided Feature Projection

Pairwise Constraints-Guided Co-Clustering

Experimental Studies

Conclusion and Future Work

Aggressive Feature Selection by Feature Ranking

Masoud Makrehchi and Mohamed S. Kamel

Introduction

Feature Selection by Feature Ranking

Proposed Approach to Reducing Term Redundancy

Experimental Results

Summary

Feature Selection in Bioinformatics

Feature Selection for Genomic Data Analysis

Lei Yu

Introduction

Redundancy-Based Feature Selection

Empirical Study

Summary

A Feature Generation Algorithm with Applications to Biological Sequence Classification

Rezarta Islamaj Dogan, Lise Getoor, and W. John Wilbur

Introduction

Splice-Site Prediction

Feature Generation Algorithm

Experiments and Discussion

Conclusions

An Ensemble Method for Identifying Robust Features for Biomarker Discovery

Diana Chan, Susan M. Bridges, and Shane C. Burgess

Introduction

Biomarker Discovery from Proteome Profiles

Challenges of Biomarker Identification

Ensemble Method for Feature Selection

Feature Selection Ensemble

Results and Discussion

Conclusion

Model Building and Feature Selection with Genomic Data

Hui Zou and Trevor Hastie

Introduction

Ridge Regression, Lasso, and Bridge

Drawbacks of the Lasso

The Elastic Net

The Elastic-Net Penalized SVM

Sparse Eigen-Genes

Summary

INDEX

About the Series

Chapman & Hall/CRC Data Mining and Knowledge Discovery Series


Subject Categories

BISAC Subject Codes/Headings:
BUS061000
BUSINESS & ECONOMICS / Statistics
COM021030
COMPUTERS / Database Management / Data Mining
MAT021000
MATHEMATICS / Number Systems