Semisupervised Learning for Computational Linguistics

1st Edition

By Steven Abney

Chapman and Hall/CRC

320 pages | 97 B/W Illus.

Purchasing Options ($ = USD):
Hardback: 9781584885597
pub: 2007-09-17
$115.00
eBook (VitalSource): 9780429189241
pub: 2007-09-17
from $28.98



Description

The rapid advancement in the theoretical understanding of statistical and machine learning methods for semisupervised learning has made it difficult for nonspecialists to keep up to date in the field. Providing a broad, accessible treatment of the theory as well as linguistic applications, Semisupervised Learning for Computational Linguistics offers self-contained coverage of semisupervised methods that includes background material on supervised and unsupervised learning.

The book presents a brief history of semisupervised learning and its place in the spectrum of learning methods before moving on to discuss well-known natural language processing methods, such as self-training and co-training. It then centers on machine learning techniques, including the boundary-oriented methods of perceptrons, boosting, support vector machines (SVMs), and the null-category noise model. In addition, the book covers clustering, the expectation-maximization (EM) algorithm, related generative methods, and agreement methods. It concludes with the graph-based method of label propagation as well as a detailed discussion of spectral methods.

Taking an intuitive approach to the material, this lucid book facilitates the application of semisupervised learning methods to natural language processing and provides the framework and motivation for a more systematic study of machine learning.

Reviews

"…I would have loved to have had this book when I started working as a computational linguist … The book is well laid out, enjoyable to read, and the formulae aesthetically presented … The book does a very amicable job of being self-contained given the number of subjects and size of the book. I would recommend this book to mathematicians, statisticians, and libraries alike."

CHOICE, February 2009

"However when it works, it works well, and whereas the book provides great breadth, but little depth, it will be a useful springboard for the beginning student."

– Chris J.C. Burges, Microsoft Research, in Journal of the American Statistical Association, June 2009, Vol. 104, No. 486

Table of Contents

INTRODUCTION

A brief history

Semisupervised learning

Organization and assumptions

SELF-TRAINING AND CO-TRAINING

Classification

Self-training

Co-training

APPLICATIONS OF SELF-TRAINING AND CO-TRAINING

Part-of-speech tagging

Information extraction

Parsing

Word senses

CLASSIFICATION

Two simple classifiers

Abstract setting

Evaluating detectors and classifiers that abstain

Binary classifiers and ECOC

MATHEMATICS FOR BOUNDARY-ORIENTED METHODS

Linear separators

The gradient

Constrained optimization

BOUNDARY-ORIENTED METHODS

The perceptron

Game self-teaching

Boosting

Support vector machines (SVMs)

Null-category noise model

CLUSTERING

Cluster and label

Clustering concepts

Hierarchical clustering

Self-training revisited

Graph mincut

Label propagation

Bibliographic notes

GENERATIVE MODELS

Gaussian mixtures

The EM algorithm

AGREEMENT CONSTRAINTS

Co-training

Agreement-based self-teaching

Random fields

Bibliographic notes

PROPAGATION METHODS

Label propagation

Random walks

Harmonic functions

Fluids

Computing the solution

Graph mincuts revisited

Bibliographic notes

MATHEMATICS FOR SPECTRAL METHODS

Some basic concepts

Eigenvalues and eigenvectors

Eigenvalues and the scaling effects of a matrix

Bibliographic notes

SPECTRAL METHODS

Simple harmonic motion

Spectra of matrices and graphs

Spectral clustering

Spectral methods for semisupervised learning

Bibliographic notes

BIBLIOGRAPHY

INDEX

About the Series

Chapman & Hall/CRC Computer Science & Data Analysis


Subject Categories

BISAC Subject Codes/Headings:
BUS061000: BUSINESS & ECONOMICS / Statistics
COM000000: COMPUTERS / General