Clustering in Bioinformatics and Drug Discovery

1st Edition

By John David MacCuish, Norah E. MacCuish

CRC Press

244 pages | 63 B/W Illus.

Paperback: 9781138374232
pub: 2019-07-02
Hardback: 9781439816783
pub: 2010-11-15
eBook (VitalSource): 9780429131301
pub: 2010-11-15

Description

With a DVD of color figures, Clustering in Bioinformatics and Drug Discovery provides an expert guide on extracting the most pertinent information from pharmaceutical and biomedical data. It offers a concise overview of common and recent clustering methods used in bioinformatics and drug discovery.

Setting the stage for subsequent material, the first three chapters of the book introduce statistical learning theory, exploratory data analysis, clustering algorithms, different types of data, graph theory, and various clustering forms. In the following chapters on partitional, cluster sampling, and hierarchical algorithms, the book provides readers with enough detail to obtain a basic understanding of cluster analysis for bioinformatics and drug discovery. The remaining chapters cover more advanced methods, such as hybrid and parallel algorithms, and issues that arise with specific types of data, such as asymmetry and ambiguity, along with validation measures and visualization.

This book explores the application of cluster analysis in the areas of bioinformatics and cheminformatics as they relate to drug discovery. Clarifying the use and misuse of clustering methods, it helps readers understand the relative merits of these methods and evaluate results so that useful hypotheses can be developed and tested.

Reviews

John trained in computer science and has been involved with data mining and statistical analysis; Norah trained as a theoretical physical chemist and has mostly worked for pharmaceutical companies on drug discovery. They run a company that merges their fields, and it is that overlap that they describe here. They explain how cluster analysis, an exploratory data analysis tool, is used in bioinformatics and cheminformatics as they relate to drug discovery. The goal is for practitioners to be aware of the relative merits of clustering methods with the data they have at hand.

SciTech Book News, February 2011

… In this volume, the authors present sufficient options so that the user can choose the appropriate method for their data. … Practitioners in the pharmaceutical industry need an expert guide, which the authors of this book provide, to extract the most information from their data. Those of us who learned their clustering from Anderberg, Sokal and Sneath, and Willett now have a valuable additional resource suitable for the 21st century.

—From the Foreword by John Bradshaw, Barley, Hertfordshire, UK

Table of Contents

Introduction
    History
    Bioinformatics and Drug Discovery
    Statistical Learning Theory and Exploratory Data Analysis
    Clustering Algorithms
    Computational Complexity

Data
    Types
    Normalization and Scaling
    Transformations
    Formats
    Data Matrices
    Measures of Similarity
    Proximity Matrices
    Symmetric Matrices
    Dimensionality, Components, Discriminants
    Graph Theory

Clustering Forms
    Partitional
    Hierarchical
    Mixture Models
    Sampling
    Overlapping
    Fuzzy
    Self-Organizing
    Hybrids

Partitional Algorithms
    K-Means
    Jarvis–Patrick
    Spectral Clustering
    Self-Organizing Maps

Cluster Sampling Algorithms
    Leader Algorithms
    Taylor–Butina Algorithm

Hierarchical Algorithms
    Agglomerative
    Divisive

Hybrid Algorithms
    Self-Organizing Tree Algorithm
    Divisive Hierarchical K-Means
    Exclusion Region Hierarchies
    Biclustering

Asymmetry
    Measures
    Algorithms

Ambiguity
    Discrete Valued Data Types
    Precision
    Ties in Proximity
    Measure Probability and Distributions
    Algorithm Decision Ambiguity
    Overlapping Clustering Algorithms Based on Ambiguity

Validation
    Validation Measures
    Visualization
    Example

Large Scale and Parallel Algorithms
    Leader and Leader-Follower Algorithms
    Taylor–Butina
    K-Means and Variants
    Examples

Appendices

Bibliography

A Glossary and Exercises appear at the end of each chapter.

About the Authors

John D. MacCuish is the founder and president of Mesa Analytics & Computing, Inc. He has co-authored several software patents and has worked on many image processing, data mining, and statistical modeling applications, including IRS fraud detection, credit card fraud detection, and automated reasoning systems for drug discovery.

Norah E. MacCuish is the chief science officer of Mesa Analytics & Computing, Inc., where she acts as a consultant in the areas of drug design and compound acquisition and as a developer of commercial chemical information software products. She earned her Ph.D. in theoretical physical chemistry from Cornell University.

About the Series

Chapman & Hall/CRC Mathematical and Computational Biology

Subject Categories

BISAC Subject Codes/Headings:
MAT029000: MATHEMATICS / Probability & Statistics / General
MED072000: MEDICAL / Pharmacy
SCI008000: SCIENCE / Life Sciences / Biology / General