496 Pages, 168 B/W Illustrations
    by Chapman & Hall

    Discover Novel and Insightful Knowledge from Data Represented as a Graph
    Practical Graph Mining with R presents a "do-it-yourself" approach to extracting interesting patterns from graph data. It covers many basic and advanced techniques for the identification of anomalous or frequently recurring patterns in a graph, the discovery of groups or clusters of nodes that share common patterns of attributes and relationships, the extraction of patterns that distinguish one category of graphs from another, and the use of those patterns to predict the category of new graphs.

    Hands-On Application of Graph Data Mining
    Each chapter in the book focuses on a graph mining task, such as link analysis, cluster analysis, and classification. Through applications using real data sets, the book demonstrates how computational techniques can help solve real-world problems. The applications covered include network intrusion detection, tumor cell diagnostics, face recognition, predictive toxicology, mining metabolic and protein-protein interaction networks, and community detection in social networks.

    Develops Intuition through Easy-to-Follow Examples and Rigorous Mathematical Foundations
    Every algorithm and example is accompanied by R code. This allows readers to see how the algorithmic techniques correspond to the process of graph data analysis and to use the graph mining techniques in practice. The text also gives a rigorous, formal explanation of the underlying mathematics of each technique.

    Makes Graph Mining Accessible to Various Levels of Expertise
    Assuming no prior knowledge of mathematics or data mining, this self-contained book is accessible to students, researchers, and practitioners of graph data mining. It is suitable as a primary textbook for graph mining or as a supplement to a standard data mining course. It can also be used as a reference for researchers in computer, information, and computational science as well as a handy guide for data analytics practitioners.

    Introduction Kanchana Padmanabhan, William Hendrix, and Nagiza F. Samatova
    Graph Mining Applications
    Book Structure

    An Introduction to Graph Theory Stephen Ware
    What Is a Graph?
    Vertices and Edges
    Comparing Graphs
    Directed Graphs
    Families of Graphs
    Weighted Graphs
    Graph Representations

    An Introduction to R Neil Shah
    What Is R?
    What Can R Do?
    R Packages
    Why Use R?
    Common R Functions
    R Installation

    An Introduction to Kernel Functions John Jenkins
    Kernel Methods on Vector Data
    Extending Kernel Methods to Graphs
    Choosing Suitable Graph Kernel Functions
    Kernels in This Book

    Link Analysis Arpan Chakraborty, Kevin Wilson, Nathan Green, Shravan Kumar Alur, Fatih Ergin, Karthik Gurumurthy, Romulo Manzano, and Deepti Chinta
    Analyzing Links
    Metrics for Analyzing Networks
    The PageRank Algorithm
    Hyperlink-Induced Topic Search (HITS)
    Link Prediction

    Graph-Based Proximity Measures Kevin A. Wilson, Nathan D. Green, Laxmikant Agrawal, Xibin Gao, Dinesh Madhusoodanan, Brian Riley, and James P. Sigmon
    Defining the Proximity of Vertices in Graphs
    Evaluating Relatedness Using Neumann Kernels

    Frequent Subgraph Mining Brent E. Harrison, Jason C. Smith, Stephen G. Ware, Hsiao-Wei Chen, Wenbin Chen, and Anjali Khatri
    About Frequent Subgraph Mining
    The gSpan Algorithm
    The SUBDUE Algorithm
    Mining Frequent Subtrees with SLEUTH

    Cluster Analysis Kanchana Padmanabhan, Brent Harrison, Kevin Wilson, Michael L. Warren, Katie Bright, Justin Mosiman, Jayaram Kancherla, Hieu Phung, Benjamin Miller, and Sam Shamseldin
    Minimum Spanning Tree Clustering
    Shared Nearest Neighbor Clustering
    Betweenness Centrality Clustering
    Highly Connected Subgraph Clustering
    Maximal Clique Enumeration
    Clustering Vertices with Kernel k-Means
    How to Choose a Clustering Technique

    Classification Srinath Ravindran, John Jenkins, Huseyin Sencan, Jay Prakash Goel, Saee Nirgude, Kalindi K. Raichura, Suchetha M. Reddy, and Jonathan S. Tatagiri
    Overview of Classification
    Classification of Vector Data: Support Vector Machines
    Classifying Graphs and Vertices

    Dimensionality Reduction Madhuri R. Marri, Lakshmi Ramachandran, Pradeep Murukannaiah, Padmashree Ravindra, Amrita Paul, Da Young Lee, David Funk, Shanmugapriya Murugappan, and William Hendrix
    Multidimensional Scaling
    Kernel Principal Component Analysis
    Linear Discriminant Analysis

    Graph-Based Anomaly Detection Kanchana Padmanabhan, Zhengzhang Chen, Sriram Lakshminarasimhan, Siddarth Shankar Ramaswamy, and Bryan Thomas Richardson
    Types of Anomalies
    Random Walk Algorithm
    GBAD Algorithm
    Tensor-Based Anomaly Detection Algorithm

    Performance Metrics for Graph Mining Tasks Kanchana Padmanabhan and John Jenkins
    Supervised Learning Performance Metrics
    Unsupervised Learning Performance Metrics
    Optimizing Metrics
    Statistical Significance Techniques
    Model Comparison
    Handling the Class Imbalance Problem in Supervised Learning
    Other Issues
    Application Domain-Specific Measures

    Introduction to Parallel Graph Mining William Hendrix, Mekha Susan Varghese, Nithya Natesan, Kaushik Tirukarugavur Srinivasan, Vinu Balajee, and Yu Ren
    Parallel Computing Overview
    Embarrassingly Parallel Computation
    Calling Parallel Codes in R
    Creating Parallel Codes in R Using Rmpi
    Practical Issues in Parallel Programming


    Exercises and Bibliography appear at the end of each chapter.


    Nagiza F. Samatova is an associate professor of computer science at North Carolina State University and a senior research scientist at Oak Ridge National Laboratory.

    "The authors provide a tour de force introduction to the different data representations (vectors, matrices), and introduce graph structures and the questions that can be answered with them. ... The book has many strong points. There is a companion website that hosts slide presentations for almost all chapters, as well as the R code needed to run the example code. The impatient reader can start going through the presentations and experimenting with the code right away. The more patient reader can read the book from cover to cover. For many reader categories, this summary of existing relevant work and approaches for data mining graph structures is a welcome addition, for which the authors deserve much praise."
    --Radu State, Computing Reviews