Discover Novel and Insightful Knowledge from Data Represented as a Graph
Practical Graph Mining with R presents a "do-it-yourself" approach to extracting interesting patterns from graph data. It covers many basic and advanced techniques for the identification of anomalous or frequently recurring patterns in a graph, the discovery of groups or clusters of nodes that share common patterns of attributes and relationships, the extraction of patterns that distinguish one category of graphs from another, and the use of those patterns to predict the category of new graphs.
Hands-On Application of Graph Data Mining
Each chapter in the book focuses on a graph mining task, such as link analysis, cluster analysis, and classification. Through applications using real data sets, the book demonstrates how computational techniques can help solve real-world problems. The applications covered include network intrusion detection, tumor cell diagnostics, face recognition, predictive toxicology, mining metabolic and protein-protein interaction networks, and community detection in social networks.
Develops Intuition through Easy-to-Follow Examples and Rigorous Mathematical Foundations
Every algorithm and example is accompanied by R code. This allows readers to see how the algorithmic techniques correspond to the process of graph data analysis and to use the graph mining techniques in practice. The text also gives a rigorous, formal explanation of the underlying mathematics of each technique.
Makes Graph Mining Accessible to Various Levels of Expertise
Assuming no prior knowledge of mathematics or data mining, this self-contained book is accessible to students, researchers, and practitioners of graph data mining. It is suitable as a primary textbook for graph mining or as a supplement to a standard data mining course. It can also be used as a reference for researchers in computer, information, and computational science as well as a handy guide for data analytics practitioners.
Table of Contents
Introduction Kanchana Padmanabhan, William Hendrix, and Nagiza F. Samatova
Graph Mining Applications
An Introduction to Graph Theory Stephen Ware
What Is a Graph?
Vertices and Edges
Families of Graphs
An Introduction to R Neil Shah
What Is R?
What Can R Do?
Why Use R?
Common R Functions
An Introduction to Kernel Functions John Jenkins
Kernel Methods on Vector Data
Extending Kernel Methods to Graphs
Choosing Suitable Graph Kernel Functions
Kernels in This Book
Link Analysis Arpan Chakraborty, Kevin Wilson, Nathan Green, Shravan Kumar Alur, Fatih Ergin, Karthik Gurumurthy, Romulo Manzano, and Deepti Chinta
Metrics for Analyzing Networks
The PageRank Algorithm
Hyperlink-Induced Topic Search (HITS)
Graph-Based Proximity Measures Kevin A. Wilson, Nathan D. Green, Laxmikant Agrawal, Xibin Gao, Dinesh Madhusoodanan, Brian Riley, and James P. Sigmon
Defining the Proximity of Vertices in Graphs
Evaluating Relatedness Using Neumann Kernels
Frequent Subgraph Mining Brent E. Harrison, Jason C. Smith, Stephen G. Ware, Hsiao-Wei Chen, Wenbin Chen, and Anjali Khatri
About Frequent Subgraph Mining
The gSpan Algorithm
The SUBDUE Algorithm
Mining Frequent Subtrees with SLEUTH
Cluster Analysis Kanchana Padmanabhan, Brent Harrison, Kevin Wilson, Michael L. Warren, Katie Bright, Justin Mosiman, Jayaram Kancherla, Hieu Phung, Benjamin Miller, and Sam Shamseldin
Minimum Spanning Tree Clustering
Shared Nearest Neighbor Clustering
Betweenness Centrality Clustering
Highly Connected Subgraph Clustering
Maximal Clique Enumeration
Clustering Vertices with Kernel k-Means
How to Choose a Clustering Technique
Classification Srinath Ravindran, John Jenkins, Huseyin Sencan, Jay Prakash Goel, Saee Nirgude, Kalindi K. Raichura, Suchetha M. Reddy, and Jonathan S. Tatagiri
Overview of Classification
Classification of Vector Data: Support Vector Machines
Classifying Graphs and Vertices
Dimensionality Reduction Madhuri R. Marri, Lakshmi Ramachandran, Pradeep Murukannaiah, Padmashree Ravindra, Amrita Paul, Da Young Lee, David Funk, Shanmugapriya Murugappan, and William Hendrix
Kernel Principal Component Analysis
Linear Discriminant Analysis
Graph-Based Anomaly Detection Kanchana Padmanabhan, Zhengzhang Chen, Sriram Lakshminarasimhan, Siddarth Shankar Ramaswamy, and Bryan Thomas Richardson
Types of Anomalies
Random Walk Algorithm
Tensor-Based Anomaly Detection Algorithm
Performance Metrics for Graph Mining Tasks Kanchana Padmanabhan and John Jenkins
Supervised Learning Performance Metrics
Unsupervised Learning Performance Metrics
Statistical Significance Techniques
Handling the Class Imbalance Problem in Supervised Learning
Application Domain-Specific Measures
Introduction to Parallel Graph Mining William Hendrix, Mekha Susan Varghese, Nithya Natesan, Kaushik Tirukarugavur Srinivasan, Vinu Balajee, and Yu Ren
Parallel Computing Overview
Embarrassingly Parallel Computation
Calling Parallel Codes in R
Creating Parallel Codes in R Using Rmpi
Practical Issues in Parallel Programming
Exercises and Bibliography appear at the end of each chapter.
Nagiza F. Samatova is an associate professor of computer science at North Carolina State University and a senior research scientist at Oak Ridge National Laboratory.
"The authors provide a tour de force introduction to the different data representations (vectors, matrices), and introduce graph structures and the questions that can be answered with them. ... The book has many strong points. There is a companion website that hosts slide presentations for almost all chapters, as well as the R code needed to run the example code. The impatient reader can start going through the presentations and experimenting with the code right away. The more patient reader can read the book from cover to cover. For many reader categories, this summary of existing relevant work and approaches for data mining graph structures is a welcome addition, for which the authors deserve much praise."
--Radu State, Computing Reviews