1st Edition

Probabilistic Foundations of Statistical Network Analysis

ISBN 9781138630154
Published April 19, 2018 by Chapman and Hall/CRC
256 Pages

USD $51.95



Book Description

Probabilistic Foundations of Statistical Network Analysis presents a fresh and insightful perspective on the fundamental tenets and major challenges of modern network analysis. Its lucid exposition provides necessary background for understanding the essential ideas behind exchangeable and dynamic network models, network sampling, and network statistics such as sparsity and power law, all of which play a central role in contemporary data science and machine learning applications. The book rewards readers with a clear and intuitive understanding of the subtle interplay between basic principles of statistical inference, empirical properties of network data, and technical concepts from probability theory. Its mathematically rigorous, yet non-technical, exposition makes the book accessible to professional data scientists, statisticians, and computer scientists as well as practitioners and researchers in substantive fields. Newcomers and non-quantitative researchers will find its conceptual approach invaluable for developing intuition about technical ideas from statistics and probability, while experts and graduate students will find the book a handy reference for a wide range of new topics, including edge exchangeability, relative exchangeability, graphon and graphex models, and graph-valued Lévy process and rewiring models for dynamic networks.

The author’s incisive commentary supplements these core concepts, challenging the reader to push beyond the current limitations of this emerging discipline. With an approachable exposition and more than 50 open research problems and exercises with solutions, this book is ideal for advanced undergraduate and graduate students interested in modern network analysis, data science, machine learning, and statistics.

Harry Crane is Associate Professor and Co-Director of the Graduate Program in Statistics and Biostatistics and an Associate Member of the Graduate Faculty in Philosophy at Rutgers University. Professor Crane’s research interests cover a range of mathematical and applied topics in network science, probability theory, statistical inference, and mathematical logic. In addition to his technical work on edge and relational exchangeability, relative exchangeability, and graph-valued Markov processes, Prof. Crane’s methods have been applied to domain-specific cybersecurity and counterterrorism problems at the Foreign Policy Research Institute and RAND’s Project AIR FORCE.







Table of Contents




  1. Orientation
    Analogy: Bernoulli trials
    What it is: Graphs vs Networks
    Moving beyond graphs
    How to look at it: Labeling and representation
    Where it comes from: Context
    Making sense of it all: Coherence
    What we’re talking about: Common examples of network data
    Social networks
    Karate club
    Enron email corpus
    Collaboration networks
    Other networks
    Some common scenarios
    Major Open Questions
    Modeling network complexity
    Sampling issues
    Modeling temporal variation
    Chapter synopses and reading guide
    Binary relational data
    Network sampling
    Generative models
    Statistical modeling paradigm
    Vertex exchangeable models
    Getting beyond graphons
    Relatively exchangeable models
    Edge exchangeable models
    Relationally exchangeable models
    Dynamic network models

  2. Binary relational data
    Scenario: Patterns in international trade
    Summarizing network structure
    Dyad independence model
    Exponential random graph models (ERGMs)
    Scenario: Friendships in a high school
    Network inference under sampling
    Further reading

  3. Network sampling
    Opening example
    Consistency under selection
    Consistency in the p model
    Significance of sampling consistency
    Toward a coherent framework of network modeling
    Selection from sparse networks
    Scenario: Ego networks in high school friendships
    Network sampling schemes
    Relational sampling
    Edge sampling
    Hyperedge sampling
    Path sampling
    Snowball sampling
    Units of observation
    What is the sample size?
    Consistency under subsampling
    Further reading

  4. Generative models
    Specification of generative models
    Preferential Attachment model
    Random walk models
    Erdős–Rényi–Gilbert model
    General sequential construction
    Further reading

  5. Statistical modeling paradigm
    The quest for coherence
    An incoherent model
    What is a statistical model?
    Population model
    Finite sample models
    Coherence in sampling models
    Coherence in generative models
    Statistical implications of coherence
    Erdős–Rényi–Gilbert model under selection sampling
    ERGM with selection sampling
    Erdős–Rényi–Gilbert model under edge sampling
    Invariance principles
    Further reading

  6. Vertex exchangeable models
    Preliminaries: Formal definition of exchangeability
    Implications of exchangeability
    Finite exchangeable random graphs
    Exchangeable ERGMs
    Countable exchangeable models
    Graphon models
    Generative model
    Exchangeability of graphon models
    Aldous–Hoover theorem
    Graphons and vertex exchangeability
    Subsampling description
    Viability of graphon models
    Implication: Dense structure
    Implication: Representative sampling
    The emergence of graphons
    Potential benefits of graphon models
    Connection to de Finetti’s theorem
    Graphon estimation
    Further reading

  7. Getting beyond graphons
    Something must go
    Sparse graphon models
    Completely random measures and graphex models
    Scenario: Formation of Facebook friendships
    Network representation
    Interpretation of vertex labels
    Exchangeable point process models
    Graphex representation
    Sampling context
    Further discussion
    Variants of invariance
    Relatively exchangeable models
    Edge exchangeable models
    Relationally exchangeable models

  8. Relatively exchangeable models
    Scenario: heterogeneity in social networks
    Stochastic blockmodels
    Generalized blockmodels
    Community detection and Bayesian versions of SBM
    Beyond SBMs and community detection
    Relative exchangeability with respect to another network
    Scenario: high school social network revisited
    Exchangeability relative to a social network
    Lack of interference
    Label equivariance
    Latent space models
    Relatively exchangeable random graphs
    Relatively exchangeable f-processes
    Relative exchangeability under arbitrary sampling
    Final remarks and further reading

  9. Edge exchangeable models
    Scenario: Monitoring phone calls
    Edge-centric view
    Edge exchangeability
    Interaction propensity process
    Characterizing edge exchangeable random graphs
    Vertex components models
    Stick-breaking constructions for vertex components
    Hollywood model
    The Hollywood process
    Role of parameters in the Hollywood model
    Statistical properties of the Hollywood model
    Prediction from the Hollywood model
    Contexts for edge sampling
    Concluding remarks
    Connection to graphex models
    Further reading

  10. Relationally exchangeable models
    Sampling multiway interactions (hyperedges)
    Collaboration networks
    Coauthorship networks
    Representing multiway interaction networks
    Hyperedge exchangeability
    Interaction propensity process
    Characterization for hyperedge exchangeable networks
    Scenario: Traceroute sampling of Internet topology
    Representing the data
    Path exchangeability
    Relational exchangeability
    General Hollywood model
    Markovian vertex components models
    Concluding remarks and further reading

  11. Dynamic network models
    Scenario: Dynamics in social media activity
    Modeling considerations
    Network dynamics: Markov property
    Modeling the initial state
    Is the Markov property a good assumption?
    Temporal Exponential Random Graph Model (TERGM)
    Projectivity and sampling
    Example: a TERGM for triangle counts
    Projective Markov property
    Rewiring chains and Markovian graphons
    Exchangeable rewiring processes (Markovian graphons)
    Graph-valued Lévy processes
    Inference from graph-valued Lévy processes
    Continuous time processes
    Poissonian construction
    Further reading





"I believe this book can serve both as a reference and textbook, but primarily should be seen as a textbook for a course built around foundational aspects of statistical modeling for network data. Most prior texts I am aware of focus on statistical methods within existing network models. I really like that this book helps the reader understand the statistical implications of choice of model, both in terms of 'coherence' and sampling. Most prior work presents the field of statistical network analysis as a basket of models from which one chooses their preferred method. Crane takes a more foundational approach - showing how choice of model leads to implicit statistical assumptions that too often go unspoken."
~Walter Dempsey, Harvard University

"A set of useful exercises are given in almost all chapters that assists in understanding the topics and – what is very useful and much appreciated – the author also gives their solutions. These are not only a great tool because they allow solutions to be checked, but because somehow they are a complement of the text. Moreover, they provide the opportunity to dive thoroughly into the topics. Finally, the author not only proposes these exercises in each chapter, he also proposes problems that are open research questions. These are very nice inputs for researchers who are working in the field. And in this way, the author opens a door to further research and establishes a dialog between him and the
~Silvano Romano, ISCB Newsletter