1st Edition

An Introduction to Spatial Data Science with GeoDa Volume 1 and 2

    696 Pages 485 Color & 74 B/W Illustrations
    by Chapman & Hall

    Volume 1 is the first in a two-volume series that introduces the field of spatial data science. It offers an accessible overview of the methodology of exploratory spatial data analysis. It also constitutes the definitive user’s guide for the widely adopted GeoDa open source software for spatial analysis. Leveraging a large number of real-world empirical illustrations, readers will gain an understanding of the main concepts and techniques, using dynamic graphics for thematic mapping, statistical graphing, and, most centrally, the analysis of spatial autocorrelation. Key to this analysis is the concept of local indicators of spatial association, pioneered by the author and recently extended to the analysis of multivariate data.

    The focus of the book is on intuitive methods to discover interesting patterns in spatial data. It offers a progression from basic data manipulation through description and exploration, to the identification of clusters and outliers by means of local spatial autocorrelation analysis. A distinctive approach is to spatialize intrinsically non-spatial methods, by means of linking and brushing with a range of map representations, including several that are unique to the GeoDa software. The book also represents the most in-depth treatment of local spatial autocorrelation and its visualization and interpretation by means of GeoDa.

    Volume 2 is the second in the two-volume series. It moves beyond pure data exploration to the organization of observations into meaningful groups, i.e., spatial clustering. This constitutes an important component of so-called unsupervised learning, a major aspect of modern machine learning.

    The book has two distinctive aspects: it explores ways to spatialize classic clustering methods through linked maps and graphs, and it explicitly introduces spatial contiguity constraints into clustering algorithms. Leveraging a large number of real-world empirical illustrations, readers will gain an understanding of the main concepts and techniques and their relative advantages and disadvantages. The book also constitutes the definitive user’s guide for these methods as implemented in the GeoDa open source software for spatial analysis.

    It is organized into three major parts, dealing with dimension reduction (principal components, multidimensional scaling, stochastic neighbor embedding), classic clustering methods (hierarchical clustering, k-means, k-medians, k-medoids, and spectral clustering), and spatially constrained clustering methods (both hierarchical and partitioning). It closes with an assessment of spatial and non-spatial cluster properties.

    The book is intended for readers interested in going beyond simple mapping of geographical data to gain insight into interesting patterns as expressed in spatial clusters of observations. Familiarity with the material in Volume 1 is assumed, especially the analysis of local spatial autocorrelation and the full range of visualization methods.

    Volume 1

    1: Introduction.

    2: Basic Data Operations.

    3: GIS Operations.

    4: Geovisualization.

    5: Statistical Maps.

    6: Maps for Rates.

    7: Univariate and Bivariate Data Exploration.

    8: Multivariate Data Exploration.

    9: Space-Time Exploration.

    10: Contiguity-Based Spatial Weights.

    11: Distance-Based Spatial Weights.

    12: Special Weights Operations.

    13: Spatial Autocorrelation.

    14: Advanced Global Spatial Autocorrelation.

    15: Nonparametric Spatial Autocorrelation.

    16: LISA and Local Moran.

    17: Other Local Spatial Autocorrelation Statistics.

    18: Multivariate Local Spatial Autocorrelation.

    19: LISA for Discrete Variables.

    20: Density-Based Clustering Methods.

    21: Postscript - The Limits of Exploration.



    Volume 2

    1. Introduction

    Part 1: Dimension Reduction

    2. Principal Component Analysis (PCA)

    3. Multidimensional Scaling (MDS)

    4. Stochastic Neighbor Embedding (SNE)

    Part 2: Classic Clustering

    5. Hierarchical Clustering Methods

    6. Partitioning Clustering Methods

    7. Advanced Clustering Methods

    8. Spectral Clustering

    Part 3: Spatial Clustering

    9. Spatializing Classic Clustering Methods

    10. Spatially Constrained Clustering - Hierarchical Methods

    11. Spatially Constrained Clustering - Partitioning Methods

    Part 4: Assessment

    12. Cluster Validation


    Luc Anselin is the Founding Director of the Center for Spatial Data Science at the University of Chicago, where he is also Stein-Freiler Distinguished Service Professor of Sociology and the College, as well as a member of the Committee on Data Science. He is the creator of the GeoDa software and an active contributor to the PySAL Python open source software library for spatial analysis. He has written widely on topics dealing with the methodology of spatial data analysis, including his classic 1988 text on Spatial Econometrics. His work has been recognized by many awards, such as his election to the U.S. National Academy of Sciences and the American Academy of Arts and Sciences.