1st Edition

Surrogates: Gaussian Process Modeling, Design, and Optimization for the Applied Sciences

By Robert B. Gramacy Copyright 2020
    560 Pages 204 Color Illustrations
    by Chapman & Hall

    Surrogates: a graduate textbook, or professional handbook, on topics at the interface between machine learning, spatial statistics, computer simulation, meta-modeling (i.e., emulation), design of experiments, and optimization. Experimentation through simulation, "human out-of-the-loop" statistical support (focusing on the science), management of dynamic processes, online and real-time analysis, automation, and practical application are at the forefront.

    Topics include:

    • Gaussian process (GP) regression for flexible nonparametric and nonlinear modeling.
    • Applications to uncertainty quantification, sensitivity analysis, calibration of computer models to field data, sequential design/active learning and (blackbox/Bayesian) optimization under uncertainty.
    • Advanced topics include treed partitioning, local GP approximation, modeling of simulation experiments (e.g., agent-based models) with coupled nonlinear mean and variance (heteroskedastic) models.
    • Treatment appreciates historical response surface methodology (RSM) and canonical examples, but emphasizes contemporary methods and implementation in R at modern scale.
    • Rmarkdown facilitates a fully reproducible tour, complete with motivation from, application to, and illustration with, compelling real-data examples.

    Presentation targets numerically competent practitioners in engineering, physical, and biological sciences. Writing is statistical in form, but the subjects are not about statistics. Rather, they’re about prediction and synthesis under uncertainty; about visualization and information, design and decision making, computing and clean code.

    1 Historical Perspective
    2 Four Motivating Datasets
    3 Steepest Ascent and Ridge Analysis
    4 Space-filling Design
    5 Gaussian Process Regression
    6 Model-Based Design for GPs
    7 Optimization
    8 Calibration and Sensitivity
    9 GP Fidelity and Scale
    10 Heteroskedasticity
    Appendix A Numerical Linear Algebra for Fast GPs
    Appendix B An Experiment Game


    Robert B. Gramacy is a professor of Statistics in the College of Science at Virginia Tech. Research interests include Bayesian modeling methodology, statistical computing, Monte Carlo inference, nonparametric regression, sequential design, and optimization under uncertainty. Bobby enjoys cycling and ice hockey, and watching his kids grow up too fast.

    "The coverage of this book is unique and important. It focuses on a current area at the edge of applied mathematics and statistics, a domain that really should be substantially better-developed. For researchers and students who already have a solid foundation in statistics and familiarity with R, and want to know more about how statistics can be used in the approximation of complex functions and numerical optimization (i.e. computer experiments), this should be a welcome resource."
    -Max Morris, Iowa State University, USA

    “This book is a fantastic exploration of Gaussian process surrogates and a variety of applications to which they have been utilized. This approach is rapidly expanding in both the statistical and machine learning communities. I particularly enjoyed the applied focus of this book and the ease with which the author enables the reader to “follow along”, by providing code for each example discussed. In my view, the technical content of the book is well-chosen, and the flow of material should be very well-received by the readership.”
    -Brian J. Williams, Scientist, Los Alamos National Laboratory

    "This book offers a good coverage of Gaussian Process for computer simulation experiments. [. . .] Reading the book for two hours you already forgot the fear you had when you first opened this 543-page long book that is impressively titled as ‘surrogate’. The accessible R examples, the tongue-in-the-cheek tone, the pursuit of simplicity (simple but not simpler), greatly reduced the distance between the reader and the system of knowledge presented in this book. On the other hand, the complexity of the subject matter is never lost but a sense of appreciation of the complexity is enhanced during reading, and unsolved mysteries remain open. Is this the reason why Newton’s ironically long and complicated comment is quoted in the very beginning of the preface to establish the notion of simplicity? At any rate, there are invitations everywhere in the book that are empowering the readers, not just to use the codes for whatever imminent practical tasks at hand, but also embark on methodological pursuit in future to solve the unsolved mysteries. This is a great book for PhD students for sure, but also a good entry point for anyone who opens the book in a hope to readily use some methods as well."
    -Shuai Huang, Journal of Quality Technology

    "At 543 pages in length, the book’s coverage is exhaustive, covering a wide variety of well-chosen theoretical and practical topics in computer modeling. The book does a fantastic job exploring the big topics in computer modeling, including prediction/emulation, uncertainty quantification, calibration, sensitivity analysis, and the sequential optimization of computer experiments. [. . .] In particular, the amount of reproducible code the author provides is really where the book shines. By far the greatest feature of this book is the amount of effort that the author has put into making the concepts understandable via available R code. Almost no equation in the book comes without an accompanying snippet of example code in R. Rather than have an appendix of code in the back of the book, the author wisely peppers the code upfront throughout the text of the chapters making the formulas and examples (down to even the figures) in the book easy to follow along, understand, and replicate. In fact, the entire content of the book is reproducible given the author’s choice to use Rmarkdown for all of the writing. Stitching all of the Rmarkdown files together using bookdown, the author has also made the book and code both easily accessible and widely available to readers (see https://bookdown.org/rbg/surrogates/). This style of writing is intentional and makes it both easy and enjoyable for the reader to follow along. The author has clearly made it a priority to make the book both appealing and useful to its reader. The author understands that a significant portion the readership of the book will be practitioners and thus has made the book very easy to understand and use, while not sacrificing quality of material. As such, the book has the potential to be the "go to" reference for people entering the area. Likewise, the book would serve well as a text for a class (likely a graduate course) on computer experiments."
    -Tony Pourmohamad, Technometrics

    "In conclusion, Surrogates: Gaussian Process Modeling, Design, and Optimization for the Applied Sciences is a book that is a fusion of response surface methodology and associated problems with Gaussian process modelling. Gramacy covers a lot of ground while being very attentive to the various fields of machine learning and statistics that have considered Gaussian processes. He has synthesised the knowledge about these topics in a very interesting and fresh manner. The book is a great introduction to Gaussian processes and their use on large-scale datasets, along with their application to various problems in design of experiments. The R code provided will allow users of the book to be able to implement these methods quickly in practice. I look forward to future editions of the book."
    -Debashis Ghosh, International Statistical Review