1st Edition

Data Science for Infectious Disease Data Analytics: An Introduction with R

By Lily Wang Copyright 2023
    419 Pages 129 B/W Illustrations
    by Chapman & Hall

    Data Science for Infectious Disease Data Analytics: An Introduction with R provides an overview of modern data science tools and methods that have been developed specifically to analyze infectious disease data. With a quick start guide to epidemiological data visualization and analysis in R, this book bridges the gap between academia and practice, providing many lively, instructive data analysis examples using up-to-date data, such as data on the newly discovered coronavirus disease (COVID-19).

    The primary emphasis of this book is on the data science procedures used in epidemiological studies, including data wrangling, visualization, interpretation, predictive modeling, and inference. These procedures are of immense importance given the increasingly diverse and nonexperimental data found across a wide range of fields. The knowledge and skills readers gain from this book are also transferable to other areas, such as public health, business analytics, environmental studies, or spatio-temporal data visualization and analysis in general.

    Aimed at readers with an undergraduate knowledge of mathematics and statistics, this book is an ideal introduction to the development and implementation of data science in epidemiology.


    • Describes the entire data science procedure by which infectious disease data are collected, curated, visualized, and fed to predictive models, which facilitates effective communication between data sources, scientists, and decision-makers.
    • Explains practical concepts of infectious disease data and provides particular data science perspectives.
    • Provides an overview of the unique features and issues of infectious disease data and how they impact epidemic modeling and projection.
    • Introduces various classes of models and state-of-the-art learning methods to analyze infectious disease data, with valuable insights on how different models and methods could be connected.

    Chapter 1 Introduction

    Chapter 2 Data Wrangling

    Chapter 3 Data Visualization with R Package “ggplot2”

    Chapter 4 Interactive Visualization

    Chapter 5 R Shiny

    Chapter 6 Interactive Geospatial Visualization

    Chapter 7 Epidemic Modeling

    Chapter 8 Compartment Models

    Chapter 9 Time Series Analysis of Infectious Disease Data

    Chapter 10 Regression Methods

    Chapter 11 Neural Networks

    Chapter 12 Hybrid Models

    Appendix A

    Appendix B

    Appendix C


    Dr. Lily Wang is a tenured professor of statistics at George Mason University. She earned her PhD in statistics from Michigan State University in 2007. Before joining Mason in 2021, she was on the faculty of Iowa State University (2014-2021) and the University of Georgia (2007-2014). Her primary research areas include non/semi-parametric modeling and inference, statistical learning of data objects with complex features, methodologies for functional data, spatiotemporal data, imaging, and general issues related to data science and big data analytics. Dr. Wang is a fellow of both the Institute of Mathematical Statistics and the American Statistical Association and an Elected Member of the International Statistical Institute. She currently serves on the editorial boards of the Journal of the Royal Statistical Society, Series B, the Journal of Nonparametric Statistics, and Statistical Analysis and Data Mining.