1st Edition

Time Series for Data Science: Analysis and Forecasting

    528 Pages 272 B/W Illustrations
    by Chapman & Hall

    Data science students and practitioners want forecasts that “work” and do not want to be constrained to a single forecasting strategy. Accordingly, Time Series for Data Science: Analysis and Forecasting discusses ensemble modelling techniques for combining information from several strategies. The book covers time series regression models, exponential smoothing, Holt-Winters forecasting, and neural networks, and it places a particular emphasis on classical ARMA and ARIMA models that is often lacking in other textbooks on the subject.

    This book is an accessible guide: it requires no background in calculus, yet it does not shy away from deeper explanations of the techniques discussed.

    Features:

    • Provides thorough coverage and comparison of a wide array of time series models and methods: exponential smoothing, Holt-Winters, ARMA and ARIMA, deep learning models including RNNs, LSTMs, and GRUs, and ensemble models composed of combinations of these models.
    • Introduces the factor table representation of ARMA and ARIMA models. This representation is not available in any other book at this level and is extremely useful in both practice and pedagogy.
    • Uses real-world examples that can be readily found via web links from sources such as the US Bureau of Statistics, the Department of Transportation, and the World Bank.
    • Includes an accompanying R package that is easy to use and requires little or no previous R experience. The package implements the wide variety of models and methods presented in the book and has tremendous pedagogical value; a short usage sketch follows this list.
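
    To give a feel for that workflow, here is a minimal sketch of how the package might be used: load tswge, plot a built-in R series with the plotts.wge function introduced in Section 1.3.5, and print a factor table for a simple AR(2) model. The AirPassengers data and the factor.wge call (with its phi argument) are assumptions used for illustration, not excerpts from the book.

        # Minimal usage sketch (not from the book). Assumes the CRAN release of
        # tswge; AirPassengers is a monthly series that ships with base R.
        # install.packages("tswge")      # one-time install from CRAN
        library(tswge)

        # Plot the series with plotts.wge, the plotting function covered in
        # Section 1.3.5
        data("AirPassengers")
        plotts.wge(AirPassengers)

        # Print a factor table for an AR(2) model with coefficients 1.6 and -0.9;
        # the factor.wge call and its phi argument are illustrative assumptions
        factor.wge(phi = c(1.6, -0.9))

    If factor.wge behaves as assumed, the printed table lists the factors of the AR polynomial together with their roots and associated system frequencies, which is the factor-table representation highlighted in the features above.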

    1. Working with Data Collected Over Time

    1.1 Introduction

    1.2 Time Series Datasets

          1.2.1 Cyclic Data

          1.2.2 Trends

    1.3 The Programming Language R

          1.3.1 The tswge Time Series Package

          1.3.2 Base R

          1.3.3 Plotting Time Series Data in R

          1.3.4 The ts Object

          1.3.5 The plotts.wge Function in tswge

          1.3.6 Loading Time Series Data into R

          1.3.7 Accessing Time Series Data

    1.4 Dealing with Messy Data

          1.4.1 Preparing Time Series Data for Analysis: Cleaning, Wrangling, and Imputation

    1.5 Concluding Remarks

    Appendix 1

     

    2. Exploring Time Series Data

    2.1 Understanding and Visualizing Data

          2.1.1 Smoothing Time Series Data

          2.1.2 Seasonal Adjustment

    2.2 Forecasting

          2.2.1 Predictive Moving Average Smoother

          2.2.2 Exponential Smoothing

          2.2.3 Holt-Winters Forecasting

          2.2.4 Assessing the Accuracy of Forecasts

    2.3 Concluding Remarks

    Appendix 2

     

    3. Statistical Basics for Time Series Analysis

    3.1 Statistics Basics

          3.1.1 Univariate Data

          3.1.2 Multivariate Data

          3.1.3 Independent vs. Dependent Data

    3.2 Time Series and Realizations

          3.2.1 Multiple Realizations

          3.2.2 The Effect of Realization Length

    3.3 Stationary Time Series

          3.3.1 Plotting the Autocorrelations of a Stationary Process

          3.3.2 Estimating the Parameters of a Stationary Process

    3.4 Concluding Remarks

    Appendix 3

     

    4. The Frequency Domain

    4.1 Trigonometric Review and Terminology

    4.2 The Spectral Density

          4.2.1 Euler’s Formula

          4.2.2 Definition and Properties of the Spectrum and Spectral Density

          4.2.3 Estimating the Spectral Density

    4.3 Smoothing and Filtering

          4.3.1 Types of Filters

          4.3.2 The Butterworth Filter

    4.4 Concluding Remarks

    Appendix 4

     

    5. ARMA Models

    5.1 The Autoregressive Model

           5.1.1 The AR(1) Model

           5.1.2 The AR(2) Model

           5.1.3 The AR(p) Model

           5.1.4 Linear Filters, the General Linear Process, and AR(p) Models

    5.2 Autoregressive-Moving Average (ARMA) Models

          5.2.1 Moving Average Models

          5.2.2 ARMA(p,q) Models

    5.3 Concluding Remarks

    Appendix 5

     

    6. ARMA Fitting and Forecasting

    6.1 Fitting ARMA Models to Data

          6.1.1 Estimating the Parameters of an ARMA(p,q) Model

          6.1.2 ARMA Model Identification

    6.2 Forecasting using an ARMA(p,q) Model

          6.2.1 Forecasting Setting, Notation, and Strategy

          6.2.2 Forecasting using an AR(p) Model

          6.2.3 Basic Forecasting Formula using an ARMA(p,q) Model

          6.2.4 Eventual Forecast Function

          6.2.5 Probability Limits for ARMA Forecasts

          6.2.6 Assessing Forecast Performance

    6.3 Concluding Remarks

    Appendix 6

     

    7. ARIMA, Seasonal, and ARCH/GARCH Models

    7.1 ARIMA(p,d,q) Models

          7.1.1 Properties of ARIMA(p,d,q) Models

          7.1.2 Model Identification and Parameter Estimation of ARIMA(p,d,q) Models

          7.1.3 Forecasting with ARIMA Models

    7.2 Seasonal Models

          7.2.1 Properties of Seasonal Models

          7.2.2 Fitting Seasonal Models to Data

          7.2.3 Forecasting using Seasonal Models

    7.3 ARCH and GARCH Models

          7.3.1 The ARCH(1) Model

          7.3.2 The ARCH(p) and GARCH(p,q) Processes

          7.3.3 Assessing the Appropriateness of an ARCH/GARCH Fit to a Set of Data

          7.3.4 Fitting ARCH/GARCH Models to Simulated Data

          7.3.5 Modeling Daily Rates of Return Data

    7.4 Concluding Remarks

    Appendix 7

     

    8. Time Series Regression

    8.1 Line+Noise Models

          8.1.1 Testing for Linear Trend

          8.1.2 Fitting Line+Noise Models to Data

          8.1.3 Forecasting using Line+Noise Models

    8.2 Cosine Signal+Noise Models

          8.2.1 Fitting a Cosine Signal+Noise Model to Data

          8.2.2 Forecasting with Cosine Signal+Noise Models

          8.2.3 Deciding Whether to Fit a Cosine Signal+Noise Model to a Set of Data

    8.3 Concluding Remarks

    Appendix 8

     

    9. Model Assessment

    9.1 Residual Analysis

          9.1.1 Checking Residuals for White Noise

          9.1.2 Checking the Residuals for Normality

    9.2 Modeling the Global Temperature

          9.2.1 A Stationary Model

          9.2.2 A Correlation-Based Model with a Unit Root

          9.2.3 Line+Noise Models for the Global Temperature Data

          9.2.4 Holt-Winters and Neural Network Models

    9.3 Comparing Models for the Sunspot Data

          9.3.1 Selecting the Models for Comparison

          9.3.2 Do the Models Whiten the Residuals?

          9.3.3 Do Realizations and Their Characteristics Behave Like the Data?

          9.3.4 Do Forecasts Reflect What Is Known about the Physical Setting?

    9.4 Comprehensive Time Series Analysis

    9.5 Concluding Remarks

    Appendix 9

     

    10. Multivariate Time Series

    10.1 Introduction

    10.2 Multiple Regression with Correlated Errors

          10.2.1 Notation for Multiple Regression with Correlated Errors

          10.2.2 Fitting Multiple Regression Models to Time Series Data

          10.2.3 Cross Correlation

    10.3 Vector Autoregressive (VAR) Models

          10.3.1 The VAR(1), VAR(2), and VAR(p) Models

          10.3.2 Forecasting with VAR(p) Models

    10.4 Relationship Between MLR and VAR Models

    10.5 Comprehensive and Final Example: Los Angeles Cardiac Mortality

    10.6 Concluding Remarks

    Appendix 10

     

    11. Deep Neural Network Based Time Series Models

    11.1 Introduction

    11.2 The Perceptron

    11.3 The Extended Perceptron for Univariate Time Series Data

          11.3.1 A Neural Network Similar to the AR(1)

          11.3.2 A Neural Network Similar to AR(p): Adding More Lags

          11.3.3 A Deeper Neural Network: Adding a Hidden Layer

    11.4 The Extended Perceptron for Multivariate Time Series Data

          11.4.1 Forecasting Melanoma Using Sunspots

          11.4.2 Forecasting Cardiac Mortality Using Temperature and Particulates

    11.5 Ensemble Models

    11.6 Concluding Remarks

    Appendix 11

    Biography

    Wayne Woodward, Bivin Sadler, Stephen Robertson

    "A well-structured text aimed at undergraduates pursuing a data science curriculum, or MBA students. The authors draw upon their vast combined experience in research and teaching to a variety of audiences to present the classical material on ARMA-based Box-Jenkins methodology without assuming a calculus background. Yet, their approach manages to be heuristic, while not sacrificing relevant theoretical detail that enriches understanding. The authors complement this material with chapters on multivariate models, and, refreshingly, a very enlightening discussion on neural networks. The exposition is lucid, well-organized, and copiously illustrated to reinforce comprehension of concepts. The companion R package (tswge) finds a niche in the growing list of time series toolboxes, by providing clean, straightforward functionality on such essentials as spectrum reconstruction and model factor tables to glean the structure of AR and MA polynomials."
    - Alex Trindade, Texas Tech University