1st Edition

# Time Series for Data Science: Analysis and Forecasting

528 Pages 272 B/W Illustrations
by Chapman & Hall



Data science students and practitioners want forecasts that work, and they do not want to be constrained to a single forecasting strategy. Time Series for Data Science: Analysis and Forecasting therefore discusses techniques of ensemble modelling for combining information from several strategies. It covers time series regression models, exponential smoothing, Holt-Winters forecasting, and neural networks, and it places a particular emphasis on the classical ARMA and ARIMA models that is often lacking from other textbooks on the subject.
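To make the ensemble idea concrete: the simplest way to combine several forecasting strategies is to average their forecasts. The sketch below is illustrative only and is not taken from the book (which works in R with the tswge package); the series values and the smoothing constant `alpha` are arbitrary choices for the demonstration.

```python
# Toy ensemble forecast: average the one-step-ahead forecasts of two
# simple strategies (exponential smoothing and a naive forecast).

def ses_forecast(series, alpha=0.3):
    """One-step-ahead forecast via simple exponential smoothing."""
    level = series[0]
    for x in series[1:]:
        # New level is a weighted average of the newest observation
        # and the previous level.
        level = alpha * x + (1 - alpha) * level
    return level

def naive_forecast(series):
    """One-step-ahead naive forecast: repeat the last observation."""
    return series[-1]

def ensemble_forecast(series):
    """Combine strategies by averaging their individual forecasts."""
    forecasts = [ses_forecast(series), naive_forecast(series)]
    return sum(forecasts) / len(forecasts)

data = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
print(round(ensemble_forecast(data), 2))
```

In practice an ensemble can weight the component forecasts by their historical accuracy rather than equally; the equal-weight average is just the most transparent starting point.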

This book is an accessible guide that does not require a background in calculus, yet it does not shy away from deeper explanations of the techniques discussed.

Features:

• Provides thorough coverage and comparison of a wide array of time series models and methods: exponential smoothing, Holt-Winters, ARMA and ARIMA, deep learning models (including RNNs, LSTMs, and GRUs), and ensemble models composed of combinations of these models.
• Introduces the factor table representation of ARMA and ARIMA models. This representation is not available in any other book at this level and is extremely useful in both practice and pedagogy.
• Uses real-world examples that can be readily found via web links from sources such as the US Bureau of Statistics, the Department of Transportation, and the World Bank.
• There is an accompanying R package that is easy to use and requires little or no previous R experience. The package implements the wide variety of models and methods presented in the book and has tremendous pedagogical use.

1. Working with Data Collected Over Time

1.1 Introduction

1.2 Time Series Datasets

1.2.1 Cyclic Data

1.2.2 Trends

1.3 The Programming Language R

1.3.1 The tswge Time Series Package

1.3.2 Base R

1.3.3 Plotting Time Series Data in R

1.3.4 The ts object

1.3.5 The plotts.wge function in tswge

1.3.7 Accessing Time Series Data

1.4 Dealing with Messy Data

1.4.1 Preparing Time Series Data for Analysis: Cleaning, Wrangling, and Imputation

1.5 Concluding Remarks

Appendix 1

2. Exploring Time Series Data

2.1 Understanding and Visualizing Data

2.1.1 Smoothing Time Series Data

2.2 Forecasting

2.2.1 Predictive Moving Average Smoother

2.2.2 Exponential Smoothing

2.2.3 Holt-Winters Forecasting

2.2.4 Assessing the Accuracy of Forecasts

2.3 Concluding Remarks

Appendix 2

3. Statistical Basics for Time Series Analysis

3.1 Statistics Basics

3.1.1 Univariate Data

3.1.2 Multivariate Data

3.1.3 Independent vs Dependent Data

3.2 Time Series and Realizations

3.2.1 Multiple Realizations

3.2.2 The Effect of Realization Length

3.3 Stationary Time Series

3.3.1 Plotting the Autocorrelations of a Stationary Process

3.3.2 Estimating the Parameters of a Stationary Process

3.4 Concluding Remarks

Appendix 3

4. The Frequency Domain

4.1 Trigonometric Review and Terminology

4.2 The Spectral Density

4.2.1 Euler’s Formula

4.2.2 Definition and Properties of the Spectrum and Spectral Density

4.2.3 Estimating the Spectral Density

4.3 Smoothing and Filtering

4.3.1 Types of Filters

4.3.2 The Butterworth Filter

4.4 Concluding Remarks

Appendix 4

5. ARMA Models

5.1 The Autoregressive Model

5.1.1 The AR(1) Model

5.1.2 The AR(2) Model

5.1.3 The AR(p) Model

5.1.4 Linear Filters, the General Linear Process, and AR(p) Models

5.2 Autoregressive-Moving Average (ARMA) Models

5.2.1 Moving Average Models

5.2.2 ARMA(p,q) Models

5.3 Concluding Remarks

Appendix 5

6. ARMA Fitting and Forecasting

6.1 Fitting ARMA Models to Data

6.1.1 Estimating the Parameters of an ARMA(p,q) Model

6.1.2 ARMA Model Identification

6.2 Forecasting using an ARMA(p,q) Model

6.2.1 Forecasting Setting, Notation, and Strategy

6.2.2 Forecasting using an AR(p) Model

6.2.3 Basic Forecasting Formula using an ARMA(p,q) Model

6.2.4 Eventual Forecast Function

6.2.5 Probability Limits for ARMA Forecasts

6.2.6 Assessing Forecast Performance

6.3 Concluding Remarks

Appendix 6

7. ARIMA, Seasonal, and ARCH/GARCH Models

7.1 ARIMA(p,d,q) Models

7.1.1 Properties of ARIMA(p,d,q) Models

7.1.2 Model Identification and Parameter Estimation of ARIMA(p,d,q) Models

7.1.3 Forecasting with ARIMA Models

7.2 Seasonal Models

7.2.1 Properties of Seasonal Models

7.2.2 Fitting Seasonal Models to Data

7.2.3 Forecasting using Seasonal Models

7.3 ARCH and GARCH Models

7.3.1 The ARCH(1) Model

7.3.2 The ARCH(p) and GARCH(p,q) Processes

7.3.3 Assessing the Appropriateness of an ARCH/GARCH Fit to a Set of Data

7.3.4 Fitting ARCH/GARCH Models to Simulated Data

7.3.5 Modeling Daily Rates of Return Data

7.4 Concluding Remarks

Appendix 7

8. Time Series Regression

8.1 Line+Noise Models

8.1.1 Testing for Linear Trend

8.1.2 Fitting Line+Noise Models to Data

8.1.3 Forecasting using Line+Noise Models

8.2 Cosine Signal+Noise Models

8.2.1 Fitting a Cosine Signal+Noise Model to Data

8.2.2 Forecasting with Cosine Signal+Noise Models

8.2.3 Deciding Whether to Fit a Cosine Signal+Noise Model to a Set of Data

8.3 Concluding Remarks

Appendix 8

9. Model Assessment

9.1 Residual Analysis

9.1.1 Checking Residuals for White Noise

9.1.2 Checking the Residuals for Normality

9.2 Modeling the Global Temperature

9.2.1 A Stationary Model

9.2.2 A Correlation-Based Model with a Unit Root

9.2.3 Line+Noise Models for the Global Temperature Data

9.2.4 Holt-Winters and Neural Networks

9.3 Comparing Models for the Sunspot Data

9.3.1 Selecting the Models for Comparison

9.3.2 Do the Models Whiten the Residuals?

9.3.3 Do Realizations and Their Characteristics Behave Like the Data?

9.3.4 Do Forecasts Reflect What Is Known about the Physical Setting?

9.4 Comprehensive Time Series Analysis

9.5 Concluding Remarks

Appendix 9

10. Multivariate Time Series

10.1 Introduction

10.2 Multiple Regression with Correlated Errors

10.2.1 Notation for Multiple Regression with Correlated Errors

10.2.2 Fitting Multiple Regression Models to Time Series Data

10.2.3 Cross Correlation

10.3 Vector Autoregressive (VAR) Models

10.3.1 The VAR(1), VAR(2) and VAR(p) Models

10.3.2 Forecasting with VAR(p) Models

10.4 Relationship Between MLR and VAR Models

10.5 Comprehensive and Final Example: Los Angeles Cardiac Mortality

10.6 Concluding Remarks

Appendix 10

11. Deep Neural Network-Based Time Series Models

11.1 Introduction

11.2 The Perceptron

11.3 The Extended Perceptron for Univariate Time Series Data

11.3.1 A Neural Network Similar to the AR(1)

11.3.2 A Neural Network Similar to AR(p): Adding More Lags

11.3.3 A Deeper Neural Network: Adding a Hidden Layer

11.4 The Extended Perceptron for Multivariate Time Series Data

11.4.1 Forecasting Melanoma using Sunspots

11.4.2 Forecasting Cardiac Mortality Using Temperature and Particulates

11.5 Ensemble Models

11.6 Concluding Remarks

Appendix 11

### Biography

Wayne Woodward, Bivin Sadler, Stephen Robertson

"A well-structured text aimed at undergraduates pursuing a data science curriculum, or MBA students. The authors draw upon their vast combined experience in research and teaching to a variety of audiences to present the classical material on ARMA-based Box-Jenkins methodology without assuming a calculus background. Yet, their approach manages to be heuristic, while not sacrificing relevant theoretical detail that enriches understanding. The authors complement this material with chapters on multivariate models, and, refreshingly, a very enlightening discussion on neural networks. The exposition is lucid, well-organized, and copiously illustrated to reinforce comprehension of concepts. The companion R package (tswge) finds a niche in the growing list of time series toolboxes, by providing clean, straightforward functionality on such essentials as spectrum reconstruction and model factor tables to glean the structure of AR and MA polynomials."
- Alex Trindade, Texas Tech University