# Introduction to Functional Data Analysis

## Book Description

Introduction to Functional Data Analysis provides a concise textbook introduction to the field. It explains how to analyze functional data, both at exploratory and inferential levels. It also provides a systematic and accessible exposition of the methodology and the required mathematical framework.

The book can be used as a textbook for a semester-long course on FDA for advanced undergraduate or MS statistics majors, as well as for MS and PhD students in other disciplines, including applied mathematics, environmental science, public health, medical research, geophysical sciences and economics. It can also be used for self-study and as a reference for researchers in those fields who wish to acquire a solid understanding of FDA methodology and practical guidance for its implementation. Each chapter contains plentiful examples of relevant R code and theoretical and data analytic problems.

The material of the book can be roughly divided into four parts of approximately equal length: 1) basic concepts and techniques of FDA, 2) functional regression models, 3) sparse and dependent functional data, and 4) introduction to the Hilbert space framework of FDA. The book assumes an advanced undergraduate background in calculus, linear algebra, distributional probability theory, foundations of statistical inference, and some familiarity with R programming. Other required statistics background is provided in scalar settings before the related functional concepts are developed. Most chapters end with references to more advanced research for those who wish to gain a more in-depth understanding of a specific topic.

## Table of Contents

**First steps in the analysis of functional data**

Basis expansions

Sample mean and covariance

Principal component functions

Analysis of BOA stock returns

Diffusion tensor imaging

Problems

**Further topics in exploratory FDA**

Derivatives

Penalized smoothing

Curve alignment

Further reading

Problems

**Mathematical framework for functional data**

Square integrable functions

Random functions

Linear transformations

**Scalar-on-function regression**

Examples

Review of standard regression theory

Difficulties specific to functional regression

Estimation through a basis expansion

Estimation with a roughness penalty

Regression on functional principal components

Implementation in the refund package

Nonlinear scalar-on-function regression

Problems

**Functional response models**

Least squares estimation and application to angular motion

Penalized least squares estimation

Functional regressors

Penalized estimation in the refund package

Estimation based on functional principal components

Test of no effect

Verification of the validity of a functional linear model

Extensions and further reading

Problems

**Functional generalized linear models**

Background

Scalar-on-function GLM's

Functional response GLM

Implementation in the refund package

Application to DTI

Further reading

Problems

**Sparse FDA**

Introduction

Mean function estimation

Covariance function estimation

Sparse functional PCA

Sparse functional regression

Problems

**Functional time series**

Fundamental concepts of time series analysis

Functional autoregressive process

Forecasting with the Hyndman-Ullah method

Forecasting with multivariate predictors

Long-run covariance function

Testing stationarity of functional time series

Generation and estimation of the FAR(1) model using package fda

Conditions for the existence of the FAR(1) process

Further reading and other topics

Problems

**Spatial functional data and models**

Fundamental concepts of spatial statistics

Functional spatial fields

Functional kriging

Mean function estimation

Implementation in the R package geofd

Other topics and further reading

Problems

**Elements of Hilbert space theory**

Hilbert space

Projections and orthonormal sets

Linear operators

Basics of spectral theory

Tensors

Problems

**Random functions**

Random elements in metric spaces

Expectation and covariance in a Hilbert space

Gaussian functions and limit theorems

Functional principal components

Problems

**Inference from a random sample**

Consistency of sample mean and covariance functions

Estimated functional principal components

Asymptotic normality

Hypothesis testing about the mean

Confidence bands for the mean

Application to BOA cumulative returns

Proof of Theorem

Problems

## Author(s)

### Biography

**Piotr Kokoszka** is a professor of statistics at Colorado State University. His research interests include functional data analysis, with emphasis on dependent data structures, and applications to geosciences and finance. He is a coauthor of the monograph *Inference for Functional Data with Applications* (with L. Horváth). He is an associate editor of several journals, including *Computational Statistics and Data Analysis*, *Journal of Multivariate Analysis*, *Journal of Time Series Analysis*, and *Scandinavian Journal of Statistics*.

**Matthew Reimherr** is an assistant professor of statistics at Pennsylvania State University. His research interests include functional data analysis, with emphasis on longitudinal studies and applications to genetics and public health. He is an associate editor of *Statistical Modeling*.

## Reviews

"This well-written book provides a great and intuitive introduction to functional data analysis (FDA) which has emerged as an important area in statistics and found tons of scientific applications...This book succeeds at introducing this novel statistical concept and methodology while keeps the level of mathematical and statistical sophistication required to understand at the level of an introductory graduate-level course, which makes for pleasant reading. A nice feature of the book is its strong focus on implementation using R, which makes it a great candidate of textbooks or reference books for (master-level) graduate students and applied researchers...Some unique features of this book as compared to existing ones include (1) its strong focus on implementation using R; (2) chapters on Sparse FDA, generalized functional linear models, functional time series, and spatial functional data; (3) well-designed exercises that can be used as homework problems."

~Xianyang Zhang, Texas A&M University

"The main advantage of the book is its emphasis introducing the material through realistic examples and computational tools, while also providing mathematical guidance for the methodologies. Also, important topics like functional time series and spatial functional data are not adequately covered in comparable texts like Ramsay and Silverman, Ramsay and Hooker, Ferraty and Vieu, and Hsing and Eubank. In that respect, the book offers additional and practically relevant material and perspective."

~Debashis Paul, University of California, Davis

"The classic tools from the field of functional data analysis are introduced comprehensively and immediately put into a framework of potential application. I would probably advise any reader that is new to functional data analysis to start by reading this book."

~Claudia Klüppelberg, Technische Universität München

"Being more advanced and up to date than the Ramsay and Silverman, it complements various topics that are just briefly mentioned or not covered at all by Ramsay and Silverman."

~Laura Sangalli, Politecnico di Milano

"As a relatively young subfield of statistics, functional data analysis (FDA) has not had a large glut of textbooks pertaining to it. The most famous of the FDA books is the classic text by J. O. Ramsay and B. W. Silverman [Functional data analysis, Springer Ser. Statist., Springer, New York, 1997; second edition, 2005; MR2168993], which introduced many statisticians to the area. Ramsay and Silverman [Applied functional data analysis, Springer Ser. Statist., Springer, New York, 2002; MR1910407] provided a useful collection of FDA case studies, and Ramsay, G. Hooker and S. Graves [Functional data analysis with R and MATLAB, Use R, Springer, New York, 2009, doi:10.1007/978-0-387-98185-7] presented R and MATLAB code for analyzing real functional data sets. [F. Ferraty and P. Vieu, Nonparametric functional data analysis, Springer Ser. Statist., Springer, New York, 2006; MR2229687] and [T. Hsing and R. L. Eubank, Theoretical foundations of functional data analysis, with an introduction to linear operators, Wiley Ser. Probab. Stat., Wiley, Chichester, 2015; MR3379106] are well-respected theoretical presentations of FDA.

This book by Kokoszka and Reimherr provides a nice mix of foundational material, accessible theory, and practical examples (including much R code). It is a valuable addition to the FDA literature, and is perhaps an ideal choice of a course textbook for either an undergraduate or graduate course in FDA, whereas several of the other textbooks are more valuable as references for researchers and practitioners than as tutorials for learners. At the end of each chapter is a nice variety of problems that instructors could use for homework assignments.

Chapter 1 introduces basic terminology related to FDA, such as the ubiquitous tool of basis expansion and the distinction between dense and sparse functional data. Summary statistics and plots (sample mean and covariance functions, principal components analysis (PCA), functional boxplots) for FDA are briefly presented. Chapter 2 continues basic FDA topics with a discussion of derivative information, penalized smoothing, and alignment/registration of curves.

The theoretical underpinnings of FDA are presented quickly in Chapter 3, where topics such as square integrable functions, random functions following some distribution, and operator theory are defined briefly. A fuller coverage of theoretical concerns is saved for (the optional in a course setting) Chapters 10 and 11. The heart of the book is Chapters 4 through 9, which cover functional linear models in detail, before moving on to specialized FDA topics such as sparse FDA, functional time series, and spatial functional data.

Scalar-on-function regression, in which the response is a scalar and the predictor is a function, is treated in Chapter 4, and illustrated via the use of the refund package in R. Nonlinear scalar-on-function regression is briefly mentioned. Chapter 5 covers both the function-on-scalar regression case and the fully functional regression model in which both response and predictor are functions. Testing and validation of the functional linear model are also shown. Chapter 6 covers functional generalized linear models (GLMs) which have a nonnormal scalar response and a functional predictor. The somewhat nebulous situation with functional-response GLMs is briefly covered as well.

The next chapter deals with sparse functional data, and presents methods for mean function estimation, covariance function estimation, PCA, and regression in the sparse case when relatively few points are measured for each observed curve. Functional time series occur when the sample functions are observed sequentially over time rather than cross-sectionally. The assumption of independent functional data fails in this case, and Chapter 8 presents a functional autoregressive model for such data that can be used for forecasting. Spatial functional data may commonly be encountered in geostatistics when curves are observed both over time and at various spatial locations. Chapter 9 discusses models for such data and prediction using functional kriging.

Chapter 12 discusses treating a functional data set as a sample from some population of functions and performing inference on the population. Of particular interest are the methods presented for formal hypothesis tests and confidence bands about the population mean function.

Clustering and classification of functional data are not discussed in detail in this book, nor is FDA on manifolds, although references are given to guide readers to recent research in these areas."

~David Benner Hitchcock