
Missing and Modified Data in Nonparametric Estimation

With R Examples, 1st Edition

By Sam Efromovich

Chapman and Hall/CRC

448 pages

Purchasing Options:$ = USD
Hardback: 9781138054882
pub: 2018-03-12
eBook (VitalSource): 9781315166384
pub: 2018-03-12
from $49.98



This book presents a systematic and unified approach for modern nonparametric treatment of missing and modified data via examples of density and hazard rate estimation, nonparametric regression, filtering signals, and time series analysis. All basic types of missing at random and not at random, biasing, truncation, censoring, and measurement errors are discussed, and their treatment is explained. Ten chapters of the book cover basic cases of direct data, biased data, nondestructive and destructive missing, survival data modified by truncation and censoring, missing survival data, stationary and nonstationary time series and processes, and ill-posed modifications.

The coverage is suitable for self-study or a one-semester course for graduate students with a prerequisite of a standard course in introductory probability. Exercises of various levels of difficulty will be helpful for the instructor and self-study.

The book is primarily about practically important small samples. It explains when consistent estimation is possible, why in some cases missing data should be ignored, and why in others it must be taken into account. If missingness or data modification makes consistent estimation impossible, the author explains what action is needed to restore the lost information.

The book contains more than a hundred figures with simulated data that explain virtually every setting, claim, and development. The companion R software package allows the reader to verify, reproduce, and modify every simulation and every estimator used. This makes the material fully transparent and allows one to study it interactively.

Sam Efromovich is the Endowed Professor of Mathematical Sciences and the Head of the Actuarial Program at the University of Texas at Dallas. He is well known for his work on the theory and application of nonparametric curve estimation and is the author of Nonparametric Curve Estimation: Methods, Theory, and Applications. Professor Sam Efromovich is a Fellow of the Institute of Mathematical Statistics and the American Statistical Association.


"Both researchers and practitioners would find this book useful, since currently missing data is a hot topic in statistics. The mathematical level of the book is definitely intermediate, a good amount of references is offered or is planned to be offered in notes, and an R-package will accompany the book. I can see a broad market of potential readers. It is quite typical for statistics departments to offer a graduate course on missing data at various depths. The proposed book allows an instructor to combine the discussion of missing data with presenting topics in density estimation, regression and time series analysis. The same argument applies to the survival analysis part. The proposed book treats both the missed and modified data simultaneously and via the same nonparametric methodology of series estimation, which makes it a convenient choice for a one semester graduate course that covers nonparametric estimation, missing data and survival analysis. Moreover, the proposed companion R-package would allow an instructor to show the power of this statistical software without going into R programming too deeply… Just a new good book on missing data in nonparametric estimation, or a new good book on nonparametric analysis of survival data, or a new good book on time series analysis with missing data would be of a great interest. And here we have a book that combines all these topics and it proposes a unique approach for solving all involved problems." ~Lyudmila Sakhanenko, Michigan State University

"There is a high demand for a book devoted to nonparametric estimation based on missing and modified data. Furthermore, as it is written, the proposed book can easily be used as a text for an intermediate level graduate course in statistics. Its R-package allows to reproduce all the figures (and there are figures in all sections), and a large number of exercises will make its use in a class much more attractive for an instructor… The book is well written, and the context is of interest to a broad spectrum of potential readers."

~Michael Baron, American University

"The book has multiple strengths. Since the same nonparametric series estimator is used, the reader can concentrate on the missingness and modifying mechanisms, whose statistical/probabilistic descriptions are mathematically rigorously introduced. For each setting it is explained how Fourier estimates are constructed. Then the reader can check the performance of the estimator by means of simulations, using different parameters. Exercises are provided at the end of each chapter. Because of these strengths, the book would be an excellent textbook for Master and PhD level graduate courses."

~Ursula Mueller, Texas A&M University

"This book focuses on orthonormal series estimates of densities and curve regressions for missing and modified data. It consists of ten chapters, each accompanied by a collection of mainly theoretical exercises and very informative notes on the literature. In Chapter 1, the author introduces the problems and discusses some basic concepts in probability and statistics. A discussion on statistical software, especially regarding usage of the package provided by the author to reproduce the examples in the book, is also included. The R programs are useful as they can be applied to reproduce the examples, but they are not sufficiently commented.

A brief review of the orthonormal series estimation method in complete data cases is provided in Chapter 2. This review is very helpful in refreshing and introducing the notation. Although the author claims that the book is self-contained, and is suitable for graduate students with a standard course in introductory probability, I would recommend that the reader who is not familiar with the method first consult the author's previous book, Nonparametric curve estimation [Springer Ser. Statist., Springer, New York, 1999; MR1705298]. A general discussion of the estimation for basic models for biased data is provided in Chapter 3. In Chapter 4, the author discusses density estimation and curve regression in various missing at random (MAR) situations. In Chapter 5, he talks about missing not at random (MNAR) cases. He shows that in general MNAR cases, consistent estimates cannot be obtained. Rather, consistent estimators are obtained in some special cases where additional assumptions essentially make the missing data MAR. Chapters 6 and 7 deal with survival data. Chapter 6 discusses the estimation of hazard rates and distribution functions for survival data where the time to event data are modified with right censoring and left truncation, and Chapter 7 further studies the estimation of survival data when there are missing values, in particular missingness in the censoring indicator and predictors. Chapters 8 and 9 discuss time series with modified and missing data, both stationary and non-stationary. Chapter 10 deals with various ill-posed modifications including measurement errors in density estimation and curve regression, density deconvolution for missing data and censored data, and estimation of derivatives.

This book is devoted to orthonormal series estimates, and thus other popular non-parametric methods such as kernel methods are not discussed. In this respect, the title of the book may be a little bit misleading. Additionally, although the name is not explicitly mentioned, the book mainly follows the inverse probability weighting approach. It would be nice if other approaches such as multiple imputation were discussed and compared. Nevertheless, this is the first book that comprehensively covers orthonormal series estimates of densities and curve regressions for missing and modified data, and it should be valuable to anybody who is interested in the field."

~Wan Tang - Mathematical Reviews Clippings - December 2018
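
For readers unfamiliar with the two ideas named in the review above (orthonormal series estimation and inverse probability weighting), the following is a minimal illustrative sketch in Python. It is not taken from the book or its companion R package. The function names, the cosine basis on [0, 1], the fixed number of terms, and the known availability function `availability` are all assumptions made for illustration only.

```python
import math
import random


def cosine_basis(j, x):
    """j-th cosine basis function on [0, 1]: 1 for j = 0, sqrt(2)*cos(pi*j*x) otherwise."""
    return 1.0 if j == 0 else math.sqrt(2.0) * math.cos(math.pi * j * x)


def ipw_series_density(observed, availability, n, n_terms=6):
    """Series density estimate with inverse-probability-weighted Fourier coefficients.

    observed     -- the values that were not lost
    availability -- hypothetical known function x -> P(value is observed | X = x), positive
    n            -- total sample size, including the lost values
    """
    coeffs = [
        sum(cosine_basis(j, x) / availability(x) for x in observed) / n
        for j in range(n_terms)
    ]

    def density(x):
        return sum(c * cosine_basis(j, x) for j, c in enumerate(coeffs))

    return density, coeffs


def w(x):
    """Hypothetical availability likelihood: larger values are observed more often."""
    return 0.3 + 0.6 * x


random.seed(0)
n = 20000
sample = [random.betavariate(2, 2) for _ in range(n)]  # true density is 6x(1 - x)
observed = [x for x in sample if random.random() < w(x)]

# Weighted estimate that accounts for the missing mechanism.
f_hat, theta = ipw_series_density(observed, w, n)

# Naive estimate that treats the observed values as a complete sample.
naive, _ = ipw_series_density(observed, lambda x: 1.0, len(observed))

print(f_hat(0.8), naive(0.8), 6 * 0.8 * 0.2)
```

In this simulated setting the naive estimate targets a density tilted toward larger values, because those are observed more often, while the weighted estimate targets the underlying density. A practical estimator would also choose the number of terms from the data and handle an unknown availability function, neither of which this sketch attempts.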

Table of Contents

Introduction. Estimation for Directly Observed Data. Estimation for Basic Models of Modified Data. Nondestructive Missing. Destructive Missing. Survival Analysis. Missing Data in Survival Analysis. Time Series Analysis. Ill-Posed Modifications.

About the Series

Chapman & Hall/CRC Monographs on Statistics and Applied Probability


Subject Categories

BISAC Subject Codes/Headings:
MATHEMATICS / Probability & Statistics / General