1st Edition

The Energy of Data and Distance Correlation

By Gábor J. Székely, Maria L. Rizzo
Copyright 2023
466 Pages 7 Color & 24 B/W Illustrations
by Chapman & Hall

Energy distance is a statistical distance between the distributions of random vectors that characterizes equality of distributions. The name "energy" derives from Newton's gravitational potential energy, and there is an elegant relation to the notion of potential energy between statistical observations. Energy statistics (E-statistics) are functions of distances between statistical observations in metric spaces. The authors hope this book will spark the interest of the many statisticians who have not yet explored E-statistics and would like to apply these new methods using R. The Energy of Data and Distance Correlation is intended for teachers and students looking for dedicated material on energy statistics, but it can also serve as a supplement to a wide range of courses and areas, such as Monte Carlo methods, U-statistics or V-statistics, measures of multivariate dependence, goodness-of-fit tests, nonparametric methods and distance-based methods.

• E-statistics provide powerful methods for problems in multivariate inference and analysis.

• Methods are implemented in R, and readers can immediately apply them using the freely available energy package for R.

• The book provides an overview of the state of the art in the development of energy statistics and an overview of applications.

• The background and literature review are valuable for anyone considering further research or application in energy statistics.

Part 1: The Energy of Data
1. Introduction
2. Preliminaries
3. Energy Distance
4. Introduction to Energy Inference
5. Goodness-of-Fit
6. Testing Multivariate Normality
7. Eigenvalues for One-Sample E-Statistics
8. Generalized Goodness-of-Fit
9. Multi-sample Energy Statistics
10. Energy in Metric Spaces and Other Distances

Part 2: Distance Correlation and Dependence
11. On Correlation and Other Measures of Association
12. Distance Correlation
13. Testing Independence
14. Applications and Extensions
15. Brownian Distance Covariance
16. U-statistics and Unbiased dCov²
17. Partial Distance Correlation
18. The Numerical Value of dCor
19. The dCor t-test of Independence in High Dimension
20. Computational Algorithms
21. Time Series and Distance Correlation
22. Axioms of Dependence Measures
23. Earth Mover's Correlation
24. Appendix A: Historical Background
25. Appendix B: Prehistory

Biography

Gábor J. Székely graduated from Eötvös Loránd University, Budapest, Hungary (ELTE) with an MS in 1970 and a Ph.D. in 1971. He joined the Department of Probability Theory of ELTE in 1970. In 1989 he became the founding chair of the Department of Stochastics of the Budapest Institute of Technology (Technical University of Budapest). In 1995 Székely moved to the US. Before that, in 1990-91, he was the first distinguished Lukacs Professor at Bowling Green State University, Ohio. Székely held several visiting positions, e.g., at the University of Amsterdam in 1976 and at Yale University in 1989. Between 1985 and 1995 he was the first Hungarian director of Budapest Semesters in Mathematics. From 2006 until his retirement in 2022, he was program director of statistics at the National Science Foundation (USA). Székely has almost 250 publications, including six books in several languages. In 1988 he received the Rollo Davidson Prize from Cambridge University, jointly with Imre Z. Ruzsa, for their work on algebraic probability theory. In 2010 Székely became an elected fellow of the Institute of Mathematical Statistics for his seminal work on physics concepts in statistics, such as energy statistics and distance correlation. Székely has been an invited speaker at several Joint Statistical Meetings and an organizer of invited sessions on energy statistics and distance correlation. Székely has two children, Szilvia and Tamás, and six grandchildren, Elisa, Anna, Michaël, Lea, Eszter and Avi, who live in Brussels, Belgium and Basel, Switzerland. Székely and his wife, Judit, live in McLean, Virginia and Budapest, Hungary.

Maria L. Rizzo is Professor in the Department of Mathematics and Statistics at Bowling Green State University in Bowling Green, Ohio, where she teaches statistics, actuarial science, computational statistics, statistical programming and data science. Prior to joining the faculty at BGSU in 2006, she was a faculty member of the Department of Mathematics at Ohio University in Athens, Ohio. Her main research area is energy statistics and distance correlation. She is the software developer and maintainer of the energy package for R, and the author of textbooks on statistical computing: "Statistical Computing with R" (1st and 2nd editions), "R by Example" (2nd edition in progress) with Jim Albert, and a forthcoming textbook on data science. Dr. Rizzo has had eight PhD students and has one current student, almost all with dissertations on energy statistics. Outside of work she enjoys spending time with her family, including her husband, daughters, grandchildren and a large extended family.

"Many dozens of theorems are proved, various R codes with numerical examples are provided, and multiple exercises are given in each chapter... The book and corresponding software can be useful for instructors and students in advanced statistical courses, and for researchers and practitioners in data analysis."

Stan Lipovetsky, Ipsos, Technometrics, 22nd August 2023.