1st Edition

# Understanding Advanced Statistical Methods

Providing a much-needed bridge between elementary statistics courses and advanced research methods courses, **Understanding Advanced Statistical Methods** helps students grasp the fundamental assumptions and machinery behind sophisticated statistical topics, such as logistic regression, maximum likelihood, bootstrapping, nonparametrics, and Bayesian methods. The book teaches students how to properly model, think critically, and design their own studies to avoid common errors. It leads them to think differently not only about math and statistics but also about general research and the scientific method.

With a focus on statistical models as *producers* of data, the book enables students to more easily understand the machinery of advanced statistics. It also downplays the "population" interpretation of statistical models and presents Bayesian methods before frequentist ones. Requiring no prior calculus experience, the text employs a "just-in-time" approach that introduces mathematical topics, including calculus, where needed. Formulas throughout the text are used to explain why calculus and probability are essential in statistical modeling. The authors also intuitively explain the theory and logic behind real data analysis, incorporating a range of application examples from the social, economic, biological, medical, physical, and engineering sciences.

Enabling your students to answer the *why* behind statistical methods, this text teaches them how to successfully draw conclusions when the premises are flawed. It empowers them to use advanced statistical methods with confidence and develop their own statistical recipes. Ancillary materials are available on the book’s website.

**Introduction: Probability, Statistics, and Science**

Reality, Nature, Science, and Models

Statistical Processes: Nature, Design and Measurement, and Data

Models

Deterministic Models

Variability

Parameters

Purely Probabilistic Statistical Models

Statistical Models with Both Deterministic and Probabilistic Components

Statistical Inference

Good and Bad Models

Uses of Probability Models

**Random Variables and Their Probability Distributions**

Introduction

Types of Random Variables: Nominal, Ordinal, and Continuous

Discrete Probability Distribution Functions

Continuous Probability Distribution Functions

Some Calculus–Derivatives and Least Squares

More Calculus–Integrals and Cumulative Distribution Functions

**Probability Calculation and Simulation**

Introduction

Analytic Calculations, Discrete and Continuous Cases

Simulation-Based Approximation

Generating Random Numbers

**Identifying Distributions**

Introduction

Identifying Distributions from Theory Alone

Using Data: Estimating Distributions via the Histogram

Quantiles: Theoretical and Data-Based Estimates

Using Data: Comparing Distributions via the Quantile–Quantile Plot

Effect of Randomness on Histograms and *q*–*q* Plots

**Conditional Distributions and Independence**

Introduction

Conditional Discrete Distributions

Estimating Conditional Discrete Distributions

Conditional Continuous Distributions

Estimating Conditional Continuous Distributions

Independence

**Marginal Distributions, Joint Distributions, Independence, and Bayes’ Theorem**

Introduction

Joint and Marginal Distributions

Estimating and Visualizing Joint Distributions

Conditional Distributions from Joint Distributions

Joint Distributions When Variables Are Independent

Bayes’ Theorem

**Sampling from Populations and Processes**

Introduction

Sampling from Populations

Critique of the Population Interpretation of Probability Models

The Process Model versus the Population Model

Independent and Identically Distributed Random Variables and Other Models

Checking the iid Assumption

**Expected Value and the Law of Large Numbers**

Introduction

Discrete Case

Continuous Case

Law of Large Numbers

Law of Large Numbers for the Bernoulli Distribution

Keeping the Terminology Straight: Mean, Average, Sample Mean, Sample Average, and Expected Value

Bootstrap Distribution and the Plug-In Principle

**Functions of Random Variables: Their Distributions and Expected Values**

Introduction

Distributions of Functions: The Discrete Case

Distributions of Functions: The Continuous Case

Expected Values of Functions and the Law of the Unconscious Statistician

Linearity and Additivity Properties

Nonlinear Functions and Jensen’s Inequality

Variance

Standard Deviation, Mean Absolute Deviation, and Chebyshev’s Inequality

Linearity Property of Variance

Skewness and Kurtosis

**Distributions of Totals**

Introduction

Additivity Property of Variance

Covariance and Correlation

Central Limit Theorem

**Estimation: Unbiasedness, Consistency, and Efficiency**

Introduction

Biased and Unbiased Estimators

Bias of the Plug-In Estimator of Variance

Removing the Bias of the Plug-In Estimator of Variance

The Joke Is on Us: The Standard Deviation Estimator Is Biased after All

Consistency of Estimators

Efficiency of Estimators

**Likelihood Function and Maximum Likelihood Estimates**

Introduction

Likelihood Function

Maximum Likelihood Estimates

Wald Standard Error

**Bayesian Statistics**

Introduction: Play a Game with Hans!

Prior Information and Posterior Knowledge

Case of the Unknown Survey

Bayesian Statistics: The Overview

Bayesian Analysis of the Bernoulli Parameter

Bayesian Analysis Using Simulation

What Good Is Bayes?

**Frequentist Statistical Methods**

Introduction

Large-Sample Approximate Frequentist Confidence Interval for the Process Mean

What Does *Approximate* Really Mean for an Interval Range?

Comparing the Bayesian and Frequentist Paradigms

**Are Your Results Explainable by Chance Alone?**

Introduction

What Does *by Chance Alone* Mean?

The *p*-Value

The Extremely Ugly "*pv* ≤ 0.05" Rule of Thumb

**Chi-Squared, Student’s t, and F-Distributions, with Applications**

Introduction

Linearity and Additivity Properties of the Normal Distribution

Effect of Using an Estimate of *σ*

Chi-Squared Distribution

Frequentist Confidence Interval for *σ*

Student’s *t*-Distribution

Comparing Two Independent Samples Using a Confidence Interval

Comparing Two Independent Homoscedastic Normal Samples via Hypothesis Testing

*F*-Distribution and ANOVA Test

*F*-Distribution and Comparing Variances of Two Independent Groups

**Likelihood Ratio Tests**

Introduction

Likelihood Ratio Method for Constructing Test Statistics

Evaluating the Statistical Significance of Likelihood Ratio Test Statistics

Likelihood Ratio Goodness-of-Fit Tests

Cross-Classification Frequency Tables and Tests of Independence

Comparing Non-Nested Models via the AIC Statistic

**Sample Size and Power**

Introduction

Choosing a Sample Size for a Prespecified Accuracy Margin

Power

Noncentral Distributions

Choosing a Sample Size for Prespecified Power

Post Hoc Power: A Useless Statistic

**Robustness and Nonparametric Methods**

Introduction

Nonparametric Tests Based on the Rank Transformation

Randomization Tests

Level and Power Robustness

Bootstrap Percentile-*t* Confidence Interval

Final Words

Index

*Vocabulary, Formula Summaries, and Exercises appear at the end of each chapter.*

### Biography

**Peter H. Westfall** is the Paul Whitfield Horn Professor of Statistics and James Niver Professor of Information Systems and Quantitative Sciences at Texas Tech University. A Fellow of the ASA and the AAAS, Dr. Westfall has published several books and over 100 papers on statistical theory and methods. He also has won several teaching awards and is the former editor of *The American Statistician*. He earned a PhD in statistics from the University of California, Davis.

**Kevin S.S. Henning** is a clinical assistant professor of business analysis in the Department of Economics and International Business at Sam Houston State University, where he teaches business statistics and forecasting. He earned a PhD in business statistics from Texas Tech University.

"This nicely written textbook fills the gap between elementary statistics courses and more advanced research methods courses. The book helps one to grasp the key assumptions and machinery behind advanced statistical topics … Each chapter ends with useful exercises."

—Mathematical Reviews, August 2014

"… full of interesting insights and excellent examples and explanations for essential basic statistical concepts. The use of thought experiments; the detailed algebraic developments of proofs; and the explanations of frequentist and Bayesian statistics, confidence intervals, hypothesis testing, and so on, are all first rate. … a solid teaching resource."

—Australian & New Zealand Journal of Statistics, 2014

"… useful as a prerequisite for advanced study of statistical analysis, such as regression, experimental design, survival analysis, and categorical data analysis … examples in this book seem very useful and may help expand the view of newcomers to statistics."

—Biometrics, June 2014

"This book contains just as many formulas as other statistics texts, but with intuitive, engaging, insightful, and irreverent explanations … the authors strive mightily to part the curtain that hides the fundamentals of statistical thinking from most students. … The book has 20 chapters that cover the usual topics, and more, in an undergraduate/graduate math stat text; it is suitable for a fast-paced semester course offered to serious students. The ‘and more’ refers to the strong emphasis throughout the book on thoughtful applications in a wide variety of disciplines. … The coverage of mathematical statistics is extensive and benefits from a substantial effort by the authors to explain the intuition motivating the procedures and the correct interpretation of specific results. … A companion Web site has a wealth of material useful for the instructor and students. … the text represents a successful effort by the authors to advance and improve the statistics education paradigm for courses offered to upper-level undergraduate and graduate students."

—The American Statistician, May 2014

"There is a gap between elementary statistics courses and advanced research techniques. This gap is reflected by difficulties in linking statistical theory with its application in the real world. This book is an ideal way to overcome this problem. …

The main advantage of this book is the possibility to achieve advanced research skills. The theory behind data analysis is well explained, using plenty of real examples from social, economic, medical, physical and engineering sciences. The theory and application are well balanced and very well linked. All examples are illustrated in MS Excel.

This book helps to teach students to explore statistics more deeply, avoiding the typical trap of students learning little about the applications of what they are studying and why they are doing it. I think this book will be very useful in the sense that students will be forced to think differently about things, not only about math and statistics, but also about research and the scientific method.

The reviewer enjoyed reading the book and it is worth emphasising its usefulness for teachers, students and researchers."

—Božidar V. Popović, Journal of Applied Statistics, 2014

"The book covers the content of a typical undergraduate math stat text, but with much more thought to application than a typical text. It appears to be close to Rice’s text (*Mathematical Statistics and Data Analysis*) in spirit and level, but perhaps comes closer to that spirit than Rice’s. It would be worth considering for a course using Rice. I also recommend it as a reference for anyone teaching applied statistics."

—Martha K. Smith, Professor Emerita of Mathematics, University of Texas at Austin

"I work with scientists who are pioneers in their fields and their ignorance of statistical concepts never ceases to amaze me. I believe most of this can be traced to the way we teach statistics to non-statisticians: as a bag of tools rather than a systematic way to think about data collection and analysis. This book is unique in the way it approaches this topic. It does not subscribe to the cookbook template of teaching statistics but focuses instead on understanding the distinction between the observed data and the mechanisms that generated it. This focus allows a better distinction between models, parameters, and estimates and should help pave a way to instill statistical thinking to undergraduate students."

—Mithat Gönen, Memorial Sloan-Kettering Cancer Center

"*Understanding Advanced Statistical Methods* is an excellent source for the curious student. The book introduces a novel approach to learning statistics by providing comprehensive coverage of concepts in a captivating framework. Students are not only encouraged to understand the intuition and structure behind the concepts, but also motivated to think seriously about the pertinent questions before they ask. Therefore, the book strives to build a solid background in fundamental concepts and to equip students with the necessary skills so that they can expand their toolbox in their future endeavors. The book will no doubt be the standard reference in advanced statistics courses and bring about profound changes in how statistics should be taught."

—Ozzy Akay, Assistant Professor, Texas Tech University

"Don't let the authors' exuberant and iconoclastic style fool you into thinking that this book is not a serious text. It definitely is. The style has a purpose—to romp around the field's sacred cows and show the reader as quickly as possible the real working principles behind how statistical methods are developed and some of the methods’ most important applications. In that sense, the subject of the book truly is theoretical statistics, but both the motivation and the presentation are so thoroughly grounded in practice that many readers will see it as a practical guide. But the authors don’t intend for it to be a statistical cheat sheet: each of their many engaging and illuminating examples points forward to more that could be studied, and invites readers to pursue those studies. This isn’t the last statistics textbook students will ever need, but it should be the first."

—Randy Tobias, Director, Linear Models R&D, SAS Institute Inc.