# Understanding Statistics and Statistical Myths

How to Become a Profound Learner

## Book Description

Addressing 30 statistical myths in the areas of data, estimation, measurement system analysis, capability, hypothesis testing, statistical inference, and control charts, this book explains how to understand statistics rather than how to do statistics. Every statistical myth listed in this book has been stated in course materials used by the author’s clients, by employers, or by experts who have trained thousands.

Each myth is an unconditional statement that, when taken literally and at face value, is false. All are false under some conditions while a few are not true under any condition. This book explores the conditions that render false the universality of the statements to help you understand why.

In the book, six characters discuss various topics taught in a fictional course intended to teach students how to apply statistics to improve processes. The reader follows along and learns as the students apply what they learn to a project in which they are team members.

Each discussion is like a Platonic dialogue. The purpose of a Platonic dialogue is to analyze a concept, statement, hypothesis, or theory through questions, applications, examples, and counterexamples, to see if it is true, when it is true, and why it is true when it is true. The dialogues will help readers understand why certain statements are not always true under all conditions, as well as when they contradict other myths.

## Table of Contents

**Myth 1: Two Types of Data—Attribute/Discrete and Measurement/Continuous**

Background

Measurement Requires Scale

Gauges or Instruments vs. No Gauges

Discrete, Categorical, Attribute versus Continuous, Variable: Degree of Information

Creating Continuous Measures by Changing the "Thing" Measured

Discrete versus Continuous: Half Test

Nominal, Ordinal, Interval, Ratio

Measurement to Compare

Scale Type versus Data Type

Scale Taxonomy

Purpose of Data Classification

**Myth 2: Proportions and Percentages Are Discrete Data**

Background

Denominator for Proportions and Percentages

Probabilities

Classification of Proportions, Percentages, and Probabilities

**Myth 3: $s = \sqrt{\sum (X_i - \bar{x})^2 / (n - 1)}$ The Correct Formula for Sample Standard Deviation**

Background

Correctness of Estimations

Estimators and Estimates

Properties of Estimators

**Myth 4: Sample Standard Deviation $\sqrt{\sum (X_i - \bar{x})^2 / (n - 1)}$ Is Unbiased**

Background

Degrees of Freedom

*t* Distribution

Definition of Bias

Removing Bias and Control Charts

**Myth 5: Variances Can Be Added but Not Standard Deviations**

Background

Sums of Squares and Square Roots: Pythagorean Theorem

Functions and Operators

Random Variables

Independence of Factors

Other Properties

**Myth 6: Parts and Operators for an MSA Do Not Have to Be Randomly Selected**

Background

Types of Analyses of Variance

Making Measurement System Look Better than It Is: Selecting Parts to Cover the Range of Process Variation

Selecting Both Good and Bad Parts

**Myth 7: % Study (% Contribution, Number of Distinct Categories) Is the Best Criterion for Evaluating a Measurement System for Process Improvement**

Background

% Contribution versus % Study

*P*/*T* Ratio versus % Study

Distinguishing between Good and Bad Parts

Distinguishing Parts That Are Different

**Myth 8: Only Sigma Can Compare Different Processes and Metrics**

Background

Sigma and Specifications

Sigma as a Percentage

**Myth 9: Capability Is Not Percent/Proportion of Good Units**

Background

Capability Indices: Frequency Meeting Specifications

Capability: Actual versus Potential

Capability Indices

Process Capability Time-Dependent

Meaning of Capability: Short-Cut Calculations

**Myth 10: p = Probability of Making an Error**

Background

Only Two Types of Errors

Definition of an Error about Deciding What Is True

Calculation of *p* and Evidence for a Hypothesis

Probability of Making an Error for a Particular Case

Probability of Data Given $H_o$ versus Probability of $H_o$ Given Data

Non-probabilistic Decisions

**Myth 11: Need More Data for Discrete Data than Continuous Data Analysis**

Background

Discrete Examples When *n* = 1

Factors That Determine Sample Size

Relevancy of Data

**Myth 12: Nonparametric Tests Are Less Powerful than Parametric Tests**

Background

Distribution Free versus Nonparametric

Comparing Power for the Same Conditions

Different Formulas for Testing the Same Hypotheses

Assumptions of Tests

Comparing Power for the Same Characteristic

Converting Quantitative Data to Qualitative Data

**Myth 13: Sample Size of 30 Is Acceptable (for Statistical Significance)**

Background

A Rationale for *n* = 30

Contradictory Rules of Thumb

Uses of Data

Sample Size as a Function of Alpha, Beta, Delta, and Sigma

Sample Size for Practical Use

Sample Size and Statistical Significance

**Myth 14: Can Only Fail to Reject $H_o$, Can Never Accept $H_o$**

Background

Proving Theories: Sufficient versus Necessary

Prove versus Accept versus Fail to Reject: Actions

Innocent versus Guilty: Problems with Example

Two-Choice Testing

Significance Testing and Confidence Intervals

Hypothesis Testing and Power

Null Hypothesis of ≥ or ≤

Practical Cases

Which Hypothesis Has the Equal Sign?

Bayesian Statistics: Probability of Hypothesis

**Myth 15: Control Limits Are ±3 Standard Deviations from the Center Line**

Background

Standard Error versus Standard Deviation

Within- versus between-Subgroup Variation: How Control Charts Work

*I* Chart of Individuals

**Myth 16: Control Chart Limits Are Empirical Limits**

Background

Definition of Empirical

Empirical Limits versus Limits Justified Empirically

Shewhart’s Evidence of Limits Being Empirical

Wheeler’s Empirical Rule

Empirical Justification for a Purpose

**Myth 17: Control Chart Limits Are Not Probability Limits**

Background

Association of Probabilities and Control Chart Limits

Can Control Limits Be Probability Limits?

False Alarm Rates for All Special Cause Patterns

Wheeler Uses Probability Limits

Other Uses of Probability Limits

**Myth 18: ±3 Sigma Limits Are the Most Economical Control Chart Limits**

Background

Evidence for 3–Standard Error Limits Being Economically Best

Evidence against 3–Standard Error Limits Being the Best Economically

Counterexamples: Simple Cost Model

Other Out-of-Control Rules—Assignable Causes Shewhart Didn’t Find but Exist

Small Changes Are Not Critical to Detect versus Taguchi’s Loss Function

Importance of Subgroup Size and Frequency on Economic Value of Control Chart Limits

Purpose to Detect Lack of Control—3–Standard Error Limits Misplaced

**Myth 19: Statistical Inferences Are Inductive Inferences**

Background

Reasoning: Validity and Soundness

Induction versus Deduction

Four Cases of Inductive Inferences

Statistical Inferences: Probability Distributions

Inferences about Population Parameters

Deductive Statistical Inferences: Hypothesis Testing

Deductive Statistical Inferences: Estimation

Real-World Cases of Statistical Inferences

**Myth 20: There Is One Universe or Population If Data Are Homogeneous**

Background

Definition of Homogeneous

Is Displaying Stability Required for Universes to Exist?

Are There Always Multiple Universes If Data Display Instability?

Is There Only One Universe If Data Appropriately Plotted Display Stability?

Control Chart Framework: Valid and Invalid Conclusions

**Myth 21: Control Charts Are Analytic Studies**

Background

Enumerative versus Analytic Distinguishing Characteristics

Enumerative Problem, Study, and Solution

Analytic Problem, Study, and Solution

Procedures for Enumerative and Analytic Studies

Are Control Charts Enumerative or Analytic Studies?

Cause–Effect Relationship

An Analytic Study Answers "When?"

**Myth 22: Control Charts Are Not Tests of Hypotheses**

Background

Definition and Structure of Hypothesis Test

Control Chart as a General Hypothesis Test

Statistical Hypothesis Testing: Alpha and *p*

Analysis of Means

Shewhart’s View on Control Charts as Tests of Hypotheses

Deming’s Argument: No Definable, Finite, Static Population

Woodall’s Two Phases of Control Chart Use

Finite, Static Universe

Control Charts as Nonparametric Tests of Hypotheses

Utility of Viewing Control Charts as Statistical Hypothesis Tests

Is the Process in Control? versus What Is the Probability the Process Changed?

**Myth 23: Process Needs to Be Stable to Calculate Process Capability**

Background

Stability and Capability: Dependent or Independent?

Actual Performance and Potential Capability versus Stability

Process Capability: Reliability of Estimates

Control Charts Are Fallible

Capable: 100% or Less than 100% Meeting Specifications

Process Capability: "Best" Performance versus Sustainability

$C_p$ versus *P*/*T*

Random Sampling

Response Surface Studies

**Myth 24: Specifications Don’t Belong on Control Charts**

Background

Run Charts

Charts of Individual Values

Confusion Having Both Control and Specification Limits on Charts

Stability, Performance, and Capability

Specifications on Averages and Variation

**Myth 25: Identify and Eliminate Assignable Causes of Variation**

Background

Assignable Causes versus Process Change

Is Increase in Process Variation Always Bad?

Good Assignable Causes

**Myth 26: Process Needs to Be Stable before You Can Improve It**

Background

History of Improvement before the 1920s

Control Chart Fallibility

Stabilizing a Process and Improving It

Stability Required versus Four States of a Process

Shewhart’s Counterexample

**Myth 27: Stability (Homogeneity) Is Required to Establish a Baseline**

Background

Purpose of Baseline

Just-Do-It Projects

Natural Processes

Processes Whose Output We Want to Be "Out of Control"

Meaning of "Meaningless"

Daily Comparisons

"True" Process Average: Process, Outputs, Characteristics, and Measures

Ways to Compare

Universe or Population and Descriptive Statistics

Random Sampling

When Is Homogeneity/Stability Not Required or Unimportant?

**Myth 28: A Process Must Be Stable to Be Predictable**

Background

Types of Predictions: Interpolation and Extrapolation

Interpolation: Stability versus Instability

Conditional Predictions

Extrapolation: Stability versus Instability

Fallibility of Control Chart Stability

Control Charts in Daily Life

Statistical versus Causal Control

**Myth 29: Adjusting a Process Based on a Single Defect Is Tampering, Causing Increased Process Variation**

Background

Definition of Tampering

Zero versus One versus Multiple Defects to Define Tampering

Role of Theory and Understanding When Adjusting

Defects Arise from Special Causes: Anomalies

Control Limits versus Specification Limits

Actions for Common Cause Signals versus Special Cause Signals

Is Reducing Common Cause Variation Always Good?

Fundamental Change versus Tampering

Funnel Exercise: Counterexample

**Myth 30: No Assumptions Required When the Data Speak for Themselves**

Background

Simpson’s Paradox

Math and Descriptive Statistics: Adding versus Aggregating

Inferences versus Facts: Conditions for Paradoxes

Assumptions for Modeling

Assumptions for Causal Inferences

Assumptions for Inferences from Reasons

**Epilogue**

**References**

**Index**

## Author(s)

### Biography

**Kicab Castaneda-Mendez**, founder of Process Excellence Consultants, Chapel Hill, NC, provides consulting and training on operational excellence using Lean Six Sigma methodologies, the balanced scorecard, and the Baldrige framework.