
Understanding Statistics and Statistical Myths

How to Become a Profound Learner, 1st Edition

By Kicab Castaneda-Mendez

CRC Press

537 pages | 37 B/W Illus.

Purchasing Options ($ = USD)
Hardback: 9781498727457
pub: 2015-11-18
List price: $86.95
Sale price: $69.56 (save ~$17.39)
eBook (VitalSource): 9780429257995
pub: 2015-11-18
from $43.48



Description

Addressing 30 statistical myths in the areas of data, estimation, measurement system analysis, capability, hypothesis testing, statistical inference, and control charts, this book explains how to understand statistics rather than how to do statistics. Every statistical myth listed in this book has been stated in course materials used by the author’s clients, by employers, or by experts to train thousands of people.

Each myth is an unconditional statement that, when taken literally and at face value, is false. All are false under some conditions while a few are not true under any condition. This book explores the conditions that render false the universality of the statements to help you understand why.

In the book, six characters discuss various topics taught in a fictional course intended to teach students how to apply statistics to improve processes. The reader follows along and learns as the students apply what they learn to a project in which they are team members.

Each discussion is like a Platonic dialogue. The purpose of a Platonic dialogue is to analyze a concept, statement, hypothesis, or theory through questions, applications, examples, and counterexamples, to see if it is true, when it is true, and why it is true when it is true. The dialogues will help readers understand why certain statements are not always true under all conditions, as well as when they contradict other myths.

Table of Contents

Myth 1: Two Types of Data—Attribute/Discrete and Measurement/Continuous

Background

Measurement Requires Scale

Gauges or Instruments vs. No Gauges

Discrete, Categorical, Attribute versus Continuous, Variable: Degree of Information

Creating Continuous Measures by Changing the "Thing" Measured

Discrete versus Continuous: Half Test

Nominal, Ordinal, Interval, Ratio

Measurement to Compare

Scale Type versus Data Type

Scale Taxonomy

Purpose of Data Classification

Myth 2: Proportions and Percentages Are Discrete Data

Background

Denominator for Proportions and Percentages

Probabilities

Classification of Proportions, Percentages, and Probabilities

Myth 3: s = √[Σ(Xᵢ − x̄)²/(n − 1)] The Correct Formula for Sample Standard Deviation

Background

Correctness of Estimations

Estimators and Estimates

Properties of Estimators

Myth 4: Sample Standard Deviation √[Σ(Xᵢ − x̄)²/(n − 1)] Is Unbiased

Background

Degrees of Freedom

t Distribution

Definition of Bias

Removing Bias and Control Charts

Myth 5: Variances Can Be Added but Not Standard Deviations

Background

Sums of Squares and Square Roots: Pythagorean Theorem

Functions and Operators

Random Variables

Independence of Factors

Other Properties

Myth 6: Parts and Operators for an MSA Do Not Have to Be Randomly Selected

Background

Types of Analyses of Variance

Making Measurement System Look Better than It Is: Selecting Parts to Cover the Range of Process Variation

Selecting Both Good and Bad Parts

Myth 7: % Study (% Contribution, Number of Distinct Categories) Is the Best Criterion for Evaluating a Measurement System for Process Improvement

Background

% Contribution versus % Study

P/T Ratio versus % Study

Distinguishing between Good and Bad Parts

Distinguishing Parts That Are Different

Myth 8: Only Sigma Can Compare Different Processes and Metrics

Background

Sigma and Specifications

Sigma as a Percentage

Myth 9: Capability Is Not Percent/Proportion of Good Units

Background

Capability Indices: Frequency Meeting Specifications

Capability: Actual versus Potential

Capability Indices

Process Capability Time-Dependent

Meaning of Capability: Short-Cut Calculations

Myth 10: p = Probability of Making an Error

Background

Only Two Types of Errors

Definition of an Error about Deciding What Is True

Calculation of p and Evidence for a Hypothesis

Probability of Making an Error for a Particular Case

Probability of Data Given Ho versus Probability of Ho Given Data

Non-probabilistic Decisions

Myth 11: Need More Data for Discrete Data than Continuous Data Analysis

Background

Discrete Examples When n = 1

Factors That Determine Sample Size

Relevancy of Data

Myth 12: Nonparametric Tests Are Less Powerful than Parametric Tests

Background

Distribution Free versus Nonparametric

Comparing Power for the Same Conditions

Different Formulas for Testing the Same Hypotheses

Assumptions of Tests

Comparing Power for the Same Characteristic

Converting Quantitative Data to Qualitative Data

Myth 13: Sample Size of 30 Is Acceptable (for Statistical Significance)

Background

A Rationale for n = 30

Contradictory Rules of Thumb

Uses of Data

Sample Size as a Function of Alpha, Beta, Delta, and Sigma

Sample Size for Practical Use

Sample Size and Statistical Significance

Myth 14: Can Only Fail to Reject Ho, Can Never Accept Ho

Background

Proving Theories: Sufficient versus Necessary

Prove versus Accept versus Fail to Reject: Actions

Innocent versus Guilty: Problems with Example

Two-Choice Testing

Significance Testing and Confidence Intervals

Hypothesis Testing and Power

Null Hypothesis of ≥ or ≤

Practical Cases

Which Hypothesis Has the Equal Sign?

Bayesian Statistics: Probability of Hypothesis

Myth 15: Control Limits Are ±3 Standard Deviations from the Center Line

Background

Standard Error versus Standard Deviation

Within- versus between-Subgroup Variation: How Control Charts Work

I Chart of Individuals

Myth 16: Control Chart Limits Are Empirical Limits

Background

Definition of Empirical

Empirical Limits versus Limits Justified Empirically

Shewhart’s Evidence of Limits Being Empirical

Wheeler’s Empirical Rule

Empirical Justification for a Purpose

Myth 17: Control Chart Limits Are Not Probability Limits

Background

Association of Probabilities and Control Chart Limits

Can Control Limits Be Probability Limits?

False Alarm Rates for All Special Cause Patterns

Wheeler Uses Probability Limits

Other Uses of Probability Limits

Myth 18: ±3 Sigma Limits Are the Most Economical Control Chart Limits

Background

Evidence for 3–Standard Error Limits Being Economically Best

Evidence against 3–Standard Error Limits Being the Best Economically

Counterexamples: Simple Cost Model

Other Out-of-Control Rules—Assignable Causes Shewhart Didn’t Find but Exist

Small Changes Are Not Critical to Detect versus Taguchi’s Loss Function

Importance of Subgroup Size and Frequency on Economic Value of Control Chart Limits

Purpose to Detect Lack of Control—3–Standard Error Limits Misplaced

Myth 19: Statistical Inferences Are Inductive Inferences

Background

Reasoning: Validity and Soundness

Induction versus Deduction

Four Cases of Inductive Inferences

Statistical Inferences: Probability Distributions

Inferences about Population Parameters

Deductive Statistical Inferences: Hypothesis Testing

Deductive Statistical Inferences: Estimation

Real-World Cases of Statistical Inferences

Myth 20: There Is One Universe or Population If Data Are Homogeneous

Background

Definition of Homogeneous

Is Displaying Stability Required for Universes to Exist?

Are There Always Multiple Universes If Data Display Instability?

Is There Only One Universe If Data Appropriately Plotted Display Stability?

Control Chart Framework: Valid and Invalid Conclusions

Myth 21: Control Charts Are Analytic Studies

Background

Enumerative versus Analytic Distinguishing Characteristics

Enumerative Problem, Study, and Solution

Analytic Problem, Study, and Solution

Procedures for Enumerative and Analytic Studies

Are Control Charts Enumerative or Analytic Studies?

Cause–Effect Relationship

An Analytic Study Answers "When?"

Myth 22: Control Charts Are Not Tests of Hypotheses

Background

Definition and Structure of Hypothesis Test

Control Chart as a General Hypothesis Test

Statistical Hypothesis Testing: Alpha and p

Analysis of Means

Shewhart’s View on Control Charts as Tests of Hypotheses

Deming’s Argument: No Definable, Finite, Static Population

Woodall’s Two Phases of Control Chart Use

Finite, Static Universe

Control Charts as Nonparametric Tests of Hypotheses

Utility of Viewing Control Charts as Statistical Hypothesis Tests

Is the Process in Control? versus What Is the Probability the Process Changed?

Myth 23: Process Needs to Be Stable to Calculate Process Capability

Background

Stability and Capability: Dependent or Independent?

Actual Performance and Potential Capability versus Stability

Process Capability: Reliability of Estimates

Control Charts Are Fallible

Capable: 100% or Less than 100% Meeting Specifications

Process Capability: "Best" Performance versus Sustainability

Cp versus P/T

Random Sampling

Response Surface Studies

Myth 24: Specifications Don’t Belong on Control Charts

Background

Run Charts

Charts of Individual Values

Confusion Having Both Control and Specification Limits on Charts

Stability, Performance, and Capability

Specifications on Averages and Variation

Myth 25: Identify and Eliminate Assignable Causes of Variation

Background

Assignable Causes versus Process Change

Is Increase in Process Variation Always Bad?

Good Assignable Causes

Myth 26: Process Needs to Be Stable before You Can Improve It

Background

History of Improvement before the 1920s

Control Chart Fallibility

Stabilizing a Process and Improving It

Stability Required versus Four States of a Process

Shewhart’s Counterexample

Myth 27: Stability (Homogeneity) Is Required to Establish a Baseline

Background

Purpose of Baseline

Just-Do-It Projects

Natural Processes

Processes Whose Output We Want to Be "Out of Control"

Meaning of "Meaningless"

Daily Comparisons

"True" Process Average: Process, Outputs, Characteristics, and Measures

Ways to Compare

Universe or Population and Descriptive Statistics

Random Sampling

When Is Homogeneity/Stability Not Required or Unimportant?

Myth 28: A Process Must Be Stable to Be Predictable

Background

Types of Predictions: Interpolation and Extrapolation

Interpolation: Stability versus Instability

Conditional Predictions

Extrapolation: Stability versus Instability

Fallibility of Control Chart Stability

Control Charts in Daily Life

Statistical versus Causal Control

Myth 29: Adjusting a Process Based on a Single Defect Is Tampering, Causing Increased Process Variation

Background

Definition of Tampering

Zero versus One versus Multiple Defects to Define Tampering

Role of Theory and Understanding When Adjusting

Defects Arise from Special Causes: Anomalies

Control Limits versus Specification Limits

Actions for Common Cause Signals versus Special Cause Signals

Is Reducing Common Cause Variation Always Good?

Fundamental Change versus Tampering

Funnel Exercise: Counterexample

Myth 30: No Assumptions Required When the Data Speak for Themselves

Background

Simpson’s Paradox

Math and Descriptive Statistics: Adding versus Aggregating

Inferences versus Facts: Conditions for Paradoxes

Assumptions for Modeling

Assumptions for Causal Inferences

Assumptions for Inferences from Reasons

Epilogue

References

Index

About the Author

Kicab Castaneda-Mendez, founder of Process Excellence Consultants, Chapel Hill, NC, provides consulting and training on operational excellence using Lean Six Sigma methodologies, the balanced scorecard, and the Baldrige framework.

Subject Categories

BISAC Subject Codes/Headings:
BUS000000
BUSINESS & ECONOMICS / General
TEC020000
TECHNOLOGY & ENGINEERING / Manufacturing