# Modern Statistics for the Social and Behavioral Sciences

A Practical Introduction, Second Edition

## Book Description

Requiring no prior training, *Modern Statistics for the Social and Behavioral Sciences* provides a two-semester, graduate-level introduction to basic statistical techniques that takes into account recent advances and insights that are typically ignored in an introductory course.

Hundreds of journal articles make it clear that basic techniques, routinely taught and used, can perform poorly when dealing with skewed distributions, outliers, heteroscedasticity (unequal variances) and curvature. Methods for dealing with these concerns have been derived and can provide a deeper, more accurate and more nuanced understanding of data. A conceptual basis is provided for understanding when and why standard methods can have poor power and yield misleading measures of effect size. Modern techniques for dealing with known concerns are described and illustrated.

Features:

- Presents an in-depth description of both classic and modern methods
- Explains and illustrates why recent advances can provide more power and a deeper understanding of data
- Provides numerous illustrations using the software R
- Includes an R package with over 1300 functions
- Includes a solution manual giving detailed answers to all of the exercises

This second edition describes many recent advances relevant to basic techniques. For example, a vast array of new and improved methods is now available for dealing with regression, including substantially improved ANCOVA techniques. The coverage of multiple comparison procedures has been expanded and new ANOVA techniques are described.

**Rand Wilcox** is a professor of psychology at the University of Southern California. He is the author of 13 other statistics books and the creator of the R package WRS. He currently serves as an associate editor for five statistics journals. He is a fellow of the Association for Psychological Science and an elected member of the International Statistical Institute.

## Table of Contents

**INTRODUCTION**

SAMPLES VERSUS POPULATIONS

SOFTWARE

R BASICS

Entering Data

R Functions and Packages

Data Sets

Arithmetic Operations

NUMERICAL AND GRAPHICAL SUMMARIES OF DATA

BASIC SUMMATION NOTATION

MEASURES OF LOCATION

The Sample Mean

R Function Mean

The Sample Median

R Function for the Median

A CRITICISM OF THE MEDIAN: IT MIGHT TRIM TOO MANY VALUES

R Function for the Trimmed Mean

R Function winmean

What is a Measure of Location?

MEASURES OF VARIATION OR SCALE

Sample Variance and Standard Deviation

R Functions var and sd

The Interquartile Range

R Functions idealf and idealfIQR

Winsorized Variance

R Function winvar

Median Absolute Deviation

R Function mad

Average Absolute Distance from the Median

Other Robust Measures of Variation

R Functions bivar, pbvar, tauvar, and tbs

DETECTING OUTLIERS

A Method Based on the Mean and Variance

A Better Outlier Detection Rule: The MAD-Median Rule

R Function out

The Boxplot

R Function boxplot

Modifications of the Boxplot Rule for Detecting Outliers

R Function outbox

Other Measures of Location

R Functions mom and onestep

HISTOGRAMS

R Functions hist and splot

KERNEL DENSITY ESTIMATORS

R Functions kdplot and akerd

STEM-AND-LEAF DISPLAYS

R Function stem

SKEWNESS

Transforming Data

CHOOSING A MEASURE OF LOCATION

EXERCISES

PROBABILITY AND RELATED CONCEPTS

BASIC PROBABILITY

EXPECTED VALUES

CONDITIONAL PROBABILITY AND INDEPENDENCE

POPULATION VARIANCE

THE BINOMIAL PROBABILITY FUNCTION

R Functions dbinom and pbinom

CONTINUOUS VARIABLES AND THE NORMAL CURVE

Computing Probabilities Associated with Normal Curves

R Function pnorm

UNDERSTANDING THE EFFECTS OF NON-NORMALITY

Skewness

PEARSON’S CORRELATION AND THE POPULATION COVARIANCE (OPTIONAL)

Computing the Population Covariance and Pearson’s Correlation

SOME RULES ABOUT EXPECTED VALUES

CHI-SQUARED DISTRIBUTIONS

EXERCISES

SAMPLING DISTRIBUTIONS AND CONFIDENCE INTERVALS

RANDOM SAMPLING

SAMPLING DISTRIBUTIONS

Sampling Distribution of the Sample Mean

Computing Probabilities Associated with the Sample Mean

A CONFIDENCE INTERVAL FOR THE POPULATION MEAN

Known Variance

Confidence Intervals When *σ* Is Not Known

R Functions pt and qt

Confidence Interval for the Population Mean Using Student’s t

R Function t.test

JUDGING LOCATION ESTIMATORS BASED ON THEIR SAMPLING DISTRIBUTION

Trimming and Accuracy: Another Perspective

AN APPROACH TO NON-NORMALITY: THE CENTRAL LIMIT THEOREM

STUDENT’S T AND NON-NORMALITY

CONFIDENCE INTERVALS FOR THE TRIMMED MEAN

Estimating the Standard Error of a Trimmed Mean

R Function trimse

A Confidence Interval for the Population Trimmed Mean

R Function trimci

TRANSFORMING DATA

CONFIDENCE INTERVAL FOR THE POPULATION MEDIAN

R Function sint

Estimating the Standard Error of the Sample Median

R Function msmedse

More Concerns About Tied Values

A REMARK ABOUT MOM AND M-ESTIMATORS

CONFIDENCE INTERVALS FOR THE PROBABILITY OF SUCCESS

R Functions binomci, acbinomci and binomLCO

BAYESIAN METHODS

EXERCISES

HYPOTHESIS TESTING

THE BASICS OF HYPOTHESIS TESTING

P-Value or Significance Level

Criticisms of Two-Sided Hypothesis Testing and P-Values

Summary and Generalization

POWER AND TYPE II ERRORS

Understanding How *n*, *α*, and *σ* Are Related to Power

TESTING HYPOTHESES ABOUT THE MEAN WHEN *σ* IS NOT KNOWN

R Function t.test

CONTROLLING POWER AND DETERMINING THE SAMPLE SIZE

Choosing *n* Prior to Collecting Data

R Function power.t.test

Stein’s Method: Judging the Sample Size When Data Are Available

R Functions stein1 and stein2

PRACTICAL PROBLEMS WITH STUDENT’S T TEST

HYPOTHESIS TESTING BASED ON A TRIMMED MEAN

R Function trimci

R Functions stein1.tr and stein2.tr

TESTING HYPOTHESES ABOUT THE POPULATION MEDIAN

R Function sintv2

MAKING DECISIONS ABOUT WHICH MEASURE OF LOCATION TO USE

BOOTSTRAP METHODS

BOOTSTRAP-T METHOD

Symmetric Confidence Intervals

Exact Nonparametric Confidence Intervals for Means Are Impossible

THE PERCENTILE BOOTSTRAP METHOD

INFERENCES ABOUT ROBUST MEASURES OF LOCATION

Using the Percentile Method

R Functions onesampb, momci and trimpb

The Bootstrap-t Method Based on Trimmed Means

R Function trimcibt

ESTIMATING POWER WHEN TESTING HYPOTHESES ABOUT A TRIMMED MEAN

R Functions powt1est and powt1an

A BOOTSTRAP ESTIMATE OF STANDARD ERRORS

R Function bootse

EXERCISES

REGRESSION AND CORRELATION

THE LEAST SQUARES PRINCIPLE

CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Classic Inferential Techniques

Multiple Regression

R Functions ols and lm

STANDARDIZED REGRESSION

PRACTICAL CONCERNS ABOUT LEAST SQUARES REGRESSION AND HOW THEY MIGHT BE ADDRESSED

The Effect of Outliers on Least Squares Regression

Beware of Bad Leverage Points

Beware of Discarding Outliers Among the *Y* Values

Do Not Assume Homoscedasticity or that the Regression Line is Straight

Violating Assumptions When Testing Hypotheses

Dealing with Heteroscedasticity: The HC4 Method

R Functions olshc4 and hc4test

Interval Estimation of the Mean Response

R Function olshc4band

PEARSON’S CORRELATION AND THE COEFFICIENT OF DETERMINATION

A Closer Look at Interpreting *r*

TESTING *H*₀: *ρ* = 0

R Function cor.test

R Function pwr.r.test

Testing *H*₀: *ρ* = 0 When There is Heteroscedasticity

R Function pcorhc4

When Is It Safe to Conclude that Two Variables Are Independent?

A REGRESSION METHOD FOR ESTIMATING THE MEDIAN OF *Y* AND OTHER QUANTILES

R Function rqfit

DETECTING HETEROSCEDASTICITY

R Function khomreg

INFERENCES ABOUT PEARSON’S CORRELATION: DEALING WITH HETEROSCEDASTICITY

R Function pcorb

BOOTSTRAP METHODS FOR LEAST SQUARES REGRESSION

R Functions hc4wtest, olswbtest and lsfitci

DETECTING ASSOCIATIONS EVEN WHEN THERE IS CURVATURE

R Functions indt and medind

QUANTILE REGRESSION

R Functions qregci and rqtest

A Test for Homoscedasticity Using a Quantile Regression Approach

R Function qhomt

REGRESSION: WHICH PREDICTORS ARE BEST?

The 0.632 Bootstrap Method

R Function regpre

Least Angle Regression

R Function larsR

COMPARING CORRELATIONS

R Functions TWOpov and TWOpNOV

CONCLUDING REMARKS

EXERCISES

COMPARING TWO INDEPENDENT GROUPS

STUDENT’S T TEST

Choosing the Sample Sizes

R Function power.t.test

RELATIVE MERITS OF STUDENT’S T

WELCH’S HETEROSCEDASTIC METHOD FOR MEANS

R Function t.test

Tukey’s Three-Decision Rule

Non-normality and Welch’s Method

Three Modern Insights Regarding Methods for Comparing Means

METHODS FOR COMPARING MEDIANS AND TRIMMED MEANS

Yuen’s Method for Trimmed Means

R Functions yuen and fac2list

Comparing Medians

R Function msmed

PERCENTILE BOOTSTRAP METHODS FOR COMPARING MEASURES OF LOCATION

Using Other Measures of Location

Comparing Medians

R Function medpb2

Some Guidelines on When To Use the Percentile Bootstrap Method

R Functions trimpb2, med2g and pb2gen

BOOTSTRAP-T METHODS FOR COMPARING MEASURES OF LOCATION

Comparing Means

Bootstrap-t Method When Comparing Trimmed Means

R Functions yuenbt and yhbt

Estimating Power and Judging the Sample Sizes

R Functions powest and pow2an

PERMUTATION TESTS

RANK-BASED AND NONPARAMETRIC METHODS

Wilcoxon-Mann-Whitney Test

Handling Tied Values and Heteroscedasticity

Cliff’s Method

R Functions cid and cidv2

The Brunner–Munzel Method

R Function bmp

The Kolmogorov–Smirnov Test

R Function ks

Comparing All Quantiles Simultaneously: An Extension of the Kolmogorov–Smirnov Test

R Function sband

GRAPHICAL METHODS FOR COMPARING GROUPS

Error Bars

R Functions ebarplot and ebarplot.med

Plotting the Shift Function

Plotting the Distributions

R Function sumplot2g

Other Approaches

COMPARING MEASURES OF VARIATION

R Function comvar2

Brown-Forsythe Method

Comparing Robust Measures of Variation

MEASURING EFFECT SIZE

R Functions yuenv2 and akp.effect

COMPARING CORRELATIONS AND REGRESSION SLOPES

R Functions twopcor, twolsreg, and tworegwb

COMPARING TWO BINOMIALS

Storer–Kim Method

Beal’s Method

R Functions twobinom, twobici, bi2KMSv2 and power.prop.test

Comparing Two Discrete Distributions

R Function disc2com

MAKING DECISIONS ABOUT WHICH METHOD TO USE

EXERCISES

COMPARING TWO DEPENDENT GROUPS

THE PAIRED T TEST

When Does the Paired T Test Perform Well?

R Function t.test

COMPARING ROBUST MEASURES OF LOCATION

R Functions yuend, ydbt and dmedpb

Comparing Marginal M-Estimators

R Function rmmest

Measuring Effect Size

R Function D.akp.effect

HANDLING MISSING VALUES

R Functions rm2miss and rmmismcp

A DIFFERENT PERSPECTIVE WHEN USING ROBUST MEASURES OF LOCATION

R Functions loc2dif and l2drmci

THE SIGN TEST

WILCOXON SIGNED RANK TEST

R Function wilcox.test

COMPARING VARIANCES

R Function comdvar

COMPARING ROBUST MEASURES OF SCALE

R Function rmrvar

COMPARING ALL QUANTILES

R Function lband

PLOTS FOR DEPENDENT GROUPS

R Function g2plotdifxy

EXERCISES

ONE-WAY ANOVA

ANALYSIS OF VARIANCE FOR INDEPENDENT GROUPS

A Conceptual Overview

ANOVA via Least Squares Regression and Dummy Coding

R Functions anova, anova1, aov, and fac2list

Controlling Power and Choosing the Sample Sizes

R Functions power.anova.test and anova.power

DEALING WITH UNEQUAL VARIANCES

Welch’s Test

JUDGING SAMPLE SIZES AND CONTROLLING POWER WHEN DATA ARE AVAILABLE

R Functions bdanova1 and bdanova2

TRIMMED MEANS

R Functions t1way, t1wayv2, t1wayF and g5plot

Comparing Groups Based on Medians

R Function med1way

BOOTSTRAP METHODS

A Bootstrap-t Method

R Functions t1waybt and BFBANOVA

Two Percentile Bootstrap Methods

R Functions b1way, pbadepth and Qanova

Choosing a Method

RANDOM EFFECTS MODEL

A Measure of Effect Size

A Heteroscedastic Method

A Method Based on Trimmed Means

R Function rananova

RANK-BASED METHODS

The Kruskal-Wallis Test

R Function kruskal.test

Method BDM

R Functions bdm and bdmP

EXERCISES

TWO-WAY AND THREE-WAY DESIGNS

BASICS OF A TWO-WAY ANOVA DESIGN

Interactions

R Functions interaction.plot and interplot

Interactions When There Are More Than Two Levels

TESTING HYPOTHESES ABOUT MAIN EFFECTS AND INTERACTIONS

R Function anova

Inferences About Disordinal Interactions

The Two-Way ANOVA Model

HETEROSCEDASTIC METHODS FOR TRIMMED MEANS, INCLUDING MEANS

R Function t2way

BOOTSTRAP METHODS

R Functions pbad2way and t2waybt

TESTING HYPOTHESES BASED ON MEDIANS

R Function m2way

A RANK-BASED METHOD FOR A TWO-WAY DESIGN

R Function bdm2way

The Patel–Hoel Approach to Interactions

THREE-WAY ANOVA

R Functions anova and t3way

EXERCISES

COMPARING MORE THAN TWO DEPENDENT GROUPS

COMPARING MEANS IN A ONE-WAY DESIGN

R Function aov

COMPARING TRIMMED MEANS WHEN DEALING WITH A ONE-WAY DESIGN

R Functions rmanova and rmdat2mat

A Bootstrap-t Method for Trimmed Means

R Function rmanovab

PERCENTILE BOOTSTRAP METHODS FOR A ONE-WAY DESIGN

Method Based on Marginal Measures of Location

R Function bd1way

Inferences Based on Difference Scores

R Function rmdzero

RANK-BASED METHODS FOR A ONE-WAY DESIGN

Friedman’s Test

R Function friedman.test

Method BPRM

R Function bprm

COMMENTS ON WHICH METHOD TO USE

BETWEEN-BY-WITHIN DESIGNS

Method for Trimmed Means

R Functions bwtrim and bw2list

A Bootstrap-t Method

R Function tsplitbt

Inferences Based on M-estimators and Other Robust Measures of Location

R Functions sppba, sppbb, and sppbi

A Rank-Based Test

R Function bwrank

WITHIN-BY-WITHIN DESIGN

R Function wwtrim

THREE-WAY DESIGNS

R Functions bbwtrim, bwwtrim and wwwtrim

Data Management: R Functions bw2list and bbw2list

EXERCISES

MULTIPLE COMPARISONS

ONE-WAY ANOVA AND RELATED SITUATIONS, INDEPENDENT GROUPS

Fisher’s Least Significant Difference Method

The Tukey-Kramer Method

R Function TukeyHSD

Tukey-Kramer and the ANOVA F Test

Step-Down Methods

Dunnett’s T3

Games-Howell Method

Comparing Trimmed Means

R Functions lincon, stepmcp and twoKlin

Alternative Methods for Controlling FWE

Percentile Bootstrap Methods for Comparing Trimmed Means, Medians, and M-estimators

R Functions medpb, tmcppb, pbmcp and p.adjust

A Bootstrap-t Method

R Function linconbt

Rank-Based Methods

R Functions cidmul, cidmulv2, and bmpmul

Comparing the Individual Probabilities of Two Discrete Distributions

R Functions binband, splotg2, cumrelf and cumrelfT

Comparing the Quantiles of Two Independent Groups

R Functions qcomhd and qcomhdMC

Multiple Comparisons for Binomial and Categorical Data

R Functions skmcp and discmcp

TWO-WAY, BETWEEN-BY-BETWEEN DESIGN

Scheffé’s Homoscedastic Method

Heteroscedastic Methods

Extension of Welch–Šidák and Kaiser–Bowden Methods to Trimmed Means

R Function kbcon

R Functions con2way and conCON

Linear Contrasts Based on Medians

R Functions msmed and mcp2med

Bootstrap Methods

R Functions mcp2a and bbmcppb

The Patel-Hoel Rank-Based Interaction Method

R Function rimul

JUDGING SAMPLE SIZES

Tamhane’s Procedure

R Function tamhane

Hochberg’s Procedure

R Function hochberg

METHODS FOR DEPENDENT GROUPS

Linear Contrasts Based on Trimmed Means

R Function rmmcp

Comparing M-estimators

R Functions rmmcppb, dmedpb, dtrimpb and boxdif

Bootstrap-t Method

R Function bptd

Comparing the Quantiles of the Marginal Distributions

R Function Dqcomhd

BETWEEN-BY-WITHIN DESIGNS

R Functions bwmcp, bwamcp, bwbmcp, bwimcp, spmcpa, spmcpb, spmcpi, and bwmcppb

WITHIN-BY-WITHIN DESIGNS

Three-Way Designs

R Functions con3way, mcp3atm, and rm3mcp

Bootstrap Methods for Three-Way Designs

R Functions bbwmcp, bwwmcp, bwwmcppb, bbbmcppb, bbwmcppb, bwwmcppb, and wwwmcppb

EXERCISES

SOME MULTIVARIATE METHODS

LOCATION, SCATTER, AND DETECTING OUTLIERS

Detecting Outliers Via Robust Measures of Location and Scatter

R Functions cov.mve and cov.mcd

More Measures of Location and Covariance

R Functions rmba, tbs, and ogk

R Function out

A Projection-Type Outlier Detection Method

R Functions outpro, outproMC, outproad, outproadMC, and out3d

Skipped Estimators of Location

R Function smean

ONE-SAMPLE HYPOTHESIS TESTING

Comparing Dependent Groups

R Functions smeancrv2, hotel1, and rmdzeroOP

TWO-SAMPLE CASE

R Functions smean2, mat2grp, matsplit and mat2list

R Functions matsplit, mat2grp and mat2list

MANOVA

R Function manova

Robust MANOVA Based on Trimmed Means

R Functions MULtr.anova and MULAOVp

A MULTIVARIATE EXTENSION OF THE WILCOXON–MANN–WHITNEY TEST

Explanatory Measure of Effect Size: A Projection-Type Generalization

R Function mulwmwv2

RANK-BASED MULTIVARIATE METHODS

The Munzel–Brunner Method

R Function mulrank

The Choi–Marden Multivariate Rank Test

R Function cmanova

MULTIVARIATE REGRESSION

Multivariate Regression Using R

Robust Multivariate Regression

R Functions mlrreg and mopreg

PRINCIPAL COMPONENTS

R Functions prcomp and regpca

Robust Principal Components

R Functions outpca, robpca, robpcaS, Ppca, and Ppca.summary

EXERCISES

ROBUST REGRESSION AND MEASURES OF ASSOCIATION

ROBUST REGRESSION ESTIMATORS

The Theil–Sen Estimator

R Functions tsreg, tshdreg and regplot

Least Median of Squares

Least Trimmed Squares and Least Trimmed Absolute Value Estimators

R Functions lmsreg, ltsreg, and ltareg

M-Estimators

R Function chreg

Deepest Regression Line

R Function mdepreg

Skipped Estimators

R Functions opreg and opregMC

S-estimators and an E-Type Estimator

R Function tstsreg

COMMENTS ON CHOOSING A REGRESSION ESTIMATOR

INFERENCES BASED ON ROBUST REGRESSION ESTIMATORS

Testing Hypotheses About the Slopes

Inferences About the Typical Value of *Y* Given *X*

R Functions regtest, regtestMC, regci, regciMC, regYci and regYband

Comparing Measures of Location via Dummy Coding

DEALING WITH CURVATURE: SMOOTHERS

Cleveland’s Smoother

R Functions lowess, lplot, lplot.pred and lplotCI

Smoothers Based on Robust Measures of Location

R Functions rplot, rplotCIS, rplotCI, rplotCIv2, rplotCIM, rplot.pred, qhdsm and qhdsm.pred

Prediction When *X* Is Discrete: The R Function rundis

Seeing Curvature with More than Two Predictors

R Function prplot

Some Alternative Methods

Detecting Heteroscedasticity Using a Smoother

R Function rhom

SOME ROBUST CORRELATIONS AND TESTS OF INDEPENDENCE

Kendall’s tau

Spearman’s rho

Winsorized Correlation

R Function wincor

OP or Skipped Correlation

R Function scor

Inferences about Robust Correlations: Dealing with Heteroscedasticity

R Functions corb and scorci

MEASURING THE STRENGTH OF AN ASSOCIATION BASED ON A ROBUST FIT

COMPARING THE SLOPES OF TWO INDEPENDENT GROUPS

R Function reg2ci

TESTS FOR LINEARITY

R Functions lintest, lintestMC, and linchk

IDENTIFYING THE BEST PREDICTORS

Inferences Based on Independent Variables Taken in Isolation

R Functions regpord, ts2str, and sm2strv7

Inferences When Independent Variables Are Taken Together

R Function regIVcom

INTERACTIONS AND MODERATOR ANALYSES

R Functions olshc4.inter, ols.plot.inter, regci.inter, reg.plot.inter and adtest

Graphical Methods for Assessing Interactions

R Functions kercon, runsm2g, regi

ANCOVA

Classic ANCOVA

Robust ANCOVA Methods Based on a Parametric Regression Model

R Functions ancJN, ancJNmp, anclin, reg2plot and reg2g.p2plot

ANCOVA Based on the Running-Interval Smoother

R Functions ancsm, Qancsm, ancova, ancovaWMW, ancpb, ancovaUB, ancboot, ancdet, runmean2g, qhdsm2g and l2plot

R Functions Dancts, Dancols, Dancova, Dancovapb, DancovaUB and Dancdet

EXERCISES

BASIC METHODS FOR ANALYZING CATEGORICAL DATA

GOODNESS OF FIT

R Functions chisq.test and pwr.chisq.test

TEST OF INDEPENDENCE

R Function chi.test.ind

DETECTING DIFFERENCES IN THE MARGINAL PROBABILITIES

R Functions contab and mcnemar.test

MEASURES OF ASSOCIATION

The Proportion of Agreement

Kappa

Weighted Kappa

R Function Ckappa

LOGISTIC REGRESSION

R Functions glm and logreg

A Confidence Interval for the Odds Ratio

R Function ODDSR.CI

Smoothers for Logistic Regression

R Functions logrsm, rplot.bin, and logSM

EXERCISES

Appendix A: ANSWERS TO SELECTED EXERCISES

Appendix B: TABLES

Appendix C: BASIC MATRIX ALGEBRA

Index

## Author(s)

### Biography

Rand Wilcox has been a Professor of Psychology at the University of Southern California since 1987. He received his Ph.D. from the University of California, Santa Barbara in 1976. His research interests are statistical methods, particularly robust methods for comparing groups and studying associations. He also collaborates with researchers in occupational therapy, gerontology, biology and psychology. He is the author of four books.