1st Edition

R Companion to Elementary Applied Statistics

By Christopher Hay-Jahans Copyright 2019
    376 Pages 80 B/W Illustrations
    by Chapman & Hall


    The R Companion to Elementary Applied Statistics covers the traditional applications taught in elementary statistics courses, along with additional methods that address questions that might arise during or after the application of commonly used methods. It begins with basic tasks and computations in R, then guides readers through bringing data into R, manipulating the data as needed, performing common statistical computations and elementary exploratory data analysis tasks, preparing customized graphics, and using R for a wide range of methods that find use in many elementary applications of statistics.

    Features:

    • Requires no familiarity with R or programming.
    • Can be used as a resource for a project-based elementary applied statistics course, or by researchers and professionals who wish to delve more deeply into R.
    • Contains an extensive array of examples that illustrate various ways to use pre-packaged routines and to develop individualized code.
    • Presents a number of methods that may be considered non-traditional or advanced.
    • Includes carefully documented companion script files containing code for all examples presented, and more.

    R is a powerful and free product that is gaining popularity across the scientific community in both the professional and academic arenas. Statistical methods discussed in this book are used to introduce the fundamentals of using R functions and provide ideas for developing further skills in writing R code. These ideas are illustrated through an extensive collection of examples.

    About the Author:

    Christopher Hay-Jahans received his Doctor of Arts in mathematics from Idaho State University in 1999. After spending three years at the University of South Dakota, he moved to Juneau, Alaska, in 2002, where he has taught a wide range of undergraduate courses at the University of Alaska Southeast.

     

    1. Preliminaries

      First Steps

      Running Code in R

      Some Terminology

      Hierarchy of Data Classes

      Data Structures

      Operators

      Functions

      R Packages

      Probability Distributions

      Coding Conventions

      Some Book-keeping and Other Tips

      Getting Quick Coding Help

    2. Bringing Data Into and Out of R

      Entering Data Through Coding

      Number and Sample Generating Tricks

      The R Data Editor

      Reading Text Files

      Reading Data from Other File Formats

      Reading Data from the Keyboard

      Saving and Exporting Data

    3. Accessing Contents of Data Structures

      Extracting Data from Vectors

      Conducting Data Searches in Vectors

      Working with Factors

      Navigating Data Frames

      Lists

      Choosing an Access/Extraction Method

      Additional Notes

      More About the attach Function

      About Functions and their Arguments

      Alternative Argument Assignments in Function Calls

    4. Altering and Manipulating Data

      Altering Entries in Vectors

      Transformations

      Manipulating Character Strings

      Sorting Vectors and Factors

      Altering Data Frames

      Sorting Data Frames

      Moving Between Lists and Data Frames

      Additional Notes on the merge Function

    5. Summaries and Statistics

      Univariate Frequency Distributions

      Bivariate Frequency Distributions

      Statistics for Univariate Samples

      Measures of Central Tendency

      Measures of Spread

      Measures of Position

      Measures of Shape

      Five-Number Summaries and Outliers

      Elementary Five-Number Summary

      Tukey’s Five-Number

      The boxplot.stats Function

    6. More on Computing with R

      Computing with Numeric Vectors

      Working with Lists, Data Frames and Arrays

      The sapply Function

      The tapply Function

      The by Function

      The aggregate Function

      The apply Function

      The sweep Function

      For-loops

      Conditional Statements and the switch Function

      The if-then Statement

      The if-then-else Statement

      The switch Function

      Preparing Your Own Functions

    7. Basic Charts for Categorical Data

      Preliminary Comments

      Bar Charts

      Dot Charts

      Pie Charts

      Exporting Graphics Images

      Additional Notes

      Customizing Plotting Windows

      The plot.new and plot.window Functions

      More on the paste Function

      The title Function

      More on the legend Function

      More on the mtext Function

      The text Function

    8. Basic Plots for Numeric Data

      Histograms

      Boxplots

      Stripcharts

      QQ-Plots

      Normal Probability QQ-Plots

      Interpreting Normal Probability QQ-Plots

      More on Reference Lines for QQ-Plots

      QQ-Plots for Other Distributions

      Additional Notes

      More on the ifelse Function

      Revisiting the axis Function

      Frequency Polygons and Ogives

    9. Scatterplots, Lines, and Curves

      Scatterplots

      Basic Plots

      Manipulating Plotting Characters

      Plotting Transformed Data

      Matrix Scatterplots

      The matplot Function

      Graphs of Lines

      Graphs of Curves

      Superimposing Multiple Lines and/or Curves

      Time-series Plots

    10. More Graphics Tools

      Partitioning Graphics Windows

      The layout Function

      The split.screen Function

      Customizing Plotted Text and Symbols

      Inserting Mathematical Annotation in Plots

      More Low-level Graphics Functions

      The points and symbols Functions

      The grid, segments and arrows Functions

      Boxes, Rectangles and Polygons

      Error Bars

      Computing Bounds for Error Bars

      The errorBarplot Function

      Purpose and Interpretation of Error Bars

      More R Graphics Resources

    11. Tests for One and Two Proportions

      Relevant Probability Distributions

      Binomial Distributions

      Hypergeometric Distributions

      Normal Distributions

      Chi-square Distributions

      Single Population Proportions

      Estimating a Population Proportion

      Hypotheses for Single Proportion Tests

      A Normal Approximation Test

      A Chi-square Test

      An Exact Test

      Which Approach Should be Used?

      Two Population Proportions

      Estimating Differences Between Proportions

      Hypotheses for Two Proportions Tests

      A Normal Approximation Test

      A Chi-square Test

      Fisher’s Exact Test

      Which Approach Should be Used?

      Additional Notes

      Normal Approximations of Binomial Distributions

      One- versus Two-sided Hypothesis Tests

    12. Tests for More than Two Proportions

      Equality of Three or More Proportions

      Pearson’s Homogeneity of Proportions Test

      Marascuilo’s Large Sample Procedure

      Cohen’s Small Sample Procedure

      Simultaneous Pairwise Comparisons

      Marascuilo’s Large Sample Procedure

      Cohen’s Small Sample Procedure

      Linear Contrasts of Proportions

      Marascuilo’s Large Sample Approach

      Cohen’s Small Sample Approach

      The Chi-square Goodness-of-Fit Test

    13. Tests of Variances and Spread

      Relevant Probability Distributions

      F Distributions

      Using a Sample to Assess Normality

      Single Population Variances

      Estimating a Variance

      Testing a Variance

      Exactly Two Population Variances

      Estimating the Ratio of Two Variances

      Testing the Ratio of Two Variances

      What if the Normality Assumption is Violated?

      Two or More Population Variances

      Assessing Spread Graphically

      Levene’s Test

      Levene’s Test with Trimmed Means

      Brown-Forsythe Test

      Fligner-Killeen Test

    14. Tests for One or Two Means

      Student’s t-Distribution

      Single Population Means

      Verifying the Normality Assumption

      Estimating a Mean

      Testing a Mean

      Can a Normal Approximation be Used Here?

      Exactly Two Population Means

      Verifying Assumptions

      The Test for Dependent Samples

      Tests for Independent Samples

    15. Tests for More than Two Means

      Relevant Probability Distributions

      Studentized Range Distribution

      Dunnett’s Test Distribution

      Studentized Maximum Modulus Distribution

      Setting the Stage

      Equality of Means — Equal Variances Case

      Pairwise Comparisons — Equal Variances

      Bonferroni’s Procedure

      Tukey’s Procedure

      t Tests and Comparisons with a Control

      Dunnett’s Test and Comparisons with a Control

      Which Procedure to Choose

      Equality of Means — Unequal Variances Case

      Large-sample Chi-square Test

      Welch’s F Test

      Hotelling’s T Test

      Pairwise Comparisons — Unequal Variances

      Large-sample Chi-square Test

      Dunnett’s C Procedure

      Dunnett’s T Procedure

      Comparisons with a Control

      Which Procedure to Choose

      The Nature of Differences Found

      All Possible Pairwise Comparisons

      Comparisons with a Control

    16. Selected Tests for Medians, and More

      Relevant Probability Distributions

      Distribution of the Signed Rank Statistic

      Distribution of the Rank Sum Statistic

      The One-sample Sign Test

      The Exact Test

      The Normal Approximation

      Paired Samples Sign Test

      Independent Samples Median Test

      Equality of Medians

      Pairwise Comparisons of Medians

      Single Sample Signed Rank Test

      The Exact Test

      The Normal Approximation

      Paired Samples Signed Rank Test

      Rank Sum Test of Medians

      The Exact Mann-Whitney Test

      The Normal Approximation

      The Wilcoxon Rank Sum Test

      Using the Kruskal-Wallis Test to Test Medians

      Working with Ordinal Data

      Paired Samples

      Independent Samples

      More than Two Independent Samples

      Some Comments on the Use of Ordinal Data

    17. Dependence and Independence

      Assessing Bivariate Normality

      Pearson’s Correlation Coefficient

      An Interval Estimate of ρ

      Testing the Significance of ρ

      Testing a Null Hypothesis with ρ ≠

      Kendall’s Correlation Coefficient

      An Interval Estimate of τ

      Exact Test of the Significance of τ

      Approximate Test of the Significance of τ

      Spearman’s Rank Correlation Coefficient

      Exact Test of the Significance of ρS

      Approximate Test of the Significance of ρS

      Correlations in General — Comments and Cautions

      Chi-square Test of Independence

      For the Curious — Distributions of rK and rS


    "This book is written by a Professor of Mathematics with much experience in teaching statistics applied to the natural sciences. As mentioned in the Preface, the book addresses students (and teachers) of elementary statistics courses. Only basic preliminary statistical knowledge is necessary to start using the book, it is perfect for anyone jumping in to R, and it could readily serve as a reference manual rather than to be read from beginning to end... Several simple applied examples with detailed explanations are presented (coded in R) in order to make the methods more deeply understandable, and in some cases to compare different types of application (e.g. when different assumptions are filled, different research questions are of interest, or different types of data are recorded). All the richly-commented script files used in the book are available on the publisher’s website... At the end of the book, a highly informative Index aids quick searches. Nevertheless, the book can be ordered as an e-book as well... This second book of Professor Hay-Jahans, particularly together with the first one, is appropriate for undergraduate students as an introductory book on statistics using R, but it could successfully be used also by PhD students, researchers, and teachers requiring a consistent and thorough reference."
    - Márta Ladányi, ISCB December 2019