Statistical Programming in SAS

2nd Edition

By A. John Bailer

Chapman and Hall/CRC

355 pages | 50 B/W Illus.

Purchasing Options ($ = USD):
Paperback: 9780367357979, pub: 2019-12-19, $79.95
Hardback: 9780367358006, pub: 2019-12-19, $199.95

Description

Statistical Programming in SAS, Second Edition provides a foundation for programming to implement statistical solutions using SAS®, a system that has been used to solve data analytic problems for more than 40 years. Motivating examples inspire readers to generate programming solutions, and worked case studies illustrate full implementation of such solutions. Upper-level undergraduates, beginning graduate students, and professionals will benefit from this book. The ideal reader has some background in regression modeling and introductory experience with computer programming.

Changes in the second edition include:

  • a new chapter on text processing
  • new sections including debugging and coding efficiency
  • expansion of case studies and examples
  • a major reorganization of the chapter order reflecting input from instructors teaching with this text over the last 10 years.

The coverage of statistical programming in the second edition includes:

  • Getting data into the SAS system, engineering new features, and formatting variables
  • Writing readable and well-documented code
  • Structuring, implementing and debugging programs
  • Creating solutions to novel problems
  • Combining data sources, extracting parts of data sets, and reshaping data sets as needed for other analyses
  • Generating general solutions using macros
  • Customizing output
  • Producing insight-inspiring data visualizations
  • Parsing, processing, and analyzing text
  • Programming with matrices
  • Connecting SAS with R
  • Covering topics that are part of both SAS Base and Certification exams

Table of Contents

1. Structuring, implementing, and debugging programs to learn about data

Statistical Programming

Learning from Constructed, Artificial Data

Good Programming Practice

SAS Program Structure

What Is a SAS Data Set?

Internally Documenting SAS Program

Basic Debugging

Getting Help

Exercises

2. Reading, Creating and Formatting Data Sets

What does a SAS Data Step do?

Reading Data from External Files

Reading CSV, Excel and TEXT files

Temporary versus Permanent Status of Data Sets

Formatting and Labeling Variables

User-defined Formatting

Recoding and Transforming Variables in a DATA Step

Writing Out a File or Making a Simple Report

Exercises

3. Programming a DATA step

Writing Programs by subdividing tasks

Ordering How Tasks are Done

Index-able Lists of variables, aka arrays

Functions associated with Statistical Distributions

Generating Variables Using Random Number Generators

Remembering Variable Values across Observations

Processing multiple observations for a single observation

Case Study 1: Is the Two-Sample t-Test Robust to Violations of the Heterogeneous Variance assumption?

Efficiency considerations – how long does it take?

Case Study 2: Monte Carlo Integration to Estimate an Integral

Case Study 3: Simple Percentile-Based Bootstrap

Case Study 4: Randomization Test for the Equality of Two Populations

Exercises

4. Combining, extracting and reshaping data

Adding observations by SET-ing data sets

Adding variables by MERGE-ing data sets

Working with tables in PROC SQL

Converting wide to long formats

Converting long to wide formats

Case Study: Reshaping a World Bank data set

Building training and validation data sets

Exercises

Self-Study lab

5. Macro Programming

What Is a Macro and Why Would You Use It?

Motivation for Macros: Numerical Integration to Determine P(0<Z<1.645)

Processing Macros

Macro Variables, Parameters, and Functions

Conditional Execution, Looping, and Macros

Saving Macros

Functions and Routines for Macros

Case Study: Macro for constructing training and test data set for Model Comparison

Case Study: Processing Multiple Data Sets

Exercises

6. Customizing Output and Generating Data Visualizations

Using the Output Delivery System

Graphics in SAS

ODS Statistical Graphics

Modifying Graphics Using the ODS Graphics Editor

Graphing with Styles and Templates

Statistical Graphics—Entering the Land of SG Procedures

Case Study: Using the SG Procedures

Enhancing SG displays – options with SG procedure statements

Using Annotate Data Sets to enhance SG displays

Using Attribute Maps to enhance SG displays

Exercises

7. Processing Text

Cleaning and Processing Text Data

Starting with Character Functions

Processing Text

Case Study: Sentiment in State of the Union addresses

Case Study: Reading Text from a Web Page

Regular Expressions

Case Study (revisited) – Applying Regular Expressions

Exercises

8. Programming with Matrices and Vectors

Defining a Matrix and Subscripting

Using Diagonal Matrices and Stacking Matrices

Using Elementwise Operations, Repeating, and Multiplying Matrices

Importing a Data Set into SAS/IML and Exporting Matrices from SAS/IML to a Data Set

Case Study 1: Monte Carlo Integration to Estimate π

Case Study 2: Bisection Root Finder

Case Study 3: Randomization Test Using Matrices Imported from PROC PLAN

Case Study 4: SAS/IML Module to Implement Monte Carlo Integration to Estimate π

Storing and loading SAS/IML modules

SAS/IML and R

Exercises

References

About the Author

A. John Bailer, PhD, PStat® is university distinguished professor and founding chair of the Department of Statistics and affiliate member of the Departments of Biology and Sociology and Gerontology as well as the Institute for the Environment and Sustainability at Miami University in Oxford, Ohio. Starting in August 2019, he will be President of the International Statistical Institute. He is a Fellow of the American Statistical Association, the Society for Risk Analysis, and the American Association for the Advancement of Science. His research has focused on quantitative risk estimation, and he has collaborated on problems in toxicology, environmental health, and occupational safety. He received the E. Phillips Knox Distinguished Teaching Award in 2018 after previously receiving the Distinguished Teaching Award for Excellence in Graduate Instruction and Mentoring and the College of Arts and Science Distinguished Teaching Award. He is also a co-founder of and continuing panelist on the Stats+Stories podcast.

Subject Categories

BISAC Subject Codes/Headings:
BUS061000
BUSINESS & ECONOMICS / Statistics
MAT029000
MATHEMATICS / Probability & Statistics / General