1st Edition

# Regression and Machine Learning for Education Sciences Using R

This book provides a conceptual introduction to regression and machine learning and their applications in education research. It discusses the diverse uses of these methods, including predicting future events from current data and explaining why certain phenomena occur. The important predictors identified through these analyses provide data-based evidence for educational and psychological decision-making.

Offering an applications-oriented approach while mapping out fundamental methodological developments, this book lays a sound foundation for understanding essential regression and machine learning concepts for data analytics. The first part of the book discusses regression analysis and provides a sturdy foundation for understanding the logic of machine learning. In each chapter, statistical concepts and data-analytic techniques are developed from an applied perspective, with the statistical results, obtained using R, providing insight into decisions and solutions to problems. Based on practical examples and written in a concise and accessible style, the book is learner-centric and does a remarkable job of breaking down complex concepts.

*Regression and Machine Learning for Education Sciences Using R* is primarily for students and practitioners in education and psychology, although readers from related disciplines may also find the book beneficial. The datasets and examples used in the book are drawn from educational settings, and students will find that this text provides good preparation for studying more advanced statistical and data-analytic material.

*A brief introduction to R and RStudio*

**Part 1: Regression models: Foundation of machine learning**

Chapter 01: First thing first: Simple regression

1.1. Introduction

1.2. An example

1.3. What is the regression model

1.4. How to interpret the regression model

1.5. What are the sum of squares and r² in the regression model

1.6. What are the predicted values and the residuals?

1.7. How to estimate regression line and what method is used?

1.8. Inference about regression coefficients

1.9. Regression with categorical independent variable

1.10. Summary

Hands-on practice

Chapter 02: Beyond simple: Multiple regression analysis

2.1. Introduction

2.2. An example

2.3. What is a multiple regression model

2.4. How to interpret the results from multiple regression analysis

2.5. Assessing the importance of multiple independent variables

2.6. Recap on categorical independent variables

2.7. How the multiple regression model is estimated

2.8. Summary

Hands-on practice

Chapter 03: It takes two to tangle: Regression with interaction

3.1. Introduction

3.2. An example

3.3. The difference between regression model with and without interaction

3.4. The meaning of βi associated with an interaction term

3.5. Interpretation of interaction

3.6. Summary

Hands-on practice

Chapter 04: Are we thinking correctly: Checking assumptions of regression model

4.1. Introduction

4.2. What are the assumptions of the regression model

4.3. How to check the assumptions

4.4. Summary

Hands-on practice

Chapter 05: I am not straight but robust: Curvilinear, robust, and quantile regression

5.1. An example

5.2. What is curvilinear regression?

5.3. Piecewise regression

5.4. Robust regression

5.5. Quantile regression

5.6. Summary

Hands-on practice

Chapter 06: Predicting the class probability: Logistic regression

6.1. An example

6.2. What is logistic regression

6.3. Interpreting the results from the logistic regression

6.4. The logistic regression model with interaction

6.5. Multinomial logistic regression

6.6. Assumptions of the logistic regression model

6.7. Summary

Hands-on practice

**Part 2: Machine learning: Classification and predictive modeling**

Chapter 07: Introduction to machine learning

7.1. Big data, data science, and data mining

7.2. What is machine learning

7.3. Data preprocessing: A critical step in machine learning

7.4. Machine learning algorithms

7.5. Data splitting for validation

7.6. Summary

Chapter 08: Machine learning algorithms and process

8.1. Introduction to caret package

8.2. Steps in performing machine learning

8.2.1. Detailed discussion of each step of machine learning

8.3. Summary

Chapter 09: Let me regulate: Regularized machine learning

9.1. Data preprocessing

9.2. Linear regression using machine learning

9.3. Lasso, ridge, and elastic net regression models

9.4. Multivariate adaptive regression spline

9.5. Regression tree

9.6. Summary

Hands-on practice

Chapter 10: Finding ways in the forest: Prediction with random forest

10.1. Random forest

10.2. Basic principles

10.3. Randomization

10.4. Single tree with CART

10.5. Bagging

10.6. Tuning parameters

10.7. Variable importance

10.8. Example

10.9. Adaptive boosting (AdaBoost) with decision trees

10.10. Gradient boosting with decision trees

10.11. Summary

Hands-on practice

Chapter 11: I can divide better: Classification with support vector machine

11.1. What is a support vector machine

11.2. Tuning parameters

11.3. Multiclass classification

11.4. Estimated class probabilities

11.5. Other classification methods

11.6. An example of SVM classification

11.7. An example of SVM regression

11.8. Summary

Hands-on practice

Chapter 12: Work like a human brain: Artificial neural network

12.1. What are artificial neural networks?

12.2. Types of artificial neural networks

12.3. Single-layer feedforward neural network

12.4. Multilayer feedforward neural networks

12.5. Recurrent neural networks

12.6. An example

12.7. Summary

Hands-on practice

Chapter 13: Desire to find causal relations: Bayesian network

13.1. Bayesian network and causal discovery

13.2. Construction of Bayesian network

13.3. Example

13.4. Summary

Hands-on practice

Chapter 14: We want to see the relationships: Multivariate data visualization

14.1. Commonly used data visualization methods

14.2. Multidimensional scaling visual method for classification

14.3. Example

14.4. Summary

Hands-on practice

### Biography

**Cody Dingsen** is a professor in the Department of Educational Sciences & Professional Programs at the University of Missouri-St. Louis. His research interests include multidimensional scaling models for change and preference, psychometrics, data science, cognition and learning, emotional development, and biopsychosocial development.