R for Political Data Science: A Practical Guide is a handbook for political scientists new to R who want to learn the most useful and common ways to interpret and analyze political data. It was written by political scientists with the many real-world problems of their work in mind. The book has 16 chapters organized in three sections. The first, on the use of R, is for users who are learning R or migrating from other software. The second, on econometric models, covers OLS, binary and survival models, panel data, and causal inference. The third is a data science toolbox of some of the most useful tools in the discipline: data imputation, fuzzy merge of large datasets, web mining, quantitative text analysis, network analysis, mapping, spatial cluster analysis, and principal component analysis.
- Each chapter presents the simplest, most up-to-date option available for each task, assuming minimal prerequisites and no previous experience in R
- Makes extensive use of the Tidyverse, the group of packages that has revolutionized the use of R
- Provides a step-by-step guide that you can replicate using your own data
- Includes exercises in every chapter for course use or self-study
- Focuses on practical approaches to statistical inference rather than mathematical formulae
- Supplemented by an R package, including all data
As the title suggests, this book is highly applied in nature and is designed as a toolbox for the reader. It can be used in methods and data science courses at both the undergraduate and graduate levels. It will be equally useful for a university student pursuing a PhD, a political consultant, or a public official, all of whom need to transform their datasets into substantive and easily interpretable conclusions.
Table of Contents
I Introduction to R
1. Basic R
2. Data Management
3. Data Visualization
4. Data Loading
Soledad Araya and Andrés Cruz
5. Linear Models
Inés Fynn and Lihuen Nocetto
6. Case Selection Based on Regressions
Inés Fynn and Lihuen Nocetto
7. Panel Data
8. Logistic Models
9. Survival Models
10. Causal Inference
11. Advanced Political Data Management
Andrés Cruz and Francisco Urdinez
12. Web Mining
13. Quantitative Text Analysis
15. Principal Component Analysis
Caterina Labrín and Francisco Urdinez
16. Maps and Spatial Data
Andrea Escobar and Gabriel Ortiz
This book is edited by Francisco Urdinez, Assistant Professor at the Institute of Political Science of the Pontifical Catholic University of Chile, and Andrés Cruz, Adjunct Instructor at the same institution. Most of the authors who contributed chapters to this volume are political scientists affiliated with the Institute of Political Science of the Pontifical Catholic University of Chile, and many are researchers and collaborators of the Millennium Data Foundation Institute, an institution that aims to gather, clean, and analyze public data to support public policy. Andrew Heiss, who is affiliated with the Andrew Young School of Policy Studies at Georgia State University, joined the project by contributing a chapter on causal inference. Above all, the authors are all keen users of R.
"Urdinez and Cruz provide a thorough and pedagogically sound introduction to working with political science data in R, complete with modern R code to reproduce every figure and analysis presented. The breadth of statistics and data science methods presented in the book is impressive. The datasets used in examples are real, contemporary, and engaging, which makes the book accessible to anyone interested in quantitative approaches in political science."
- Mine Çetinkaya-Rundel, University of Edinburgh, Duke University, and RStudio.
"This book is a great resource for students learning methods as well as for researchers migrating to R. The volume introduces a wide range of topics, including foundations of R, conventional statistical models, text analysis, networks, maps, and web mining. And there is more! The examples based on Latin America make the book substantively interesting and enjoyable."
- Aníbal Pérez-Liñán, University of Notre Dame
"As others who lacked the capacity to work in R, I was lagging behind regarding my capacity to produce cutting edge empirical analyses for my research. This textbook and its applied pedagogy and examples, significantly reduced the costs of catching up. I highly recommend it, both as a textbook and as a guideline for anyone interested in learning R on their own."
- Juan Pablo Luna, Pontificia Universidad Católica de Chile
"With its tutorial approach, R for Political Data Science builds readers’ R literacy without assuming any prior experience with the language. By the end, your practical political data science toolkit will be well-stocked, you will be more motivated to take the next step and study the mathematical underpinnings of the methods discussed throughout, and using R professionally will no longer feel like a pipe dream (pun intended!)."
- Santiago Olivella, University of North Carolina – Chapel Hill
"If you have a background in Political Science, this is THE BOOK you need to start your journey into R. Using up-to-date tools, this book guides you step-by-step through the process of translating data analysis into political questions. R for Political Data Science not only covers a wide range of techniques and R packages, but also uses Latin American datasets that make the topics covered interesting for a broader audience."
- Riva Quiroga, co-founder of R-Ladies Santiago and R-Ladies Valparaíso, editor of The Programming Historian and chair of the Latin-R Conference
"The monograph belongs to The R Series, and presents a reference textbook on R language with a semester course on statistics with application to estimations on real political data...Each chapter suggests references on the recent sources, exercises, and links to numerous websites with data, packages and other R facilities. The book is convenient as a textbook for students, and is equally helpful for researcher and practitioners. The main material in the book consists of R codes, that supplies the readers with amazingly useful tools of modeling not only in political but in a wider area of applied social and other sciences, wherever the statistical analysis is required."
- Stan Lipovetsky, Technometrics, April 2021