In computational science, reproducibility requires that researchers make code and data available to others so that the data can be analyzed in a manner similar to that of the original publication. Code must be available for distribution, data must be accessible in a readable format, and a platform must be available for widely distributing the data and code. In addition, both data and code need to be licensed permissively enough that others can reproduce the work without a substantial legal burden.
Implementing Reproducible Research covers many of the elements necessary for conducting and distributing reproducible research. It explains how to accurately reproduce a scientific result.
Divided into three parts, the book discusses the tools, practices, and dissemination platforms for ensuring reproducibility in computational science. It describes:
- Computational tools, such as Sweave, knitr, VisTrails, Sumatra, CDE, and the Declaratron system
- Open source practices, good programming practices, trends in open science, and the role of cloud computing in reproducible research
- Software and methodological platforms, including open source software packages, RunMyCode platform, and open access journals
Each part presents contributions from leaders who have developed software and other products that have advanced the field. Supplementary material is available at www.ImplementingRR.org.
Table of Contents
Tools
knitr: A Comprehensive Tool for Reproducible Research in R, by Yihui Xie
Reproducibility Using VisTrails, by Juliana Freire, David Koop, Fernando Chirigati, and Cláudio T. Silva
Sumatra: A Toolkit for Reproducible Research, by Andrew P. Davison, Michele Mattioni, Dmitry Samarkanov, and Bartosz Teleńczuk
CDE: Automatically Package and Reproduce Computational Experiments, by Philip J. Guo
Reproducible Physical Science and the Declaratron, by Peter Murray-Rust and Dave Murray-Rust
Practices and Guidelines
Developing Open-Source Scientific Practice, by K. Jarrod Millman and Fernando Pérez
Reproducible Bioinformatics Research for Biologists, by Likit Preeyanon, Alexis Black Pyrkosz, and C. Titus Brown
Reproducible Research for Large-Scale Data Analysis, by Holger Hoefling and Anthony Rossini
Practicing Open Science, by Luis Ibanez, William J. Schroeder, and Marcus D. Hanwell
Reproducibility, Virtual Appliances, and Cloud Computing, by Bill Howe
The Reproducibility Project: A Model of Large-Scale Collaboration for Empirical Research on Reproducibility, by the Open Science Collaboration
What Computational Scientists Need to Know about Intellectual Property Law: A Primer, by Victoria Stodden
Platforms
Open Science in Machine Learning, by Mikio L. Braun and Cheng Soon Ong
RunMyCode.org: A Research-Reproducibility Tool for Computational Sciences, by Christophe Hurlin, Christophe Pérignon, and Victoria Stodden
Open Science and the Role of Publishers in Reproducible Research, by Iain Hrynaszkiewicz, Peter Li, and Scott Edmunds
"This collection brings together the expertise and experience of numerous authors and is likely to be valuable to scientists and statisticians alike. … This book should have broad appeal … introduces some extremely useful tools and practices from leaders in the field. On top of that, it also contains an exciting vision for the future of scientific research. … The challenge of reproducibility in the computational era is being confronted across the sciences, with each field developing its own tools and best practices. This book is an important step in bringing together a broad group of scientists to share what has been learned."
—Journal of the American Statistical Association, June 2015
"The book as a whole has something for everybody and provides an interesting snapshot of the available tools, platforms, and good practices for researchers as the scientific community aims to be more self-correcting."
—Journal of Statistical Software, October 2014
"Three recent books have significantly influenced how I use R in reproducible work: Dynamic Documents with R and knitr by Yihui Xie, Reproducible Research with R and RStudio by Christopher Gandrud, and Implementing Reproducible Research edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng … I recommend all three books to R users at any level. There really is something here for everyone."
—Richard Layton, PhD, PE, Rose-Hulman Institute of Technology, Terre Haute, Indiana, USA
"In total, this book provides information on almost all aspects of reproducible research in the open science environment … I would recommend this book to anybody who wants to learn more about reproducible research in the context of open science."