This book covers several of the statistical concepts and data analytic skills needed to succeed in data-driven life science research. The authors proceed from relatively basic concepts related to computing p-values to advanced topics related to analyzing high-throughput data. They include the R code that performs this analysis and connect the lines of code to the statistical and mathematical concepts explained.
Table of Contents
Introduction. Getting started. Inference. Exploratory data analysis. Robust summaries. Matrix algebra. Linear models. Inference for high dimensional data. Statistical models. Distance and dimension reduction. Basic machine learning. Batch effects.
Rafael A. Irizarry is Professor of Applied Statistics at the Dana-Farber Cancer Institute and Harvard School of Public Health. In 2009 he was awarded the Presidents' Award by the Committee of Presidents of Statistical Societies (COPSS). His work has been highly cited and his open-source software tools widely downloaded.
Michael I. Love is a Postdoctoral Fellow at Harvard School of Public Health. He received his Ph.D. in computational biology in 2013 from the Freie Universität Berlin.
Professors Irizarry and Love have taught seven computational biology courses on edX to hundreds of thousands of students.
"In addition to the presentation of several strategies designed to handle multivariate data, the book’s strength lies in its immediate applicability. By including relevant datasets, the embedding of R code throughout, and in the open source nature of its production (it was written in R markdown), the book has encouraged reproducible research while connecting computer code to the relevant statistical concepts. Practitioners in the life sciences would seemingly be well served to use the book as a guide for their research. . . . The open-source nature of the book is a unique benefit, as it ensures that future versions can swiftly update to include new concepts, data, or coding techniques. . . . The book could also function as a textbook, particularly for a course in computational biology (either advanced undergraduate or introductory graduate)."
~The American Statistician, Reviews of Books and Teaching Materials
"Overall, I found that this book is excellent for researchers in the life sciences who are interested in retrieving, analyzing, and interpreting complex research data using sophisticated statistical methods and computing. The authors have effectively condensed broad and important topics into a single book. I highly recommend this book to anyone venturing into the exciting world of data analysis in many areas."