Large biological data, which are often noisy and high-dimensional, have become increasingly prevalent in biology and medicine. There is a real need for good training in statistics, from data exploration through to analysis and interpretation. This book provides an overview of statistical and dimension reduction methods for high-throughput biological data, with a specific focus on data integration. It starts with some biological background and the key concepts underlying the multivariate methods, then covers an array of methods implemented using the mixOmics package in R.
- Provides a broad and accessible overview of methods for multi-omics data integration
- Covers a wide range of multivariate methods, each designed to answer specific biological questions
- Includes comprehensive visualisation techniques to aid in data interpretation
- Includes many worked examples and case studies using real data
- Includes reproducible R code for each multivariate method, using the mixOmics package
The book is suitable for researchers from a wide range of scientific disciplines wishing to apply these methods to obtain new and deeper insights into biological mechanisms and biomedical problems. The suite of tools introduced in this book will enable students and scientists to work at the interface between, and provide critical collaborative expertise to, biologists, bioinformaticians, statisticians and clinicians.
I Modern biology and multivariate analysis
1. Multi-omics and biological systems
2. The cycle of analysis
3. Key multivariate concepts and dimension reduction in mixOmics
4. Choose the right method for the right question in mixOmics
II mixOmics under the hood
5. Projection to Latent Structures
6. Visualisation for data integration
7. Performance assessment in multivariate analyses
III mixOmics in action
8. mixOmics: get started
9. Principal Component Analysis (PCA)
10. Projection to Latent Structure (PLS)
11. Canonical Correlation Analysis (CCA)
12. PLS - Discriminant Analysis (PLS-DA)
13. N-data integration
14. P-data integration
15. Glossary of Terms
"This book was eagerly awaited both to bring together numerous research works published in recent years and to support the use of the mixOmics software which has become an essential tool for data integration and exploration when dealing with multiple types of high-dimensional biological data. It is the result of many years of research on cutting-edge developments in this domain as for sparsity. The book is very pleasant to read and well-structured around the different multivariate approaches. It is well documented with many recent references on the statistical methods and is very didactic through numerous examples accompanied by R codes and illustrations. It can be used by a large audience of statisticians and biologists to process, analyze, visualize, and interpret their multivariate microbiome and multi-omics data, but also as a basis for a course. I highly recommend this book."
- Philippe Bastien, Senior Research Associate - L'Oréal R&I
"The book belongs to the Computational Biology Series and presents a wide spectrum of modern methods of multivariate statistical analysis, integration and high-dimension reduction for biological data evaluated via the specialized R package. The neologism Omic is used as a root related to constellations of objects with biological information, for instance, in genomes and proteins—genomics and proteomics (in studying proteins expressed by cells and tissues), metabolic and transcription products—metabolomics and transcriptomics (in studying messenger RNA molecules expressed from the genes of an organism), or also in economics—Reaganomics, etc.
[...] Numerous links to the internet websites related to the considered methods of multi-omics data integration are suggested, particularly, the mixOmics project is described at the link http://www.mixOmics.org, and the package is available at Install |mixOmics. The developed methods and software are suitable not only for biologists and bioinformaticians students and researchers, but can be useful for solving computational and content problems in many other fields as well."
"This is an excellent book for computational biologists, bioinformaticians, statisticians, data scientists, and graduate students who work with high-throughput omics data. The book covers most fundamental concepts of multi-omics data integration, while focusing on their implementations through hands-on examples implemented in the mixOmics R package."
- Yuehua Cui, Michigan State University, Biometrics, September 2022