Multivariate Data Integration Using R
Methods and Applications with the mixOmics Package
- Available for pre-order. Item will ship after November 9, 2021
Large biological data, which are often noisy and high-dimensional, have become increasingly prevalent in biology and medicine. There is a real need for good training in statistics, from data exploration through to analysis and interpretation. This book provides an overview of statistical and dimension reduction methods for high-throughput biological data, with a specific focus on data integration. It starts with some biological background and the key concepts underlying the multivariate methods, then covers an array of methods implemented using the mixOmics package in R.
- Provides a broad and accessible overview of methods for multi-omics data integration
- Covers a wide range of multivariate methods, each designed to answer specific biological questions
- Includes comprehensive visualisation techniques to aid in data interpretation
- Includes many worked examples and case studies using real data
- Includes reproducible R code for each multivariate method, using the mixOmics package
The book is suitable for researchers from a wide range of scientific disciplines wishing to apply these methods to obtain new and deeper insights into biological mechanisms and biomedical problems. The suite of tools introduced in this book will enable students and scientists to work at the interface between, and provide critical collaborative expertise to, biologists, bioinformaticians, statisticians and clinicians.
Table of Contents
I Modern biology and multivariate analysis
1. Multi-omics and biological systems
2. The cycle of analysis
3. Key multivariate concepts and dimension reduction in mixOmics
4. Choose the right method for the right question in mixOmics
II mixOmics under the hood
5. Projection to Latent Structures
6. Visualisation for data integration
7. Performance assessment in multivariate analyses
III mixOmics in action
8. mixOmics: get started
9. Principal Component Analysis (PCA)
10. Projection to Latent Structure (PLS)
11. Canonical Correlation Analysis (CCA)
12. PLS - Discriminant Analysis (PLS-DA)
13. N-data integration
14. P-data integration
15. Appendix: Glossary of Terms
Dr Kim-Anh Lê Cao develops novel methods, software and tools to interpret big biological data and answer research questions efficiently. She is committed to statistical education to instill best analytical practice, has taught numerous statistical workshops for biologists, and leads collaborative projects in medicine, fundamental biology and microbiology. She has a mathematical engineering background and graduated with a PhD in Statistics from the Université de Toulouse, France. She then moved to Australia, first as a biostatistician consultant at QFAB Bioinformatics, then as a research group leader at the biomedical University of Queensland Diamantina Institute. She is currently Associate Professor in Statistical Genomics at the University of Melbourne. In 2019, Kim-Anh received the Australian Academy of Science’s Moran Medal for her contributions to Applied Statistics in multidisciplinary collaborations. She has been part of leadership programs for women in STEMM, including the international Homeward Bound program, which culminated in a trip to Antarctica, and Superstars of STEM from Science & Technology Australia.
Zoe Welham completed a BSc in molecular biology and during this time developed a keen interest in the analysis of big data. She completed a Masters of Bioinformatics with a focus on the statistical integration of different omics data in bowel cancer. She is currently a PhD candidate at the Kolling Institute in Sydney where she is furthering her research into bowel cancer with a focus on integrating microbiome data with other omics to characterise early bowel polyps. Her research interests include bioinformatics and biostatistics for many areas of biology and disseminating that information to the general public through reader-friendly writing.
"This book was eagerly awaited both to bring together numerous research works published in recent years and to support the use of the Mixomics software which has become an essential tool for data integration and exploration when dealing with multiple types of high-dimensional biological data. It is the result of many years of research on cutting-edge developments in this domain as for sparsity. The book is very pleasant to read and well-structured around the different multivariate approaches. It is well documented with many recent references on the statistical methods and is very didactic through numerous examples accompanied by R codes and illustrations. It can be used by a large audience of statisticians and biologists to process, analyze, visualize, and interpret their multivariate microbiome and multi-omics data, but also as a basis for a course. I highly recommend this book."
- Philippe Bastien, Senior Research Associate - L'Oréal R&I