In biological research, the amount of data available to researchers has increased so much over recent years that it is becoming increasingly difficult to understand the current state of the art without some experience and understanding of data analytics and bioinformatics. An Introduction to Bioinformatics with R: A Practical Guide for Biologists leads the reader through the basics of computational analysis of data encountered in modern biological research. No previous experience with statistics or programming is required: readers will develop the ability to plan suitable analyses of biological datasets and to use the R programming environment to perform them. This is achieved through a series of case studies in which R is used to answer research questions with molecular biology datasets. Broadly applicable statistical methods are explained, including linear and rank-based correlation, distance metrics and hierarchical clustering, hypothesis testing using linear regression, proportional hazards regression for survival data, and principal component analysis. These methods are then applied as appropriate throughout the case studies, illustrating how they can be used to answer research questions.
· Provides a practical course in computational data analysis suitable for students or researchers with no previous exposure to computer programming.
· Describes in detail the theoretical basis for statistical analysis techniques used throughout the textbook, from basic principles.
· Presents walk-throughs of data analysis tasks using R and example datasets. All R commands are presented and explained in order to enable the reader to carry out these tasks themselves.
· Uses outputs from a large range of molecular biology platforms including DNA methylation and genotyping microarrays; RNA-seq, genome sequencing, ChIP-seq and bisulphite sequencing; and high-throughput phenotypic screens.
· Gives worked-out examples geared towards problems encountered in cancer research, which can also be applied across many areas of molecular biology and medical research.
This book has been developed over years of training biological scientists and clinicians to analyse the large datasets available in their cancer research projects. It is appropriate for use as a textbook or as a practical book for biological scientists looking to gain bioinformatics skills.
Table of Contents
Introduction to R
An Introduction to LINUX for Biological Research
Statistical Methods for Data Analysis
Analyzing Generic Tabular Numeric Datasets in R
Functional Enrichment Analysis
Integrating Multiple Datasets in R
Analyzing Microarray Data in R
Analyzing DNA Methylation Microarray Data in R
DNA Analysis With Microarrays
Working with Sequencing Data
Genomic Sequence Profiling
Ed Curry initially studied computer science (Cambridge) and AI with a systems biology specialism (Edinburgh) before embarking on a PhD in computer-based molecular biology, studying stem cell differentiation at the Centre for Regenerative Medicine in Edinburgh. He spent 10 years in the Faculty of Medicine at Imperial College London, during which time he established a research group focusing on interactions between the genetic, epigenetic and transcriptional state of cancer cells during carcinogenesis and the acquisition of drug resistance. He has extensive teaching experience as a lecturer, examiner and course director, including co-founding Imperial College’s Cancer Informatics MRes program and the Genetics & Genomics module for the BSc in Medical Biosciences. He joined GSK R&D in October 2019, remaining an honorary lecturer at Imperial College.