Chapman and Hall/CRC
264 pages | 50 B/W Illus.
Molecular biologists are performing increasingly large and complicated experiments, but often have little background in data analysis. This book teaches the statistical and computational techniques that molecular biologists need to analyze their data. It explains the big-picture concepts in data analysis using a wide variety of real-world molecular biological examples, such as eQTLs, ortholog identification, motif finding, inference of population structure, protein fold prediction, and many more. The book takes a pragmatic approach, focusing on techniques that are based on elegant mathematics yet are the simplest to explain to scientists with little background in computing and statistics.
Introduction. Statistical modeling. Statistics and probability. Multiple testing. Multivariate statistics and parameter estimation. Clustering. Distance-based. Gaussian mixture models. Simple linear regression. Multiple regression and generalized linear models. Regularization. Linear classification. Non-linear classification. Evaluating classifiers and ensemble methods. Correlated data in one dimension. Hidden-Markov models. Local regression.