Introduction to Bio-Ontologies

By Peter N. Robinson, Sebastian Bauer

© 2011 – Chapman and Hall/CRC

517 pages | 89 B/W Illus.

Purchasing Options:
Hardback: ISBN 9781439836651
Published: 2011-06-21
Price: US$87.95

About the Book

Introduction to Bio-Ontologies explores the computational background of ontologies. Emphasizing computational and algorithmic issues surrounding bio-ontologies, this self-contained text helps readers understand ontological algorithms and their applications.

The first part of the book defines ontology and bio-ontologies. It also explains the importance of mathematical logic for understanding concepts of inference in bio-ontologies, discusses the probability and statistics topics necessary for understanding ontology algorithms, and describes ontology languages, including OBO (the preeminent language for bio-ontologies), RDF, RDFS, and OWL.

The second part covers significant bio-ontologies and their applications. The book presents the Gene Ontology; upper-level ontologies, such as the Basic Formal Ontology and the Relation Ontology; and current bio-ontologies, including several anatomy ontologies, Chemical Entities of Biological Interest, Sequence Ontology, Mammalian Phenotype Ontology, and Human Phenotype Ontology.

The third part of the text introduces the major graph-based algorithms for bio-ontologies. The authors discuss how these algorithms are used in overrepresentation analysis, model-based procedures, semantic similarity analysis, and Bayesian networks for molecular biology and biomedical applications.

With a focus on computational reasoning topics, the final part describes the ontology languages of the Semantic Web and their applications for inference. It covers the formal semantics of RDF and RDFS, OWL inference rules, a key inference algorithm, the SPARQL query language, and the state of the art for querying OWL ontologies.

Web Resource

Software and data designed to complement material in the text are available on the book’s website. The site provides the R Robo package developed for the book, along with a compressed archive of data and ontology files used in some of the exercises. It also offers teaching/presentation slides and links to other relevant websites.

This book provides readers with the foundation to use ontologies as a starting point for new bioinformatics research projects or to support current molecular genetics research projects. By supplying a self-contained introduction to OBO ontologies and the Semantic Web, it bridges the gap between the two fields and helps readers see what each can contribute to the analysis and understanding of biomedical data.


"This book is one of the first source books in the field; it is well written and coherent. Its introduction gives the reader a good taste of what comes next and it also contains good exercises."

—Mohsen Mahmoudi Aznaveh, ACM SIGACT News, 2013

"This welcome book could have been titled ‘all you wanted to know about bio-ontologies but didn’t dare ask.’ In recent years the biological sciences have generated very large, complex data sets whose management, analysis and sharing have created unprecedented challenges. The development of ontologies, originally driven by the invention of the semantic web, has been critical in handling this data and permitting interoperability between databases and between applications. Many of the bio-ontologies and the computational approaches which use them have now become mature, and an understanding of bio-ontologies has really become a requirement for anyone in the mainstream biomedical sciences.

Introduction to Bio-Ontologies provides a self-contained introduction to ontologies for bioinformaticians, computer scientists and biomedical scientists who need to know about the computational background and implementation of ontologies. The book is designed to support either advanced undergraduate or master’s courses in bioinformatics or computer science but is also a first stop for any investigator who wants to understand ontologies and how to use them.

The four parts of the book cover basic concepts, specific widely used ontologies, such as the Gene Ontology, algorithms and applications of ontologies. The breadth of coverage is impressive for such a compact volume and there is excellent critical discussion of ontologies from a biological as well as a computational point of view. The book succeeds well in its aim of providing a self-contained primer on ontologies and much of the mathematics used is backed up with detailed explanations and technical appendices which introduce and explain the more complex mathematical and logical concepts, such as inference and information content. Practical exercises are provided and these are very valuable for using the book as a teaching tool.

Well written, up to date and accessible this is an excellent addition to the bookshelves of any lab and could be the core text for a course on ontologies."

—Dr. Paul Schofield, Senior Lecturer in Anatomy, Department of Physiology, Development and Neuroscience, University of Cambridge, UK

"This excellent book provides a clear and objective introduction to the subject, and provides an extensive and detailed overview of state-of-the-art research towards a more efficient exploitation of the existing and newly generated biomedical data by using bio-ontologies. As a professor I intend to adopt this book for graduate course-units on bioinformatics and as a researcher I intend to use this book as a way to introduce me to state-of-the-art approaches related to my research interests."

—Francisco M. Couto, University of Lisbon, Portugal

Table of Contents


Ontologies and Applications of Ontologies in Biomedicine

What Is an Ontology?

Ontologies and Bio-Ontologies

Ontologies for Data Organization, Integration, and Searching

Computer Reasoning with Ontologies

Typical Applications of Bio-Ontologies

Mathematical Logic and Inference

Representation and Logic

Propositional Logic

First-Order Logic


Description Logic

Probability Theory and Statistics for Bio-Ontologies

Probability Theory

Bayes’ Theorem

Introduction to Graphs

Bayesian Networks

Ontology Languages



OWL and the Semantic Web


The Gene Ontology

A Tool for the Unification of Biology

Three Subontologies

Relations in GO

GO Annotations

GO Slims

Upper-Level Ontologies

Basic Formal Ontology

The Big Divide: Continuants and Occurrents

Universals and Particulars

Relation Ontology

Revisiting Gene Ontology

Revisiting GO Annotations

A Selective Survey of Bio-Ontologies

OBO Foundry

The National Center for Biomedical Ontology


What Makes a Good Ontology?


Overrepresentation Analysis



Multiple Testing Problem

Term-for-Term Analysis: An Extended Example

Inferred Annotations Lead to Statistical Dependencies in Ontology DAGs

Parent-Child Algorithms

Parent-Child Analysis: An Extended Example

Topology-Based Algorithms

Topology-elim: An Extended Example

Other Approaches


Model-Based Approaches to GO Analysis

A Probabilistic Generative Model for GO Enrichment Analysis

A Bayesian Network Model

MGSA: An Extended Example


Semantic Similarity

Information Content in Ontologies

Semantic Similarity of Genes and Other Items Annotated by Ontology Terms

Statistical Significance of Semantic Similarity Scores

Frequency-Aware Bayesian Network Searches in Attribute Ontologies

Modeling Queries

Probabilistic Inference for the Items

Parameter-Augmented Network

The Frequency-Aware Network



Inference in the Gene Ontology

Inference over GO Edges

Cross-Products and Logical Definitions

RDFS Semantics and Inference



RDF Entailment

RDFS Entailment

Entailment Rules


Inference in OWL Ontologies

The Semantics of Equality

The Semantics of Properties

The Semantics of Classes

The Semantics of the Schema Vocabulary


Algorithmic Foundations of Computational Inference

The Tableau Algorithm

Developer Libraries


SPARQL Queries

Combining RDF Graphs


Appendix A: An Overview of R

Appendix B: Information Content and Entropy

Appendix C: W3C Standards: XML, URIs, and RDF

Appendix D: W3C Standards: OWL



Exercises and Further Reading appear at the end of each chapter.

About the Authors

Peter N. Robinson is a research scientist and leader of the Computational Biology Group in the Institute of Medical Genetics and Human Genetics at Charité-Universitätsmedizin Berlin. Dr. Robinson completed his medical education at the University of Pennsylvania, followed by an internship at Yale University. He also studied mathematics and computer science at Columbia University. His research interests involve the use of mathematical and bioinformatics models to understand biology and hereditary disease.

Sebastian Bauer is a research assistant in the Institute of Medical Genetics and Human Genetics at Charité-Universitätsmedizin Berlin. He earned a degree in computer science from the Technical University of Ilmenau. His research interests include mathematical modeling, discrete algorithms, theoretical computer science, software engineering, and the applications of these fields to medicine and biology.

About the Series

Chapman & Hall/CRC Mathematical and Computational Biology


Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Database Management / Data Mining
SCIENCE / Life Sciences / Biology / General
SCIENCE / Biotechnology