Data Warehousing for Biomedical Informatics

1st Edition

By Richard E. Biehl

Auerbach Publications

633 pages | 141 B/W Illus.

Purchasing Options:
Hardback: 9781482215212
pub: 2015-11-19
eBook (VitalSource): 9780429091001
pub: 2016-01-13


Data Warehousing for Biomedical Informatics is a step-by-step guide to designing and building an enterprise-wide data warehouse for a biomedical or healthcare institution, using a four-iteration lifecycle and a standardized design pattern. It enables you to quickly implement a fully scalable, generic data architecture that supports your organization’s clinical, operational, administrative, financial, and research data. By following the guidelines in this book, you will be able to progress through the Alpha, Beta, and Gamma versions and fully implement your first production release in about a year.

The Alpha version implements just enough of the basic design pattern to illustrate its core capabilities, loading only a small sample of data for demonstration purposes. This gives everyone involved an easy way to visualize the new warehouse paradigm by examining a working core subset of the system. You can finish the Alpha version, also referred to as the proof of concept, in as little as 3-4 weeks.

The Beta version, which can be completed in about 2-3 months, adds required functionality and much more data. It lets you get the full warehouse up and running quickly to facilitate longer-term planning, training of users and the support team, and setup of the operational environment. The Gamma version, a fully functional system that still lacks data, can be implemented in about 3-4 months. About one year after starting, you will be ready to launch Release 1.0 as a complete and secure data warehouse.

Table of Contents

Biomedical Data Warehousing

Nature of Biomedical Data

Nature of Warehoused Data

Business Requirements

Functional Requirements

Never-Finished Warehouse

Organizational Readiness

Implementation Strategy


Dimensional Data Modeling

Evolution of Data Warehouses

The Star Schema

Transposing Dimensional Schema

Anticipating Dimensions

Affinity Analysis

Understanding Source Data

Implicit versus Explicit Data

Semantic Layers

Information Artifacts

Biomedical Context

Clinical Picture

Ontological Levels

Epistemological Levels


Biomedical Warehouse

Biomedical Star

Biomedical Facts

Master Dimensions

Reference Dimensions

Almanac Dimensions

Analysis Dimensions

Control Dimensions

Requirements Alignment

Star Dimension Design Pattern

Structure of a Dimension

Master Data: Definition Tables

Slowly Changing Dimensions

Source Keys: Context and Reference Tables

Fact Participation: Group and Bridge Tables

Interconnections: Hierarchy Tables

Connecting to Facts

Dimension Navigation

Loading Alpha Version

Throw-Away Code

Selecting and Preparing Sources

Generating Surrogate Keys

Simple Dimensions and Facts

Recap of Simple ETLs

Complicated Dimensions and Facts

Finalizing Alpha Structures

V&V of Alpha Version


Completing the Design

Unit of Measure

Metadata Mappings

Control Dimensions

Reinitializing the Warehouse

Data Sourcing

Source Mapping Challenges

Dimensionalizing Facts

Sourcing Your Data

Generalizing ETL Workflows

Standardizing Source Data

Source Data Intake Jobs

SDI Design Pattern

Source Data Consolidation

External versus Internal Sourcing

Single Point of Function

ETL "Pipes"

Metadata Transformation

Data Control Pipe

Wide versus Deep Data

ETL Reference Pipe

Metadata Transformation

Reference Composite

Resolve References

Unresolved References

Reference Entries

Alias Entries

Bridges and Groups

Hierarchy Entries

Fiat Hierarchies

Natural Hierarchies

ETL Definition Pipe

Processing Complexities

Example Master Loads

Insert New Definitions

New Orphans

Orphan Auto-Adoption

Definition Change Processing

Building SCD Transaction Sets

Applying Transactions to Dimensions

Performance Concerns

ETL Fact Pipe

Metadata Transformation

Bridges and Groups

Build Facts

Finalize Dimensions

Set Control Dimensions

Insert Fact Values

Superseding Facts

Finalizing Beta

Audit Trail Facts

Datafeed Dimension

Verification and Validation

Preparing for Gamma


Finalizing ETL Workflows

Alternatively Sourced Keys

Sourced Metadata

Standard Data Editing

Value-Level UOM

Undetermined Dimensionality

ETL Transactions

Target States

Superseded Facts

Continuous Functional Evolution

Establishing Data Controls

Finalizing Warehouse Design

Redaction Control Settings

Data Monitoring

Surrogate Merges

Security Controls

Implementing Dataset Controls

Warehouse Support Team

Building out the Data

Minimize Data Seams

Shifting toward Metrics

Populating Metric Values

Populating Control Values

Populating Displays

Delivering Data

Warehousing Use Cases

Privacy-Oriented Usage Profiles

Metadata Browsing

Cohort Identification

Fact Count Queries

Timeline Generation

Business Intelligence

Alternative Data Views

Finalizing Gamma

Business Requirements

Technical Challenges

Functional Challenges

Going Live


Knowledge Synthesis

Fact Counts

Derivative Data

Timeline Analysis

Statistical Analyses

Statistical Process Control

Semantic Annotation

Data Governance

Organizing for Governance

Governance Opportunities


About the Author

Richard E. Biehl is an information technology consultant with 37 years of experience, specializing in logical and physical data architectures, quality management, and strategic planning for the application of information technology. His research interests include semantic interoperability in biomedical data and the integration of chaos and complexity theories into the systems engineering of healthcare. Dr. Biehl holds a PhD in applied management and decision science and an MS in educational change and technology innovation from Walden University, Minneapolis, Minnesota. He is certified as a Six Sigma Black Belt (CSSBB) and a Software Quality Engineer (CSQE) by the American Society for Quality (ASQ), Milwaukee, Wisconsin. Dr. Biehl is a visiting instructor at the University of Central Florida (UCF), Orlando, Florida, in the College of Engineering and Computer Science (CECS), where he teaches quality and systems engineering in the Industrial Engineering and Management Systems (IEMS) Department.

Subject Categories

BISAC Subject Codes/Headings:
BUSINESS & ECONOMICS / Industries / Service Industries
COMPUTERS / Information Technology
MEDICAL / Administration
SCIENCE / Biotechnology