
Open Source Data Warehousing and Business Intelligence

1st Edition

By Lakshman Bulusu

CRC Press

432 pages | 30 B/W Illus.

Purchasing Options:
New in Paperback: 9781138374225
pub: 2018-09-10
Hardback: 9781439816400
pub: 2012-08-06
eBook (VitalSource): 9780429093494
pub: 2012-08-06


Open Source Data Warehousing and Business Intelligence is an all-in-one reference for developing open source-based data warehousing (DW) and business intelligence (BI) solutions that are business-centric, cross-customer viable, cross-functional, cross-technology based, and enterprise-wide. Covering the entire lifecycle of an open source DW and BI implementation, the book ranges from basic concepts through to customization.

Highlighting the key differences between open source and vendor DW and BI technologies, the book identifies end-to-end solutions that are scalable, high-performance, and stable. It illustrates the practical aspects of implementing and using open source DW and BI technologies to supply you with valuable on-the-project experience that can help you improve implementation and productivity.

Emphasizing analysis, design, and programming, the text explains best-fit solutions as well as how to maximize ROI. Coverage includes data warehouse design, real-time processing, data integration, presentation services, and real-time reporting. With a focus on real-world applications, the author devotes an entire section to powerful implementation best practices that can help you build customer confidence while saving valuable time, effort, and resources.


"… As a practitioner himself, Mr. Bulusu is able to pinpoint the more critical aspects for consideration that a BI expert would truly appreciate. He does a great job of covering the subject thoroughly by first starting at a high-level, covering every aspect of BI and then breaking down the components to the most granular detail applying open source BI’s feasibility and validity as a viable solution. … Bulusu’s thorough, meticulous approach and in-depth examination of the subject, taken from a standpoint of an experienced author, educator, and technologist, has finally provided justice to this subject. Thank you, Mr. Bulusu. You have opened my eyes to open source BI."

—Rosendo Abellera, President and CEO, BIS3

"… coverage spans from basic concepts to customization. It identifies end-to-end solutions and illustrates the practical aspects of implementing and using open source DW and BI technologies. The text explains best-fit solutions and how to maximize ROI. The expert in the field, Lakshman Bulusu, in this book offers the best practices that will help you build customer confidence."

—NeoPopRealism Journal

Table of Contents


Data Warehousing and Business Intelligence: What, Why, How, When, When Not?

Taking IT Intelligence to Its Apex

Open Source DW and BI: Much Ado about Anything-to-Everything DW and BI, When Not, and Why So Much Ado?

Taking Business Intelligence to Its Apex: Intelligent Content for Insightful Intent

Data Warehousing and Business Intelligence: An Open Source Solution

What Is Open Source DW and BI, and How "Open" Is This Open?

What’s In, What’s Not: Available and Viable Options for Development and Deployment

Semantic Analytics

Testing for Optimizing Quality and Automation—Accelerated!

Business Rules, Real-World Perspective, Social Context

Personalization Through Customizable Measures

Leveraging the Cloud for Deployment

The Foundations Underneath: Architecture, Technologies, and Methodologies

Open Source versus Proprietary DW and BI Solutions: Key Differentiators and Integrators

Open Source DW and BI: Uses and Abuses

An Intelligent Query Accelerator Using an Open Cache In, Cache Out Design

Open Source DW & BI: Successful Players and Products

Open Source Data Warehousing and Business Intelligence Technology

Licensing Models Followed

Community versus Commercial Open Source

The Primary Vendors: Inventors and Presenters

Oracle: MySQL Vendor

PostgreSQL Vendor


Pentaho: Mondrian Vendor

Jedox: Palo Vendor

EnterpriseDB Vendor

Dynamo BI and Eigenbase: LucidDB Vendor

Greenplum Vendor

Hadoop Project



The Primary Products and Tools Set: Inclusions and Exclusions

Open Source Databases

Open Source Data Integration

Open Source Business Intelligence

Open Source Business Analytics

The Primary Users: User, End-User, Customer and Intelligent Customer



Mondrian Customers

Palo Customers

EnterpriseDB Customers

LucidDB Customers

Greenplum Customers

Talend Customers


Analysis, Evaluation, and Selection

Essential Criteria for Requirements Analysis of an Open Source DW and BI Solution

Key and Critical Deciding Factors in Selecting a Solution

The Selection-Action Preview

Raising Your BIQ: Five Things Your Company Can Do Now

Evaluation Criteria for Choosing a Vendor-Specific Platform and Solution

The Final Pick: An Information-Driven, Customer-Centric Solution, and a Best-of-Breed Product/Platform and Solution Convergence Key Indicator Checklist


Design and Architecture: Technologies and Methodologies by Dissection

The Primary Aspects of DW and BI from a Usability Perspective: Strategic BI, Pervasive BI, Operational BI, and BI On-Demand

Design and Architecture Considerations for the Primary BI Perspectives

The Case for Architecture as a Precedence Factor

Information-Centric, Business-Centric, and Customer-Centric Architecture: A Three-in-One Convergence, for Better or Worse

Open Source DW and BI Architecture

Pragmatics and Design Patterns


Why and How an Open Source Architecture Delivers a Better Enterprise-wide Solution

Open Source Data Architecture: Under the Hood

Open Source Data Warehouse Architecture: Under the Hood

Open Source BI Architecture: Under the Hood

The Vendor/Platform Product(s)/Tool(s) That Fit into the Open DW and BI Architecture

Information Integration, Usability and Management (Across Data Sources, Applications and Business Domains)

EDW: Models to Management

BI: Models to Interaction to Management to Strategic Business Decision Support (via Analytics and Visualization)

Best Practices: Use and Reuse

Operational BI and Open Source

Why a Separate Chapter on Operational BI and Open Source?

Operational BI by Dissection

Design and Architecture Considerations for Operational BI

Operational BI Data Architecture: Under the Hood

A Reusable Information Integration Model: From Real-Time to Right Time

Operational BI Architecture: Under the Hood

Fitting Open Source Vendor/Platform Product(s)/Tool(s) into the Operational BI Architecture

Talend Data Integration

expressor 3.0 Community Edition

Advanced Analytics Engines for Operational BI

Astera’s Centerprise Data Integration Platform

Actuate BIRT BI Platform

JasperSoft Enterprise

Pentaho Enterprise BI Suite

KNIME (Konstanz Information Miner)

Pervasive DataRush

Pervasive DataCloud2

Best Practices: Use and Reuse

Development and Deployment

Development Options, Dissected

Deployment Options, Dissected

Integration Options, Dissected

Multiple Sources, Multiple Dimensions

DW and BI Usability and Deployment: Best Solution versus Best-Fit Solution

Leveraging the Best-Fit Solution: Primary Considerations

Better, Faster, Easier as the Hitchhiker’s Rule

Dynamism and Flash—Real Output in Real Time in the Real World


Better Responsiveness, User Adoptability, and Transparency

Fitting the Vendor/Platform Product(s)/Tool(s): A Development and Deployment Standpoint

Best Practices: Use and Reuse

Best Practices for Data Management

Best Fit of Open Source in EDW Implementation

Best Practices for Using Open Source as a BI-Only Methodology for Data/Information Delivery

Mobile BI and Pervasive BI

Best Practices for the Data Lifecycle in a Typical EDW Lifecycle

Data Quality, Data Profiling, and Data Loss Prevention Components

The Data Integration Component

Best Practices for the Information Lifecycle as It Moves into the BI Lifecycle

The Data Analysis Component: The Dimensions of Data Analysis in Terms of Online Analytics vs. Predictive Analytics vs. Real-Time Analytics vs. Advanced Analytics

Data to Information Transformation and Presentation

Best Practices for Auditing Data Access, as It Makes Its Way via the EDW (and Directly Bypassing the EDW) to the BI Dashboard

Best Practices for Using XML in the Open Source EDW/BI Space

Best Practices for a Unified Information Integrity and Security Framework

Object to Relational Mapping: A Necessity or Just a Convenience?

Synchrony Maintenance

Dynamic Language Interoperability

Best Practices for Application Management

Using Open Source as an End-to-End Solution Option: How Best a Practice Is It?

Accelerating Application Development: Choice, Design, and Suitability Aspects

Visualization of Content: For Better or Best Fit

Best Practices for Autogenerating Code: A Codeless Alternative to Information Presentation

Automating Querying: Why and When

How Fine Is Fine-Grained? Drawing the Line between Representation of Data at the Lowest Level and a Best-Fit Metadata Design and Presentation

Best Practices for Application Integrity

Sharing Data between EDW and the BI Tiers: Isolation or a Tightrope Methodology

Breakthrough BI: Self-Serviceable BI via a Self-Adaptable Solution

Data-In, Data-Out Considerations: Data-to-Information I/O

Security Inside and Outside Enterprise Parameters: Best Practices for Security beyond User Authentication

Best Practices for Intra- and Interapplication Integration and Interaction

Continuous Activity Monitoring and Event Processing

Best Practices to Leverage Cloud-Based Methodologies

Best Practices for Creative BI Reporting

Best Practices Beyond Reporting: Driving Business Value

Advanced Analytics: The Foundation for a Beyond-Reporting Approach (Dynamic KPI, Scorecards, Dynamic Dashboarding, and Adaptive Analytics)

Large Scale Analytics: Business-centric and Technology-centric Requirements and Solution Options

Business-centric Requirements

Technology-centric Requirements

Accelerating Business Analytics: What to Look for, Look at, and Look Beyond

Delivering Information on Demand and Thereby Performance on Demand

Design Pragmatics

Demo Pragmatics

EDW/BI Development Frameworks

From the Big Bang to the Big Data Bang: The Past, Present, and Future

A Framework for BI Beyond Intelligence

Raising the Bar on BI Using Embeddable BI and BI in the Cloud

Raising the Bar on BI: Good to Great to Intelligent

Raising the Bar on the Social Intelligence Quotient (SIQ)

Raising the Bar on BI by Mobilizing BI: BI on the Go

A Pragmatic Framework for a Customer-Centric EDW/BI Solution

A Next-Generation BI Framework

Taking EDW/BI to the Next Level: An Open Source Model for EDW/BI–EPM

Open Source Model for an Open Source DW–BI/EPM Solution Delivering Business Value

Open Source Architectural Framework for a Best-Fit Open Source BI/EPM Solution

Value Proposition

The Road Ahead . . .

A BI Framework for a Reusable Predictive Analytics Model

A BI Framework for Competitive Intelligence: Time, Technology, and the Evolution of the Intelligent Customer

Best Practices for Optimization

Accelerating Application Testing: Choice, Design, and Suitability

Best Practices for Performance Testing: Online and On Demand Scenarios

A Fine Tuning Framework for Optimality

Looking Down the Customer Experience Trail, Leaving the Customer Alone: Customer Feedback Management (CFM)–Driven and APM-Oriented Tuning

Codeful and Codeless Design Patterns for Business-Savvy and IT-Friendly QOS Measurements and In-Depth Impact Analysis


Open Standards for Open Source: An EDW/BI Outlook




Each chapter includes an Introduction and Summary

About the Author

Lakshman Bulusu is a 20-year veteran of the IT industry with specialized expertise and academic experience in the management, supervision, mentoring, review, architectural design, and development of database, data warehousing, and business intelligence-related application development projects. These projects span major industry domains such as pharmaceutical/healthcare, telecommunications, news/media, global investment and retail banking, insurance, and retail for clients across the United States, Europe, and Asia. He is well-versed in the primary Oracle technologies through Oracle 11g, including SQL, PL/SQL, and SQL-embedded programming, as well as design and development of Web applications that are cross-platform and open source-based.

Mr. Bulusu has expertise in data modeling and design of enterprise data warehousing/business intelligence information architectures, with multiple customer implementations to his credit. His design of application development frameworks using PL/SQL, from design to coding to testing to debugging to performance tuning to business intelligence, has been implemented at several major Fortune 500 clients in the United States. He has implemented the Common Data Quality Framework for SQL Server, based on summarization-comparison-discrepancy isolation across disparate multivendor large-scale databases. He is also an educator who has been teaching technical courses for about a decade in the areas of Oracle design, development, and optimization, and he serves on the CNS Advisory Committee of Anthem Institute (affiliated with Anthem Education Group).

Mr. Bulusu has authored six books on Oracle and more than fifty educational/technical articles in journals and magazines in the United States and the United Kingdom; he has also presented at national and international conferences in the United States and the United Kingdom. He lives in New Jersey and likes to read, write, listen to, and lecture on English poetry and nonfiction when he is not working on IT projects.

Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Database Management / General
COMPUTERS / Programming Languages / General
COMPUTERS / Software Development & Engineering / General