1st Edition

Open Source Data Warehousing and Business Intelligence

ISBN 9781138374225
Published September 10, 2018 by CRC Press
432 Pages 30 B/W Illustrations



Book Description

Open Source Data Warehousing and Business Intelligence is an all-in-one reference for developing open source based data warehousing (DW) and business intelligence (BI) solutions that are business-centric, cross-customer viable, cross-functional, cross-technology based, and enterprise-wide. Considering the entire lifecycle of an open source DW & BI implementation, its comprehensive coverage spans from basic concepts all the way through to customization.

Highlighting the key differences between open source and vendor DW and BI technologies, the book identifies end-to-end solutions that are scalable, high performance, and stable. It illustrates the practical aspects of implementing and using open source DW and BI technologies to supply you with valuable on-the-project experience that can help you improve implementation and productivity.

Emphasizing analysis, design, and programming, the text explains best-fit solutions as well as how to maximize ROI. Coverage includes data warehouse design, real-time processing, data integration, presentation services, and real-time reporting. With a focus on real-world applications, the author devotes an entire section to powerful implementation best practices that can help you build customer confidence while saving valuable time, effort, and resources.

Table of Contents

Data Warehousing and Business Intelligence: What, Why, How, When, When Not?
     Taking IT Intelligence to Its Apex
Open Source DW and BI: Much Ado about Anything-to-Everything DW and BI, When Not, and Why So Much Ado? 
     Taking Business Intelligence to Its Apex: Intelligent Content for Insightful Intent

Data Warehousing and Business Intelligence: An Open Source Solution
What Is Open Source DW and BI, and How "Open" Is This Open?
What’s In, What’s Not: Available and Viable Options for Development and Deployment 
     Semantic Analytics 
     Testing for Optimizing Quality and Automation—Accelerated! 
     Business Rules, Real-World Perspective, Social Context 
     Personalization Through Customizable Measures 
     Leveraging the Cloud for Deployment
The Foundations Underneath: Architecture, Technologies, and Methodologies 
     Open Source versus Proprietary DW and BI Solutions: Key Differentiators and Integrators
Open Source DW and BI: Uses and Abuses 
     An Intelligent Query Accelerator Using an Open Cache In, Cache Out Design

Open Source DW & BI: Successful Players and Products
Open Source Data Warehousing and Business Intelligence Technology 
     Licensing Models Followed 
     Community versus Commercial Open Source
The Primary Vendors: Inventors and Presenters 
     Oracle: MySQL Vendor 
     PostgreSQL Vendor 
     Pentaho: Mondrian Vendor 
     Jedox: Palo Vendor 
     EnterpriseDB Vendor 
     Dynamo BI and Eigenbase: LucidDB Vendor 
     Greenplum Vendor 
     Hadoop Project 
The Primary Products and Tools Set: Inclusions and Exclusions 
     Open Source Databases 
     Open Source Data Integration
     Open Source Business Intelligence 
     Open Source Business Analytics
The Primary Users: User, End-User, Customer and Intelligent Customer
     Mondrian Customers
     Palo Customers 
     EnterpriseDB Customers 
     LucidDB Customers 
     Greenplum Customers 
     Talend Customers

Analysis, Evaluation, and Selection
Essential Criteria for Requirements Analysis of an Open Source DW and BI solution
Key and Critical Deciding Factors in Selecting a Solution 
     The Selection-Action Preview 
     Raising your BIQ: Five Things Your Company Can Do Now
Evaluation Criteria for Choosing a Vendor-Specific Platform and Solution
The Final Pick: An Information-Driven, Customer-Centric Solution, and a Best-of-Breed Product/Platform and Solution Convergence Key Indicator Checklist

Design and Architecture: Technologies and Methodologies by Dissection
The Primary Aspects of DW and BI from a Usability Perspective: Strategic BI, Pervasive BI, Operational BI, and BI On-Demand
Design and Architecture Considerations for the Primary BI Perspectives 
     The Case for Architecture as a Precedence Factor
Information-Centric, Business-Centric, and Customer-Centric Architecture: A Three-in-One Convergence, for Better or Worse
Open Source DW and BI Architecture 
     Pragmatics and Design Patterns 
Why and How an Open Source Architecture Delivers a Better Enterprise-wide Solution
Open Source Data Architecture: Under the Hood
Open Source Data Warehouse Architecture: Under the Hood
Open Source BI Architecture: Under the Hood 
The Vendor/Platform Product(s)/Tools(s) That Fit into the Open DW and BI Architecture 
     Information Integration, Usability and Management (Across Data Sources, Applications and Business Domains) 
     EDW: Models to Management 
     BI: Models to Interaction to Management to Strategic Business Decision Support (via Analytics and Visualization)
Best Practices: Use and Reuse

Operational BI and Open Source
Why a Separate Chapter on Operational BI and Open Source?
Operational BI by Dissection
Design and Architecture Considerations for Operational BI
Operational BI Data Architecture: Under the Hood
A Reusable Information Integration Model: From Real-Time to Right Time
Operational BI Architecture: Under the Hood
Fitting Open Source Vendor/Platform Product(s)/Tools(s) into the Operational BI Architecture 
     Talend Data Integration 
     expressor 3.0 Community Edition 
     Advanced Analytics Engines for Operational BI 
     Astera’s Centerprise Data Integration Platform 
     Actuate BIRT BI Platform 
     JasperSoft Enterprise 
     Pentaho Enterprise BI Suite
     KNIME (Konstanz Information Miner) 
     Pervasive DataRush
     Pervasive DataCloud2
Best Practices: Use and Reuse

Development and Deployment
Development Options, Dissected
Deployment Options, Dissected
Integration Options, Dissected
Multiple Sources, Multiple Dimensions
DW and BI Usability and Deployment: Best Solution versus Best-Fit Solution
Leveraging the Best-Fit Solution: Primary Considerations
Better, Faster, Easier as the Hitchhiker’s Rule 
     Dynamism and Flash—Real Output in Real Time in the Real World 
Better Responsiveness, User Adoptability, and Transparency
Fitting the Vendor/Platform Product(s)/Tools(s): A Development and Deployment Standpoint
Best Practices: Use and Reuse

Best Practices for Data Management
Best Fit of Open Source in EDW Implementation
Best Practices for Using Open Source as a BI-Only Methodology for Data/Information Delivery
     Mobile BI and Pervasive BI
Best Practices for the Data Lifecycle in a Typical EDW Lifecycle
     Data Quality, Data Profiling, and Data Loss Prevention Components
     The Data Integration Component
Best Practices for the Information Lifecycle as It Moves into the BI Lifecycle 
     The Data Analysis Component: The Dimensions of Data Analysis in Terms of Online Analytics vs. Predictive Analytics vs. Real-Time Analytics vs. Advanced Analytics 
     Data to Information Transformation and Presentation
Best Practices for Auditing Data Access, as It Makes Its Way (via the EDW and Directly Bypassing the EDW) to the BI Dashboard
Best Practices for Using XML in the Open Source EDW/BI Space
Best Practices for a Unified Information Integrity and Security Framework
Object to Relational Mapping: A Necessity or Just a Convenience? 
     Synchrony Maintenance 
     Dynamic Language Interoperability

Best Practices for Application Management 
Using Open Source as an End-to-End Solution Option: How Best a Practice Is It?
Accelerating Application Development: Choice, Design, and Suitability Aspects 
     Visualization of Content: For Better or Best Fit 
     Best Practices for Autogenerating Code: A Codeless Alternative to Information Presentation 
     Automating Querying: Why and When 
     How Fine Is Fine-Grained? Drawing the Line between Representation of Data at the Lowest Level and a Best-Fit Metadata Design and Presentation
Best Practices for Application Integrity 
     Sharing Data between EDW and the BI Tiers: Isolation or a Tightrope Methodology 
     Breakthrough BI: Self-Serviceable BI via a Self-Adaptable Solution 
     Data-In, Data-Out Considerations: Data-to-Information I/O 
     Security Inside and Outside Enterprise Parameters: Best Practices for Security beyond User Authentication
Best Practices for Intra- and Interapplication Integration and Interaction 
     Continuous Activity Monitoring and Event Processing 
     Best Practices to Leverage Cloud-Based Methodologies
Best Practices for Creative BI Reporting

Best Practices Beyond Reporting: Driving Business Value
Advanced Analytics: The Foundation for a Beyond-Reporting Approach (Dynamic KPI, Scorecards, Dynamic Dashboarding, and Adaptive Analytics)
Large Scale Analytics: Business-centric and Technology-centric Requirements and Solution Options 
     Business-centric Requirements 
     Technology-centric Requirements
Accelerating Business Analytics: What to Look for, Look at, and Look Beyond
Delivering Information on Demand and Thereby Performance on Demand
     Design Pragmatics 
     Demo Pragmatics

EDW/BI Development Frameworks
From the Big Bang to the Big Data Bang: The Past, Present, and Future
A Framework for BI Beyond Intelligence 
     Raising the Bar on BI Using Embeddable BI and BI in the Cloud 
     Raising the Bar on BI: Good to Great to Intelligent 
     Raising the Bar on the Social Intelligence Quotient (SIQ) 
     Raising the Bar on BI by Mobilizing BI: BI on the Go
A Pragmatic Framework for a Customer-Centric EDW/BI Solution
A Next-Generation BI Framework 
     Taking EDW/BI to the Next Level: An Open Source Model for EDW/BI–EPM 
     Open Source Model for an Open Source DW–BI/EPM Solution Delivering Business Value 
     Open Source Architectural Framework for a Best-Fit Open Source BI/EPM Solution 
     Value Proposition
     The Road Ahead . . .
A BI Framework for a Reusable Predictive Analytics Model
A BI Framework for Competitive Intelligence: Time, Technology, and the Evolution of the Intelligent Customer

Best Practices for Optimization
Accelerating Application Testing: Choice, Design, and Suitability
Best Practices for Performance Testing: Online and On Demand Scenarios
A Fine Tuning Framework for Optimality
Looking Down the Customer Experience Trail, Leaving the Customer Alone: Customer Feedback Management (CFM)–Driven and APM-Oriented Tuning
Codeful and Codeless Design Patterns for Business-Savvy and IT-Friendly QOS Measurements and In-Depth Impact Analysis

Open Standards for Open Source: An EDW/BI Outlook


Each chapter includes an Introduction and Summary




Lakshman Bulusu is a 20-year veteran of the IT industry with specialized expertise and academic experience in the management, supervision, mentoring, review, architectural design, and development of database, data warehousing, and business intelligence application projects. His work spans major industry domains such as pharmaceutical/healthcare, telecommunications, news/media, global investment and retail banking, insurance, and retail, for clients across the United States, Europe, and Asia. He is well-versed in the primary Oracle technologies through Oracle 11g, including SQL, PL/SQL, and SQL-embedded programming, as well as the design and development of Web applications that are cross-platform and open source-based.

Mr. Bulusu has expertise in data modeling and design of enterprise data warehousing/business intelligence information architectures, with multiple customer implementations to his credit. His design of application development frameworks using PL/SQL, from design to coding to testing to debugging to performance tuning to business intelligence, has been implemented at several major Fortune 500 clients in the United States. He has implemented the Common Data Quality Framework for SQL Server, based on summarization-comparison-discrepancy isolation across disparate multivendor large-scale databases. He is also an educator who has been teaching technical courses for about a decade in the areas of Oracle design, development, and optimization, and he serves on the CNS Advisory Committee of Anthem Institute (affiliated with Anthem Education Group).

Mr. Bulusu has authored six books on Oracle and more than fifty educational/technical articles in journals and magazines in the United States and the United Kingdom; he has also presented at national and international conferences in the United States and the United Kingdom. He lives in New Jersey and likes to read, write, listen to, and lecture on English poetry and nonfiction when he is not working on IT projects.


"… As a practitioner himself, Mr. Bulusu is able to pinpoint the more critical aspects for consideration that a BI expert would truly appreciate. He does a great job of covering the subject thoroughly by first starting at a high-level, covering every aspect of BI and then breaking down the components to the most granular detail applying open source BI’s feasibility and validity as a viable solution. … Bulusu’s thorough, meticulous approach and in-depth examination of the subject, taken from a standpoint of an experienced author, educator, and technologist, has finally provided justice to this subject. Thank you, Mr. Bulusu. You have opened my eyes to open source BI."
—Rosendo Abellera, President and CEO, BIS3

"… coverage spans from basic concepts to customization. It identifies end-to-end solutions and illustrates the practical aspects of implementing and using open source DW and BI technologies. The text explains best-fit solutions and how to maximize ROI. An expert in the field, Lakshman Bulusu offers best practices in this book that will help you build customer confidence."
—NeoPopRealism Journal