Overview

What is a Data Warehouse?

Data Warehouse

A data warehouse is a collection of data, physically separated from the operational environment, that addresses real business problems and contains data of interest to multiple user groups. It provides historical and summarized data that supports the kinds of questions a typical business analyst, manager, or executive might ask.

A warehouse is built in an evolutionary, step-at-a-time fashion. It is important to understand that the data warehouse will not resolve the data duplication and inefficiencies of the operational environment.

The data warehouse consists of data stored over a well-defined period of time, which enables trend analysis. It is organized to give management the information they need to run the business, and this information is stored in a way that facilitates user-defined analysis. The data warehouse is not intended to serve as the day-to-day operational system for the enterprise. The data in the warehouse is not required to be real-time; it only needs to be as current as the analysis requirements dictate.
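As a rough illustration of the trend analysis that time-stamped, summarized data makes possible, the sketch below totals sales per period and then compares each period with the one before it. The row layout, the names SALES_HISTORY, summarize_by_period and period_over_period_change, and all of the figures are assumptions invented for this example; a real warehouse would normally be queried through a database rather than held in a Python list.

```python
from collections import defaultdict

# Hypothetical warehouse rows: (period, region, sales_amount).
# All values are invented for illustration only.
SALES_HISTORY = [
    ("2023-Q1", "EAST", 100.0),
    ("2023-Q1", "WEST", 80.0),
    ("2023-Q2", "EAST", 120.0),
    ("2023-Q2", "WEST", 70.0),
    ("2023-Q3", "EAST", 150.0),
    ("2023-Q3", "WEST", 90.0),
]


def summarize_by_period(rows):
    """Total the sales amount for each period, sorted by period label."""
    totals = defaultdict(float)
    for period, _region, amount in rows:
        totals[period] += amount
    return sorted(totals.items())


def period_over_period_change(summary):
    """Return (period, change versus the previous period) pairs."""
    changes = []
    for (_, prev_total), (period, total) in zip(summary, summary[1:]):
        changes.append((period, total - prev_total))
    return changes


summary = summarize_by_period(SALES_HISTORY)
trend = period_over_period_change(summary)

for period, change in trend:
    print(f"{period}: {change:+.1f}")
```

Because the history is kept for every period rather than overwritten, the comparison needs nothing beyond the stored rows themselves.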

Operational Data Store

An operational data store (ODS) is an architectural construct that looks very much like a data warehouse. An ODS is subject-oriented, integrated, volatile, and current-valued. Its purpose is to act as a "way station" that temporarily stores data needing additional processing before it is ready to be committed to the data warehouse.

The ODS is designed and organized around the major subjects of the corporation.  The major subjects are typically things such as CUSTOMER, PRODUCT, ACTIVITY, POLICY and CLAIM.

The data in the ODS is an aggregation of detailed data from the legacy systems that feed it. As the data is pulled into the ODS from the legacy systems, it is transformed into a consistent, unified whole.
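To make the idea of transforming legacy data into a consistent whole more concrete, here is a minimal sketch in which two hypothetical legacy systems describe the same CUSTOMER subject with different field names and codings. The systems, field names, status codes, and the rule that a later source overwrites an earlier one are all assumptions made for illustration, not a description of any particular ODS product.

```python
# Hypothetical extracts from two legacy systems. Field names and
# codings differ on purpose; all values are invented.
BILLING_CUSTOMERS = [
    {"cust_no": "0042", "name": "acme corp ", "status": "A"},
]
CLAIMS_CUSTOMERS = [
    {"customer_id": 42, "full_name": "ACME Corp", "active": 0},
    {"customer_id": 7, "full_name": "Globex", "active": 1},
]


def from_billing(row):
    """Map a billing record onto the unified CUSTOMER layout."""
    return {
        "customer_id": int(row["cust_no"]),
        "name": row["name"].strip().upper(),
        "is_active": row["status"] == "A",
    }


def from_claims(row):
    """Map a claims record onto the unified CUSTOMER layout."""
    return {
        "customer_id": row["customer_id"],
        "name": row["full_name"].strip().upper(),
        "is_active": row["active"] == 1,
    }


def load_ods(billing_rows, claims_rows):
    """Build a CUSTOMER subject area keyed by customer_id.

    Assumption: a later source overwrites an earlier one, so the ODS
    holds only the current value for each customer.
    """
    ods = {}
    for row in billing_rows:
        record = from_billing(row)
        ods[record["customer_id"]] = record
    for row in claims_rows:
        record = from_claims(row)
        ods[record["customer_id"]] = record
    return ods


customer_ods = load_ods(BILLING_CUSTOMERS, CLAIMS_CUSTOMERS)
```

Keeping one mapping function per source system keeps each legacy quirk in a single place, and the overwrite behaviour mirrors the volatile, current-valued nature of an ODS.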

Data Mart

A data mart is similar to the data warehouse in that both contain information to be used by the enterprise. The information in a data mart tends to be summarized; however, it can also include detailed information.

A data mart is a subset of information extracted only from the data warehouse. Users must be discouraged from creating data marts that do not use the data warehouse as their source. If the data warehouse does not exist, the existing or new data marts must be merged to create it.

A data mart is usually created for a specific department or individual need.
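The relationship between a data mart and the data warehouse can be sketched as a filter plus a summary: the mart takes only the rows one department needs and rolls them up. The row layout, the department names, the figures, and the function name build_data_mart below are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical warehouse rows; all values are invented.
WAREHOUSE_CLAIMS = [
    {"department": "AUTO", "month": "2024-01", "claim_amount": 500.0},
    {"department": "AUTO", "month": "2024-01", "claim_amount": 250.0},
    {"department": "AUTO", "month": "2024-02", "claim_amount": 100.0},
    {"department": "HOME", "month": "2024-01", "claim_amount": 900.0},
]


def build_data_mart(warehouse_rows, department):
    """Extract one department's rows and summarize them by month.

    The warehouse is the only input, which reflects the rule that a
    data mart is sourced from the data warehouse.
    """
    totals = defaultdict(float)
    for row in warehouse_rows:
        if row["department"] == department:
            totals[row["month"]] += row["claim_amount"]
    return dict(totals)


auto_mart = build_data_mart(WAREHOUSE_CLAIMS, "AUTO")
print(auto_mart)
```

Because every mart is derived from the same warehouse rows, two departments that summarize overlapping data start from the same underlying figures.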

Internal and External Unstructured Data

Unstructured data represents the many different forms of data that currently exist or that the company requires to operate. This data typically comes from external sources but may also be sourced from internal applications or data stores. This type of information generally requires a more complex transformation and cleansing process than a standard source system would.
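One way to picture the extra cleansing that unstructured data needs is a free-text note that must be normalized and searched before any structured field can be loaded. The note text, the policy-number format, and the function names cleanse and extract_fields are assumptions invented for this sketch; real sources usually need far more elaborate rules.

```python
import re

# Hypothetical free-text note with irregular spacing; invented content.
NOTE = "  Customer called re: POLICY #A-1029;   claim   amount $1,250.50 \n"


def cleanse(text):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def extract_fields(text):
    """Pull a policy id and a dollar amount out of a free-text note.

    Fields that cannot be found are returned as None rather than guessed.
    """
    clean = cleanse(text)
    policy = re.search(r"POLICY #([A-Z]-\d+)", clean)
    amount = re.search(r"\$([\d,]+\.\d{2})", clean)
    return {
        "text": clean,
        "policy_id": policy.group(1) if policy else None,
        "amount": float(amount.group(1).replace(",", "")) if amount else None,
    }


fields = extract_fields(NOTE)
```

A standard source system would supply the policy id and amount as separate columns; here both have to be recovered from the text, which is why this kind of data tends to need a heavier transformation step.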

Copyright © Rehm Technology LLC. All rights reserved.