Components

Components of a successful Data Warehouse

Components of a successful Data Warehouse

 

* In this document, the term Data Warehouse also pertains to Data Marts and Operational Data Stores (ODS).

 

 

Business Reason (Pain)

Without it, there is no point in building a data warehouse.  There must be a reason or reasons the business community needs to resolve through the creation of a data warehouse. 

 

Business User Involvement 

Since they ultimately are the customers, they need to be involved with the plan, the design and the implementation. 

 

Education

The design team must market the concept to their potential users, who could be almost everyone at the company.  It is important that the user community understands the purpose of the data warehouse, and what it will and won't do.

 

Vision

There needs to be a business model that explains how the company will implement a data warehouse.  This model needs to be understood and accepted by key business and information system stakeholders.

 

Audience is known

The data warehouse users will include many types, such as casual users and power users. The casual users, who simply want to review the same data on a regular basis, now represents about 80% of a company’s decision support populace.  Those power users, who once ruled the decision support roost and pushed for self-service capabilities, now represent a mere 20% of users.1

 

Data Stewardship

A business manager ultimately owns the data that enters into the warehouse.  There are three new roles that are explained:

 

Data Steward: The Data Steward is someone from the business community who has responsibility for the data in the data warehouse.  They often need to generate political support for cleansing of data.

 

Business Data Contact: The Business Data Contact is a person or group to contact if users have business questions about a data object.  They are responsible for the business definition, allowable values and the procedures related to the data.  They handle the day-to-day issues and communication for the Data Steward.  They may also provide business requirements to the DW project team.

 

Technical Data Contact: The Technical Data Contact is the person or group to contact if users have technical questions about a data object.  The technical data contact is responsible for the technical and operational support of the data.

 

One Stop Shopping Concept

The business user already has to go to many different places to gather and analyze data.  The data warehouse should not be yet just “another” place to go.  It must be THE place to go for decision support information.

 

Organization

A new department needs to be created to support the data warehouse.  The most important part (where most companies fail) is to create a team made up of both I/S professionals and Business Analysts.

 

Most importantly, a support area must be established to handle data questions and enhancement requests. The database requires an ongoing support team for production problems, maintenance and enhancements to the extract programs and data structures. Tool questions should be handled through the normal process.

 

The recommendation is to create 3 high level teams.  These are:

 

·        Core Team

·        Development Team

·        Extended Team

 

Each team has multiple teams and/or departments within them. 

 

New Methodologies

There are three important methodologies that need to be created and implemented:

 

            Data Management Methodology

This methodology describes the steps to follow to add/modify data structures within the data warehouse.

 

Data Delivery Methodology

This methodology describes the steps to follow to add/modify data delivery mechanisms within the data warehouse.

 

            Technology and Training Methodology

This methodology describes the steps to get an end-user set up with access to the database, and getting them the appropriate data and tool training to effectively use the database.

 

Appropriate Technology

This includes the data extract, transform and load (ETL) tools, the database, and the query and reporting tools.  The database can be an ODS, a data warehouse and data marts or some combination of these.  Appropriate means it will work for the business users.  The technology may be already in place.

 

Prototyping

Before tables are built in a production data warehouse, test versions are created in a tool such as Microsoft Access.  This allows the user to be involved with the design and population of the table.  Design iterations can be accomplished quickly and easily before involving other technical staff such as DBAs.

 

Release Management

The concept behind Release Management is quite simple.  As requests come in from end-users to fix, enhance or newly create data structures, they are prioritized and logically grouped into “Releases”.  Each release is small enough to be completed quickly.  Before items are included in a release, there must be an operational source of data, and there must be resources to work on it.  Each release will follow the Data Management Methodology explained in this document.

 

Data Extraction Guidelines

Cleansing and transformation plans need to be developed.  Standard error and exception reporting needs to be designed and developed as well.

 

Naming Standards

All table names and column names must be user-friendly. Standard abbreviations should also be used.  All columns end with a suffix that explains what the column is.  For example:

 

Suffix

Class Word

Description

AT

Amount

A data item which denotes a monetary figure.

CD

Code

A data item which contains a specific range of values.

DT

Date

A data item which contains an actual date (MM/DD/YYYY).

DS

Description

A data item which contains descriptive text.

FL

Flag

A data item which contains 2 values such as Y/N, 0/1 or T/F.

NM

Name

A data item which contains the actual name or title of a person, place or thing.

NR

Number

A data item used to denote numeric values.

PT

Percentage

A data item which denotes a quantity as a percentage.

QY

Quantity

A data item which represents the numeric count of something.

TS

Timestamp

A data item that contains a system timestamp.  A timestamp includes both the date and time of an occurrence.

TX

Text

A data item that contains free form text.

 

 

Data Training

Access to the data warehouse must not be granted to anybody until they have graduated from Data Training school.  The business staff from the data warehouse support team provides the training. 2

 

Repository

This is where the business and technical definitions (meta data) are stored.  It is a database that is centrally located and available to everybody in the company.

 

Business Data Dictionary

A graphical application that is a front end browser to the Meta Data Repository.  It is designed to be very easy to use, and provides table and column definitions, along with data steward information to the business user.

 

Enhancement Request Database

Users need a place to enter their data enhancement requests, and the data warehouse support team needs a place to report what has been completed, what is in process and what is pending.  This database application allows the entry of new requests, and the reporting of new and existing requests.

 

Online Help

The business user needs a place to go to get online help.  This could be a word processing document, or a Lotus Notes database.  All documentation, diagrams, report inventory, trouble logs, etc need to be stored here.

 

Commitment from other Service Areas

Since the data warehouse support team relies on other areas (DBA’s, Data Analysts, etc) to complete it’s mission, these other areas must be committed to specific service levels to meet the users needs and requirements.

 

Support and Enhancements

The new data warehouse support team is a permanent fixture that will support the business users in data questions and enhancements.

 

Backup and Security

The data warehouse must be backed up on a regular basis, must have disaster recovery procedures, and must have built in security.

 

Usage Monitoring

After the data warehouse is up and running in a production environment, usage may vary dramatically and rapidly.  You will need tools to track and analyze information on usage patterns - by user, time and a host of other factors.

 

 

 

REFERENCES

 

1.      Eckerson, Wayne. “The Fallacy of Self-Service Decision Support”, DM Review January 1999.

2.      Rehm, Clay. “Data Warehouse Dos and Don’ts”, IDWA Navigator, Fall 1998.

 

Copyright © Rehm Technology LLC. All rights reserved.