The Data Warehouse ETL Toolkit - Kimball R.

Kimball R., Caserta J. The Data Warehouse ETL Toolkit - John Wiley & Sons, 2005. - 526 p.
ISBN: 0-764-57923-1

(indicate if this is a new requirement or a change to an existing requirement)
Submitter: ___________________________ Owner: ______________________________
Version found in: _________________________________
Functional Description (attachments):
Technical Description (attachments):
All fields are required
Figure 10.7 Change/enhancement requisition form.
sure to unit, QA, and UAT test your changes. After the changes have passed the testing cycles, either the ETL architect migrates the routines or the DBA team pushes the code to production.
Tracking versions of your data warehouse is beneficial for troubleshooting problems discovered in production. Use the tracking mechanisms outlined earlier in this chapter to maintain control over your version releases. Normally, following standard versioning techniques works well in the data warehouse/ETL environment. It is especially important for the ETL manager to adhere to this standard because many of the data warehouse code releases to production are created and deployed by the ETL team.
The version number consists of a series of three period-delimited numbers (##.##.##). The first number signifies major releases; the second, minor releases; and the third, patches. For example, Version 1.2.1 means the data warehouse is in its first version and there have been two minor releases and one patch applied to it.
In the data warehouse environment, a major version release typically constitutes a new subject area or data mart that includes new facts, dimensions, and ETL processes. A minor release is defined as primarily ETL modifications, possibly including some minor structural database changes. Patches are usually the result of a hotfix, where a mission-critical error has been detected in the production environment and needs to be corrected immediately. If patches are bundled with minor changes, or minor changes with a major release, only the number for the most significant change in the bundle should be incremented, and the numbers to its right are reset to zero. For example, if version 1.2.1 is in production and you have two patches, a minor change, and a major release scheduled for migration, bundling these changes would be considered a single major release. In this case, you would now be at release 2.0.0.
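The bundling rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the book: the `Version` class, its `parse` and `release` methods, and the change labels "major", "minor", and "patch" are hypothetical names chosen for the example. It also assumes that a bundle containing only patches increments the patch number once, which the text does not state explicitly.

```python
from dataclasses import dataclass

# Rank of each change type; a higher rank is a more significant change.
# The labels themselves are an assumption made for this sketch.
_RANK = {"patch": 0, "minor": 1, "major": 2}


@dataclass(frozen=True)
class Version:
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text):
        """Parse a 'major.minor.patch' string such as '1.2.1'."""
        parts = text.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected three period-delimited numbers, got {text!r}")
        return cls(*(int(p) for p in parts))

    def release(self, changes):
        """Return the version that results from bundling `changes` into one release."""
        if not changes:
            raise ValueError("a release needs at least one change")
        # Only the most significant change in the bundle decides the new number.
        top = max(changes, key=_RANK.__getitem__)
        if top == "major":
            return Version(self.major + 1, 0, 0)
        if top == "minor":
            return Version(self.major, self.minor + 1, 0)
        # Assumption: a patch-only bundle counts as a single patch release.
        return Version(self.major, self.minor, self.patch + 1)

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"


current = Version.parse("1.2.1")
bundled = current.release(["patch", "patch", "minor", "major"])
print(bundled)  # 2.0.0
```

The final lines reproduce the example from the text: two patches, a minor change, and a major release bundled on top of 1.2.1 yield 2.0.0. Under the same rule, bundling a patch with a minor change would take 1.2.1 to 1.3.0.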
It is good practice to bundle and schedule major releases with enough time between them to address hotfixes. With scheduled major releases, perhaps monthly, it is easier to bundle minor fixes into the controlled release environment and so minimize code migrations.
Our recommended data warehouse versioning strategy is especially powerful when your project is using the data warehouse bus architecture. In such a case, each data mart in the bus matrix will be a major version release as it enters the physical data warehouse. If your data warehouse is at version 1.0.210, you are most likely not using this matrix and probably not sleeping at night, either.
Summary
In this chapter, we have finally stepped back a little from the myriad tasks of the ETL team to try to paint a picture of who the players are and what they are supposed to think about. We must keep in mind that this chapter, and really the whole book, is deliberately limited to the back-room concerns of the enterprise data warehouse.
We began by describing the planning and leadership challenges faced by the ETL team; then we descended into the specific tasks that these people face. In many cases, much more detail is provided in the main text of the book.
Part IV
Real Time Streaming ETL Systems

Chapter 11
Real-Time ETL Systems
Building a real-time data warehouse ETL solution demands classifying some often slippery business objectives, understanding a diverse set of technologies, having an awareness of some pragmatic approaches that have been successfully employed by others, and developing engineering flexibility and creativity. This field remains young, with new technologies, emergent methodologies, and new vocabularies. Clearly, this situation can be a recipe for trouble, but real-time data warehousing also offers early adopters great potential to gain a competitive advantage—an intriguing risk versus reward trade-off. This chapter proposes a four-step process to guide the experienced data warehousing professional through the selection of an appropriate real-time ETL technical architecture and methodology:
1. This chapter examines the historical and business contexts of the state-of-the-art real-time data warehouse, providing some "How did we get here?" and "Where are we going?" background.
2. Next, it describes a method for classifying your organization's real-time requirements in a manner that is most useful for selecting design solutions later.
3. The heart of the chapter is an appraisal of several mechanisms for delivering real-time reporting and integration services, the technologies most appropriate for each approach, and their strengths and weaknesses.