The Changing Nature of Data Movement
Since the start of the data warehousing era in the early 1990s, extraction, transformation and load (ETL) has been a batch process because of the high volume of data that must be loaded in a short period of time. Additionally, data warehouse applications were seldom viewed as mission critical, so a nightly batch window was perfectly acceptable, and it still is for many applications.
High-performance batch data transformation and movement tools such as Informatica, Prism and ETI made their mark during this time, when the focus was on handling complex data transformations quickly. Times have changed: the era of "real time data warehousing" has arrived.
Corporate executives no longer want to see yesterday's data today. They want to see today's data today. Data loads are no longer measured in single gigabytes; loads of tens or even hundreds of gigabytes now predominate. Users are no longer confined to one country but span time zones around the globe. To meet these changing requirements, data needs to be extracted, transformed and loaded into the data warehouse on a "real time" basis. The batch tools of the '90s are no longer sufficient to meet the "real time" needs of today.
Before delving into the right toolset for the job, let's explore why the batch tools won't meet the challenge. First, most batch tools need an extract file to execute.




