Results 1 to 4 of 4

Thread: Conceptual question about "ETL"

  1. #1
    Join Date
    Dec 2003

    Conceptual question about "ETL"


    A simple question but I just want to clarify my understanding:

    E = Extract
    T = Transform
    L = Load

    I clearly understand the "E" part which simply means extracting data from the data sources e.g. your OLTP systems or whatever into your Data Mart/ Data Warehouse.

    However I am now fully clear as to what do the "T" and the "L" really represent.

    Does the "T" (Transform) represent the processing that we do in our Datamart/DW (which some also refer to as the 'Staging Area') or ???
    Can you please help be conceptually be clear about this?

    Does the "L" (Load) part represent the process of updating our dimensions and Cubes?

    I'll appreciate your reply. Thanks

  2. #2
    Join Date
    Jan 2005
    The transformation (T of the ETL) is also known as the cleaning process. It is the part where you transform original data into the destination data. Ex. A code for a country coming from a legacy system A (123 for England) is transformed into UK. A code for the same country coming from a legacy system B (ENG for England) is also transformed into UK. Thus ensuring that your final application understands that you are talking about the same thing (unifying the data). We need to prepare the data before the loading process (L of the ETL). You can also apply business rules while transforming your data (ex. specify the account unary operator type etc...).

    The loading process uploads the data extracted and transformed so you are ready to use them into (most of the time) a multidimensional structure.

    If you want to know more about the terminology, I can only recommend you the book from Ralph Kimball "The datawarehouse The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 2nd Edition" or "The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data". For me, this guy is really the pope of the datawarehouse !

  3. #3
    Join Date
    Sep 2002
    Extracting - Get Data from source

    This includes multiple steps like below.

    data scrubbing - Fixing incorrect or incomplete data
    data cleansing - remove duplicates, reformat and manipulate data
    data shaping - Fixing hierarchy and relationships
    data massaging - reformating to making the data consistent

    Loading - Move the transformed data to Target (Staging or directly to data warehouse).

  4. #4
    Join Date
    Dec 2003
    HI MAK and skitter,

    Thanks for your replies. Very useful.

    skittr, you have raised an interesting point when you say:

    "The loading process uploads the data extracted and transformed so you are ready to use them into (most of the time) a multidimensional structure."

    Can you kindly elaborate on why you say 'most of the time'? Does this in any manner imply that the result of doing all the steps of "ETL" does not necessarily mean that the users will be doing multi-dimensional analysis? If so, then why go through the pain of all these steps ??

    Hope my question is making somesense.


Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts