Hello Dirk,
I have read in a blog :
A general data warehouse consists of dimension and fact tables. In a data warehouse the data loading into dimension tables are implemented using SCDs. Mostly SCD type 2 effective data is implemented to load dimension table. Once the dimension tables are loaded then the fact table is loaded with transactional data.
This article covers the process to load data into the fact table. Follow the below steps for loading data into the fact table.
First implement the SCD type 2 method to load data into the dimension table. As an example choose the SCD type 2 effective date to load data into the customer dimension table. The data in the customer dimension table looks as:
Table Name: Customer_Dim
Cust_key Cust_id Begin_date End_Date
------------------------------------
1 10 01-Jan-12 30-Jul-12
2 20 01-Apr-12 30-Aug-12
3 10 31-Jul-12 Null
4 20 31-Aug-12 Null
5 30 31-Aug-12 Null
Let say, you want to load the sales fact table. Consider the following source transactional (sales) data:
Table Name: sales
Cust_id Price
-------------
10 10000
20 50000
30 20000
When loading the data into the fact table, you have to get the relavant dimensional keys (surrogate keys) from all the dimension tables and then insert the records into the fact table. When getting the dimension keys from the dimension table we have to get the rows for which the End_date column is Null. The following SQL query shows how to load the data into the fact table
SELECT s.price, c.cust_key
FROM Customer_Dim c
sales s
WHERE c.cust_id = s.cust_id
AND c.end_date is null
After loading the data in the fact table looks as
Table Name: Sales_fact
Price Cust_key
--------------
10000 3
50000 4
30000 5
Dirk, in order to load the Fact Table I do not need any timestamp in Fact Table ( Only in Dimensión)
Do yo agree?
Even I do not need an incremental unique ID just doing the join between the Dimension and the source transactional (sales) data:
SELECT s.price, c.cust_key
FROM Customer_Dim c
sales s
WHERE c.cust_id = s.cust_id
AND c.end_date is null
"If the facts table at the source is changed by inserts only (no updates), most efficient approach is to use the incremental unique ID to define the delta."
Which method is much better ?
- To use the incremental unique ID to define the delta.
- Join between the Dimension and the source transactional (sales) data:
Please also answer me the previous post
Thanks in advance.
Best Regards
Antonio