Quantcast
Channel: SCN: Message List
Viewing all articles
Browse latest Browse all 2740

Re: HOW TO LOAD A FACT TABLE USING SOURCE BASED CDC

$
0
0

Hello Dirk,

 

I have read in  a blog :

 

A general data warehouse consists of dimension and fact tables. In a data warehouse the data loading into dimension tables are implemented using SCDs. Mostly SCD type 2 effective data is implemented to load dimension table. Once the dimension tables are loaded then the fact table is loaded with transactional data.

 

This article covers the process to load data into the fact table. Follow the below steps for loading data into the fact table.

 

First implement the SCD type 2 method to load data into the dimension table. As an example choose the SCD type 2 effective date to load data into the customer dimension table. The data in the customer dimension table looks as:

 

Table Name: Customer_Dim

Cust_key Cust_id Begin_date End_Date

------------------------------------

1             10        01-Jan-12       30-Jul-12

2             20        01-Apr-12       30-Aug-12

3             10       31-Jul-12           Null

4             20       31-Aug-12          Null

5             30       31-Aug-12          Null

 

 

 

Let say, you want to load the sales fact table. Consider the following source transactional (sales) data:

 

Table Name: sales

Cust_id Price

-------------

10       10000

20        50000

30        20000

 

 

When loading the data into the fact table, you have to get the relavant dimensional keys (surrogate keys) from all the dimension tables and then insert the records into the fact table. When getting the dimension keys from the dimension table we have to get the rows for which the End_date column is Null. The following SQL query shows how to load the data into the fact table

 

 

 

SELECT s.price, c.cust_key

FROM Customer_Dim c

  sales s

WHERE c.cust_id = s.cust_id

AND c.end_date is null

 

 

After loading the data in the fact table looks as

 

 

Table Name: Sales_fact

 

Price Cust_key

--------------

10000    3

50000    4

30000    5

 

 

Dirk,  in order to load the Fact Table  I do not need any timestamp in Fact Table ( Only in Dimensión)

 

Do yo agree?

 

Even I do not need  an incremental unique ID just doing the join between the Dimension and the source transactional (sales) data:

 

SELECT s.price, c.cust_key

FROM Customer_Dim c

  sales s

WHERE c.cust_id = s.cust_id

AND c.end_date is null

 

 

 

"If the facts table at the source is changed by inserts only (no updates), most efficient approach is to use the incremental unique ID to define the delta." 

 

 

Which method is much better ?

 

 

- To use the incremental unique ID to define the delta.

 

-  Join between the Dimension and the source transactional (sales) data:

 

Please also answer me the previous post

 

Thanks in advance.

 

Best Regards

 

Antonio


Viewing all articles
Browse latest Browse all 2740

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>