Hello Friends!
SSAS Cubes are increasingly becoming bigger and bigger and thus processing the cube using “Process Full” option is not feasible in all scenarios.
Considering the importance of Incremental Cube Processing, I wanted to extend the AdventureWorks sample cube to include Incremental Cube Processing
However I could not find much material on this topic on the internet.
So, I thought to experiment a bit using the same concept that we use to incrementally load a data warehouse.
Typically before loading a fact table, we check the Last Processed row(date typically) in the fact table and only bring in the rows from the source that were either created/modified after that time
In this blogpost, we will look at how to extend Adventure Works sample cube to allow incremental processing of Internet Sales Measure Group
Steps to configure Incremental cube processing for InternetSales measure group
- Connect to SQL Server AdventureWorksDW database
in SSMS and add an identity key to FactInternetSales table using the following
query. This new column can be used to track the rows that have not been loaded into the cube yet
ALTER TABLE FactInternetSales ADD id INT IDENTITY
2. The following steps need to be done in SSDT on AdventureWorks cube
a. Refresh the Adventure Works DW.dsv in SSDT to
bring in the new id field that has been added to FactInternetSales table
b. Add the id measure to InternetSales Measure group and select the aggregation usage as Maximum as given in the screenshot below
This measure can be used to track the data that has been last loaded into the cube
c. Rename the measure as Internet Sales Id
d. Edit the existing Partitions in InternetSales measure group to include the new column in the query.
SSDT does not automatically refresh the queries used in partitions and this needs to
be manually done.
e. Adventure Works Cube sample has four partitions for InternetSales Measure group partitioned by the Year of the order creation
2005, 2006, 2007 and 2008.
For the purpose of this blog, we will end date the 2008 partition with 20081231 and create a new partition for incremental processing
Edit the 2008 partition to include the following WHERE condition
WHERE OrderDateKey >= '20080101' AND OrderDateKey <= '20081231'
Create a new open ended partition and name it Internet_Sales_Current with the following query
SELECT [dbo].[FactInternetSales].[ProductKey],[dbo].[FactInternetSales].[OrderDateKey],[dbo].[FactInternetSales].[DueDateKey],[dbo].[FactInternetSales].[ShipDateKey],[dbo].[FactInternetSales].[CustomerKey],[dbo].[FactInternetSales].[PromotionKey],[dbo].[FactInternetSales].[CurrencyKey],[dbo].[FactInternetSales].[SalesTerritoryKey],[dbo].[FactInternetSales].[SalesOrderNumber],[dbo].[FactInternetSales].[SalesOrderLineNumber],[dbo].[FactInternetSales].[RevisionNumber],[dbo].[FactInternetSales].[OrderQuantity],[dbo].[FactInternetSales].[UnitPrice],[dbo].[FactInternetSales].[ExtendedAmount],[dbo].[FactInternetSales].[UnitPriceDiscountPct],[dbo].[FactInternetSales].[DiscountAmount],[dbo].[FactInternetSales].[ProductStandardCost],[dbo].[FactInternetSales].[TotalProductCost],[dbo].[FactInternetSales].[SalesAmount],[dbo].[FactInternetSales].[TaxAmt],[dbo].[FactInternetSales].[Freight],[dbo].[FactInternetSales].[CarrierTrackingNumber],[dbo].[FactInternetSales].[CustomerPONumber],CONVERT ( CHAR ( 10 ), SalesOrderNumber ) + 'Line ' + CONVERT ( CHAR ( 4 ), SalesOrderLineNumber ) AS [SalesOrderDesc],[dbo].[FactInternetSales].[OrderDate],[dbo].[FactInternetSales].[DueDate],[dbo].[FactInternetSales].[ShipDate],[dbo].[FactInternetSales].[id] FROM [dbo].[FactInternetSales] WHERE OrderDateKey > '20090101'
f. Deploy the changes to the cube and perform a Process Full to load the data into the cube
3. Now we are ready to perform the Incremental
Processing of the new partition that we created.
To test this, we add a new record into
AdventureWorksDW.dbo.FactInternetSales table using the following query. This would generate a row with a new id
INSERT INTO [dbo].[FactInternetSales] ([ProductKey] ,[OrderDateKey] ,[DueDateKey] ,[ShipDateKey] ,[CustomerKey] ,[PromotionKey] ,[CurrencyKey] ,[SalesTerritoryKey] ,[SalesOrderNumber] ,[SalesOrderLineNumber] ,[RevisionNumber] ,[OrderQuantity] ,[UnitPrice] ,[ExtendedAmount] ,[UnitPriceDiscountPct] ,[DiscountAmount] ,[ProductStandardCost] ,[TotalProductCost] ,[SalesAmount] ,[TaxAmt] ,[Freight] ,[CarrierTrackingNumber] ,[CustomerPONumber] ,[OrderDate] ,[DueDate] ,[ShipDate]) SELECT TOP 1 [ProductKey] ,[OrderDateKey] ,[DueDateKey] ,[ShipDateKey] ,[CustomerKey] ,[PromotionKey] ,[CurrencyKey] ,[SalesTerritoryKey] ,[SalesOrderNumber] ,[SalesOrderLineNumber] + 1 ,[RevisionNumber] ,[OrderQuantity] ,[UnitPrice] ,[ExtendedAmount] ,[UnitPriceDiscountPct] ,[DiscountAmount] ,[ProductStandardCost] ,[TotalProductCost] ,[SalesAmount] ,[TaxAmt] ,[Freight] ,[CarrierTrackingNumber] ,[CustomerPONumber] ,getdate() ,[DueDate] ,[ShipDate] FROM AdventureWorksDW2012.dbo.FactInternetSales ORDER BY Id desc
4. Add a linked server to Analysis Services Server from SQL Server that hosts the AdventureWorksDW2012 database.
Replace OLAPSERVERNAME with Analysis Services Server name and OLAPDBName with AdventureWorks OLAP database name for your instance
5. Connect to the AdventureWorks Cube from SSMS
a. Right click on the partition that we had created
and click on Process
b. Select Process Add from the process options and click on Configure
c. Specify the source as query and input the following query .
The key is in the where clause and the using the linked server created,
we are finding out the last [id] that was loaded into the Measure group to retrieve the delta records that have to be loaded
SELECT [dbo].[FactInternetSales].[ProductKey],[dbo].[FactInternetSales].[OrderDateKey],[dbo].[FactInternetSales].[DueDateKey],[dbo].[FactInternetSales].[ShipDateKey],[dbo].[FactInternetSales].[CustomerKey],[dbo].[FactInternetSales].[PromotionKey],[dbo].[FactInternetSales].[CurrencyKey],[dbo].[FactInternetSales].[SalesTerritoryKey],[dbo].[FactInternetSales].[SalesOrderNumber],[dbo].[FactInternetSales].[SalesOrderLineNumber],[dbo].[FactInternetSales].[RevisionNumber],[dbo].[FactInternetSales].[OrderQuantity],[dbo].[FactInternetSales].[UnitPrice],[dbo].[FactInternetSales].[ExtendedAmount],[dbo].[FactInternetSales].[UnitPriceDiscountPct],[dbo].[FactInternetSales].[DiscountAmount],[dbo].[FactInternetSales].[ProductStandardCost],[dbo].[FactInternetSales].[TotalProductCost],[dbo].[FactInternetSales].[SalesAmount],[dbo].[FactInternetSales].[TaxAmt],[dbo].[FactInternetSales].[Freight],[dbo].[FactInternetSales].[CarrierTrackingNumber],[dbo].[FactInternetSales].[CustomerPONumber],CONVERT ( CHAR ( 10 ), SalesOrderNumber ) + 'Line ' + CONVERT ( CHAR ( 4 ), SalesOrderLineNumber ) AS [SalesOrderDesc],[dbo].[FactInternetSales].[OrderDate],[dbo].[FactInternetSales].[DueDate],[dbo].[FactInternetSales].[ShipDate],[dbo].[FactInternetSales].[id] FROM [dbo].[FactInternetSales] WHERE id > (select * from openquery(AdvWorksOLAP, 'select [Measures].[InternetSalesId] on columns from [Adventure Works]'))
d. Click on OK to complete the incremental processing of the cube.
6. Browse the cube to make sure that the record with new id in Internet Sales has been loaded into the cube
This completes the process of incrementally processing the latest partition of Internet Sales Measure Group.
This can be automated further to be done using an XMLA script in a SQL Server Agent Job.
I am sure this is only one of the ways to incrementally load the Measure group and I would like to hear comments
from others on options that work for them for incremental cube processing.
Regards
Veerendra Thati
Like us on FaceBook | Follow us on Twitter
Join the fastest growing SQL Server group on FaceBook
Good one sir
A question though, what if new data comes in its related dimensions. Do we have to process all the partitions once again?
Is there a way to do a incremental process in such cases also?
Hi Sreekar,
The article is about bringing in the new data into the measure group. If new data comes into related dimensions, they have to be brought in using ProcessAdd/ProcessUpdate on the dimension. The other question that you may arise is what happens when the Fact comes before the dimension is processed. Those would be going into the (Unknown) bucket and would not be corrected until the next ProcessFull on the Partition. So its important to control when dimensions are loaded during the ETL in the warehouse to avoid such scenarios during Process Incremental of a measure group
Hope that addresses your question
Regards
Veera