Tuesday, September 8, 2015

ASO Restructure Performance Part 2

As I showed in Part 1, you don't have a lot of control in speeding up ASO restructures. As with BSO, the more data you have in the cube, the longer the restructure will take. One tip for faster ASO restructures is to clear aggregate data. Once the restructure is complete you'll need to aggregate the data again but you can do this while the cube is live if you need to. Keeping this in mind, I thought I'd compare the performance for a few restructure options.

1.  With Data and Aggregations.
2.  With Input Data Only.
3.  Export, Restructure, Load.


For my test model I'm using ASOsamp.Sample with 550MB of input data and when I aggregate I'm using the below command. This will give me aggregate views approximately 50% the size of the input data. To cause the restructure, I will add a member to the Measures dimension.

Execute aggregate process on database ASOsamp.Sample stopping when total_size exceeds 1.5;

Now the results (all times in seconds):

1. With input data and aggregations.

Restructure Time: 51.85s
Total Elapsed Time: 51.85s

2. With input data only.

Clear Aggregate Data Time: ~1s
Restructure Time: 26.12s
Create Aggregate Views Time: 48.71s
Total Elapsed Time: 75.83s

3. Export/Restructure/Load

Export Data Time: 43s
Clear Data Time: ~1s
Restructure Time: 1.35s
Load Data Time: 13.41s + 14.88s
Create Aggregate Views Time: 52.29s
Total Elapsed Time: 125.93s

I do find it interesting that having aggregated views that are 50% the size of the input data causes a nearly 100% increase in restructure time (26.12s vs. 51.85s). I think I need to do some more experimentation on this aspect of restructures.

In this example I wasn't able to gain better performance by restructuring with less data. I did run a few tests turning on parallel loads, calcs and exports. The results were slightly better (method 2 clocked in at about 66s) but the restructure with data and aggregations still won the day. I think there is reason to believe that with some larger input sets and a very powerful machine that method 2 could beat method 1.

No comments: