
Thursday, September 10, 2015

CALCPARALLEL with ASO

If you perform aggregations on your ASO cubes and you have enough resources on your server to allocate more than one processor, then by all means change the CALCPARALLEL setting in your essbase.cfg file to a number higher than the default (2). Be sure to heed the recommendation to increase your ASO cache, since each thread you allocate to the aggregation requires its own share of the cache.
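
For reference, here's what that looks like in essbase.cfg. This is a minimal sketch of the server-wide form (the setting also accepts optional application and database names, and the Essbase server needs a restart to pick up the change); 16 is the thread count I use in the tests below:

; essbase.cfg sketch: allow 16 threads for calculations/aggregations
CALCPARALLEL 16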

If you were to go by the DBAG guideline for cache (32MB of cache for each 2GB of input data), you would need 64MB of cache (2 threads x 32MB) if you wanted to use the default number of threads when aggregating a 2GB ASO database.

Let's see how this translates to the real world.

I have an ASO cube with 9,999,840 KB of input-level data. That should require, well, let's do the math:

First figure out how many times larger my input data is than 2GB: 9,999,840 / 2,097,152 = 4.768...

The guideline scales the cache with the square root of that ratio, so take the square root of 4.768 = 2.183...

Now multiply that by 32MB: 2.183 x 32 = 69.876... MB

Hang on, if you read further down the page you see:

To improve the performance of building aggregates, take the following steps.
  1. Increase the size of the aggregate storage cache to at least 512 MB or 20% of the input data size, whichever is smaller.
All that math for nothing. So I should set it to 512MB or 1,999MB (20% of my input size), whichever is smaller. That's easy. 512MB < 1,999MB.

<scratching head> So do I need to allocate 512MB per thread, or is that in total? My guess is that it's in total.
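
Either way, the cache itself is set per application in MaxL. A minimal sketch using the 512MB guideline value (TEST is my test application; the new size only takes effect the next time the application starts, which is why I unload it):

alter application TEST set cache_size 512MB;   /* per the 512MB guideline */
alter system unload application TEST;          /* new size takes effect on next start */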

Once again, let's see how this translates to the real world.


Let's just test it out and compare performance across a range of cache sizes. My CALCPARALLEL is set to 16 in this case.

Here's the MaxL script I'll use:

alter database TEST.TEST clear aggregates;    /* drop any existing aggregate views */
alter application TEST set cache_size 32MB;   /* varied from 32MB to 2000MB per run */
alter system unload application TEST;         /* restart so the new cache size takes effect */
/* build aggregate views until total size reaches 1.1x the input data */
execute aggregate process on database TEST.TEST stopping when total_size exceeds 1.1;
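
To run it, I save the statements to a script and execute it with the MaxL shell (the file name here is hypothetical, and a login statement or command-line credentials are assumed):

essmsh agg_test.msh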


The Results

The table shows each cache size and the corresponding aggregation process time; 13 views were created in every run. As you can see on the graph, there is a diminishing return after about 164MB. That's well below the recommended 512MB-or-20% guideline.

Cache Size (MB)    Aggregation Process Time (s)
  32               753.78
  64               383.53
  70               334.61
 100               283.98
 128               257.95
 164               241.38
 200               236.79
 256               233.31
 512               234.80
1024               235.01
2000               228.21
Graph: Time in seconds is on the y-axis, cache size in MB is on the x-axis.

Tuesday, September 8, 2015

ASO Restructure Performance Part 2

As I showed in Part 1, you don't have a lot of control over the speed of ASO restructures. As with BSO, the more data you have in the cube, the longer the restructure will take. One tip for faster ASO restructures is to clear the aggregate data first. Once the restructure is complete you'll need to aggregate the data again, but you can do this while the cube is live if you need to. Keeping this in mind, I thought I'd compare the performance of a few restructure options:

1.  With Data and Aggregations.
2.  With Input Data Only.
3.  Export, Restructure, Load.
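
For option 2, the clear step is a single MaxL statement; a sketch against my test database (the views get rebuilt afterwards with the aggregate command shown below):

alter database ASOsamp.Sample clear aggregates;   /* keeps input data, drops aggregate views */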


For my test model I'm using ASOsamp.Sample with 550MB of input data, and when I aggregate I'm using the command below. This gives me aggregate views approximately 50% of the size of the input data. To trigger the restructure, I add a member to the Measures dimension.

execute aggregate process on database ASOsamp.Sample stopping when total_size exceeds 1.5;

Now the results (all times in seconds):

1. With input data and aggregations.

Restructure Time: 51.85s
Total Elapsed Time: 51.85s

2. With input data only.

Clear Aggregate Data Time: ~1s
Restructure Time: 26.12s
Create Aggregate Views Time: 48.71s
Total Elapsed Time: 75.83s

3. Export/Restructure/Load

Export Data Time: 43s
Clear Data Time: ~1s
Restructure Time: 1.35s
Load Data Time: 13.41s + 14.88s
Create Aggregate Views Time: 52.29s
Total Elapsed Time: 125.93s
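
For completeness, here's a minimal MaxL sketch of option 3. The export file name and buffer ID are hypothetical, and ASO loads are a two-step affair (stage into a load buffer, then commit), which may be why two load times appear above:

export database ASOsamp.Sample data to data_file 'exp.txt';   /* export input-level data */
alter database ASOsamp.Sample reset;                          /* clear the database */
/* ...add the new Measures member and let the now-empty cube restructure... */
alter database ASOsamp.Sample initialize load_buffer with buffer_id 1;
import database ASOsamp.Sample data from data_file 'exp.txt'
      to load_buffer with buffer_id 1 on error abort;         /* stage the data */
import database ASOsamp.Sample data from load_buffer with buffer_id 1;   /* commit */
execute aggregate process on database ASOsamp.Sample stopping when total_size exceeds 1.5;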

I do find it interesting that having aggregate views at 50% of the input data size causes a nearly 100% increase in restructure time (26.12s vs. 51.85s). I think I need to do some more experimentation on this aspect of restructures.

In this example I wasn't able to gain better performance by restructuring with less data. I did run a few tests with parallel loads, calcs, and exports turned on. The results were slightly better (method 2 clocked in at about 66s), but the restructure with data and aggregations still won the day. I suspect that with a larger input set and a very powerful machine, method 2 could beat method 1.
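
For reference, the usual way to parallelize an ASO data load is to fill multiple load buffers from concurrent MaxL sessions and then commit them in one step. A sketch of that pattern, with hypothetical file names and buffer IDs:

/* session 1 */
alter database ASOsamp.Sample initialize load_buffer with buffer_id 1;
import database ASOsamp.Sample data from data_file 'exp1.txt'
      to load_buffer with buffer_id 1 on error abort;

/* session 2, running concurrently */
alter database ASOsamp.Sample initialize load_buffer with buffer_id 2;
import database ASOsamp.Sample data from data_file 'exp2.txt'
      to load_buffer with buffer_id 2 on error abort;

/* from one session, commit both buffers in a single step */
import database ASOsamp.Sample data from load_buffer_array with buffer_id 1, 2;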