Wednesday, September 30, 2015

Comparison of a Stored and Dynamic ASO Hierarchy

I've heard from many people on many occasions that when designing an ASO database you should make as many dimensions Stored as possible. Just how much will this improve performance?

There are many different ways this can be tested. I will start out with a simple test case to gauge retrieval performance. I have four scenarios in mind for use with the ASOsamp database:

1. Store dimension set to Stored. No aggregations.
2. Store dimension set to Dynamic. No aggregations.
3. Store dimension set to Stored. Recommended aggregations.
4. Store dimension set to Dynamic. Recommended aggregations.

I will run two different queries to measure retrieval performance. Query 1 will be the default retrieve. Query 2 will zoom to all levels of the Store dimension with the following grid (not all twelve months are shown, to keep the grid a manageable size).


The Results


The Analysis

For Query 1 with no aggregations, the Stored dimension is faster, but for Query 2 the Dynamic dimension is faster. With aggregations, Query 1 performs the same whether the dimension is Stored or Dynamic, while Query 2 is slightly faster with the Dynamic dimension than with Stored.

I have no idea why the Dynamic dimension is faster for Query 2. Any reason I could come up with would be purely speculative. Is it the size of the input data, the distribution of the data, the size of the dimension, the full moon? I have no clue. I did rerun these tests many times so it wasn't just a fluke -- the results were consistent. I'm going to have to explore this further with more testing. Any insights or ideas on further test cases on this would be appreciated.

Monday, September 28, 2015

Measuring NONEMPTYMEMBER/TUPLE in MDX Part 2

Welcome to MDX Monday. I'll pick up where I left off in Measuring NONEMPTYMEMBER/TUPLE in MDX Part 1 last Monday. I ended that part with the desire to test out the same non-empty directives within member formulas. In this test I will use the calculation portion of the query in Part 1 and place it into a member formula. Then I'll try using NONEMPTYTUPLE and NONEMPTYMEMBER in order to see if there is a difference in performance in the ASOsamp database.

The Member Formula

NONEMPTYTUPLE ([Units], [MTD])
  Sum(
       {
        ClosingPeriod(Time.Generations(5), Time.CurrentMember),
        Time.CurrentMember.Lag(1),
        Time.CurrentMember.Lag(2)
        }, 
        Units
   )

The Tests

In this test I will duplicate the data pull that I used in Part 1 by zooming to the bottom on the Store Manager and Stores dimensions with the month of March selected. These are the four tests. Each retrieval will be identical.

1. Run the example as is, including the NONEMPTYTUPLE statement.
2. Remove the NONEMPTYTUPLE statement.
3. Change the NONEMPTYTUPLE ([Units],[MTD]) to NONEMPTYMEMBER [Units]
4. Change the NONEMPTYTUPLE ([Units],[MTD]) to NONEMPTYMEMBER [MTD]

The query time results are below. Each returned the exact same set.

The Results

1. 0.39s
2. 9.72s
3. 0.29s
4. 9.7s


Note: For Test 4, I got the same warning in the log for the 3 Month Units member as I did during the MDX query execution in Part 1.

Warning: Non empty member directive [MTD] for member [3 Month Units] will be ignored since it is from a different dimension.

Analysis and Conclusions and Another Test

These results were almost exactly in line with the results in Part 1. Each was 0.07s to 0.08s slower than the corresponding test in Part 1, a difference probably attributable to the overhead and network latency associated with Smart View. The non-empty directive can help ASO member formulas. I wondered if it had something to do with the fact that an attribute dimension (Store Manager) was used in the query, so I removed that and reran the test.

1. 0.34s
2. 0.39s
3. 0.24s
4. 0.38s

It looks like the attribute dimension was hurting performance when the non-empty directive wasn't used. Without the attribute dimension the directive still makes a difference, but a very small one. The bottom line is that you should at least try NONEMPTYMEMBER and NONEMPTYTUPLE when the application log suggests them.

Friday, September 25, 2015

Sweet Home Chicago


I'd like to take this laid back Friday to talk about Chicago and Kscope. I attended my first Kscope back in June. I hadn't been to a conference in many years (the last conference I spoke at was in 2000 -- discussing the upgrade to Essbase 6!). It was time to check out this Kscope everyone was raving about. I was still skeptical but I submitted an abstract and it was accepted. Little did I know what was in store for me. This was unlike any conference I had been to. There was an emphasis on actual technical content. I talked with people who have been living and breathing Essbase since it ran on OS/2. It reinvigorated me and rekindled my passion for this technology. I saw all the cool things people were doing and it made me want to do better. It made me want to explore new things like HSO.

If you're reading this blog, you're probably using Oracle EPM in some way. You have to go to Kscope16. I want to meet you and hear about your implementations -- about how you overcame some seemingly insurmountable obstacle. I want to hear your presentations. So get going, write up a killer abstract and submit it. The deadline is October 15th.

In case you need some more motivation, I can personally attest to the fact that late June in Chicago is the best time all year to visit. If you've never been here, you have to come. Chicago weather in June is just about perfect. Chicago is filled with world class parks, museums, architecture and restaurants. So get that abstract in next week and I'll see you in June.

Thursday, September 24, 2015

Displaying Timestamps in Essbase

One common requirement in this age of SOX is to show users how current the data is in a cube. This is especially important when data is being updated during the business day. There are many reasons the users would want to know that data has been updated. I've used a few techniques over the years to do this.

Days Gone By

I used to like to set the timestamp in the database note, which was easily accessible via the Excel Add-In. That's not an option now that Smart View has taken over the world. It also wouldn't allow you to get the timestamp printed on your report. UPDATE: The database note is available beginning with the 11.1.2.2 release of Smart View.

Member Alias

This is probably the simplest method but doesn't help you if you're updating data during the day. You can simply add a member called "Timestamp" and update its alias when you rebuild your outline. You can either do this via a load rule or using the alter database command with load alias_table options (it's not listed under the ASO alter database specification but I haven't had any issues using it with ASO).
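As a sketch, the scripted version might look like the statement below. It assumes an alias table source file (timestamp.alt) containing the Timestamp member and its new alias in the exported .alt format; the alias table name and file path are placeholders, so check the load alias_table grammar in the MaxL reference for your release.

alter database ASOsamp.Sample load alias_table 'Default' from data_file '/tmp/timestamp.alt';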



Substitution Variable

The idea here is to set a Substitution Variable to the decimal value that Excel uses to store time. If you use the NOW() function in Excel and format it to a decimal value you'll see a number somewhere north of 42,000. As of September 24th, 2015 it is 42,272 and a remainder. This tells us that it has been 42,272 days since January 1, 1900. The decimal portion shows how far along in the day we are. For example, 42,272.5 corresponds to September 24th, 2015 at 12:00 Noon. We can create this number using the shell if we're on Linux with the following line:

echo "scale=10;(`date +%s`/86400)+25569" |bc

The code within the quotes gets piped to a program called "bc" which is a Unix calculator. The echo command prints the command to standard output. The scale=10 command sets the output to 10 decimal places. The `date +%s` gives us the number of seconds since January 1, 1970 (the Epoch). Dividing it by 86,400 (the number of seconds in a day) gives us the number of days since the Epoch. 25,569 is the number of days between January 1st, 1900 (Excel time) and January 1st, 1970 (the Epoch). All that together gives us the number we need in Excel. You can use that to send as a variable to a MaxL statement to set a substitution variable.
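Putting that together, here's a minimal sketch of pushing the value into a substitution variable via essmsh. The credentials, host, database, and variable name (_Timestamp) are placeholders, and it assumes the variable already exists:

#!/bin/sh
# build the Excel-style serial date (days since 1/1/1900 plus a decimal time portion)
TS=$(echo "scale=10;($(date +%s)/86400)+25569" | bc)

# hand it to MaxL to update the existing substitution variable
essmsh <<EOF
login admin 'password' on localhost;
alter database ASOsamp.Sample set variable '_Timestamp' '$TS';
logout;
EOF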



You can change your Timestamp member formula to reference the substitution variable (&_Timestamp in ASO for example). That way when someone wants to include the value, they add the Timestamp member to their sheet, perform a retrieve and then format the number using Excel date/time formatting as they desire. The downside here is that the timestamp generated by the formula will only be updated when the Outline is restructured or when the database is stopped and started. This is fine for databases only updated during batch. However, it will not satisfy any timestamp requirement on a database that is updated during non-batch hours (assuming you're not restructuring or unloading the database during business hours).

You still might be able to use the substitution variable on cubes updated on an intra-day basis. There is a function, HsGetVariable, in Smart View that will grab a Substitution Variable. Users can put this function on their sheet. When the sheet is refreshed, the value will be updated and they can format it as they wish.

Load a Time Value

The last way I've updated the timestamp is by loading the value obtained in the prior section directly into an intersection in the cube. You will need to load it at a specific intersection, for example: Timestamp, No_Product, No_Store, Jan, Curr_Year, etc. This will be difficult for a user to find, so I'd recommend loading the data into a member, say, Timestamp_Load and then referencing that intersection in a formula for the Timestamp member such that the value shows up no matter what other members intersect with Timestamp. Use a cross-dim operator in a BSO member formula or tuple notation in an ASO member formula, as sketched below.
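The member names here are illustrative, but assuming the Timestamp_Load intersection described above, the Timestamp member formula might look like this in each storage type:

/* BSO member formula for Timestamp -- cross-dim reference to the loaded cell */
"Timestamp_Load"->"No_Product"->"No_Store"->"Jan"->"Curr_Year";

/* ASO (MDX) member formula for Timestamp -- tuple reference to the same cell */
([Timestamp_Load],[No_Product],[No_Store],[Jan],[Curr_Year])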


Tuesday, September 22, 2015

Using Ramdisk with ASO

I wanted to see what Essbase ASO performance would be like if not bound by disk. So I thought I'd start out by creating a ramdisk and storing my ess*dat files on it. First of all, ramdisk is just a file system that you create using RAM rather than a hard drive. So it's fast -- turbo-chicken-fast. If you understand what RAM is then you know that when the power is turned off on a machine you lose everything in RAM. Luckily machines and power and such are much more reliable these days so it's not a huge deal to use. I wouldn't, however, recommend going and creating a whole bunch of ramdisk in production and putting all of your cubes on it. You've got to have a good design for it and I'm not in that line of business so I won't get into all of that here.

The Disclaimer

I ran a few very basic tests using ASOsamp. I understand that not all hardware is created equal so my results might not match your results. This test serves as one point of data that will hopefully further our knowledge of how Essbase behaves in different environments. This test was run on 11.1.2.4.002.

The Digression

I run Essbase on Linux. It's very easy to create a ramdisk on Linux. I used the following command:

mount -t tmpfs -o size=1024m tmpfs /mnt/ramdisk

In order to have the database use the ramdisk, I changed the ASOsamp.Sample storage settings in EAS to point the default and temp tablespaces over to the /mnt/ramdisk directory.
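For completeness, a slightly fuller sketch of the ramdisk setup (it assumes root access and that the mount point doesn't already exist); add a matching tmpfs line to /etc/fstab if you want the mount recreated at boot:

mkdir -p /mnt/ramdisk
mount -t tmpfs -o size=1024m tmpfs /mnt/ramdisk
df -h /mnt/ramdisk    # confirm the mount point and its size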

The Test

My test data was generated using the filldb command in ESSCMDQ (described in my September 3rd post). The data I loaded filled the ess00001.dat file to 285MB. I captured the load times. I then ran queries in Excel using the default retrieve, zoom-in on products to bottom level, pivot products to columns, and zoom-in on stores to bottom level. I then created aggregate views based upon the Essbase recommendations. I then ran the same queries again. Note: I did not use a compression dimension in the outline.

The Results


The Analysis

These results make a lot of sense to me and illustrate just how cool Aggregate Views are. Here are my thoughts on some of the results:

The Data Load isn't much different because Data Load seems to involve a high use of CPU. Anywhere that Essbase is CPU bound, the ramdisk isn't going to help very much. I assume that when Essbase is writing the data, it's also doing a lot of checking to see that members have the correct names and exist in the outline and such.

The buffer commit is much faster under ramdisk because this seems to be a heavy disk use operation. The data is initially written to the /temp directory and then moved over to the /default directory. There probably isn't a lot of data validation going on during this step.

The queries without aggregations are virtually identical. When I watch server utilization during a long running Essbase query it's almost always CPU bound -- the ESSSVR process pegs a CPU at 100% most times. So this makes a lot of sense to me. Having the data file in memory doesn't help Essbase much since it's still got to do a lot of calculating.

The default agg is quicker under ramdisk as it must rely fairly heavily upon reads and writes. Again, it needs to write the Aggregate Views to the /temp directory and then move them over to the /default directory.

The last four queries are where the magic happens. Our query times with the views created effectively drop to instantaneous -- and you don't need super fast disk for this to happen. Aggregate Views are so cool! It only takes 12.55 seconds to create 13 views but that actually saves over 24 seconds on the Store dimension zoom-in. This shows how powerful Aggregate Views can be. We will really need the ability to create these in Hybrid as soon as possible.

The Conclusions

Super fast disk is great to have but it really only makes sense when you have disk bound operations. Costs and benefits need to be weighed as in most design and purchase decisions. The results might be different as the ASO size scales up so I'd encourage others to try to replicate this test with a larger data set. I need to get more RAM on my home server before I can do that.

The Question of the Day

Smart View or Excel Add-In?

Monday, September 21, 2015

Measuring NONEMPTYMEMBER/TUPLE in MDX Part 1

Welcome to MDX Monday. There is a page in the Essbase Tech Ref entitled "MDX Optimization Properties". It describes the use of NONEMPTYMEMBER and NONEMPTYTUPLE in MDX. Supposedly these keywords will speed up MDX queries and member formulas (in ASO) by eliminating the inclusion of missing sets from the formula execution. There is anecdotal evidence that this does speed things up. It seems like it's very difficult to pin down when it does help things. I'm on a mission to figure this out so if you've got examples of it helping, please leave a comment letting me know about it.

I thought I'd start out with the ASOsamp example given in the Tech Ref. I tested out four scenarios.
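For context, the query being tested looks roughly like the sketch below. The calculated member is the same 3 Month Units formula discussed in Part 2; the SELECT portion is an illustrative reconstruction of the Store Manager / level-0 Stores pull described in these posts, not the verbatim Tech Ref example.

WITH MEMBER [Measures].[3 Month Units] AS
  'NONEMPTYTUPLE ([Units], [MTD])
   Sum(
        {
         ClosingPeriod(Time.Generations(5), Time.CurrentMember),
         Time.CurrentMember.Lag(1),
         Time.CurrentMember.Lag(2)
        },
        Units
   )'
SELECT
  {[Measures].[3 Month Units]} ON COLUMNS,
  NON EMPTY CrossJoin([Store Manager].Children,
            Descendants([Stores], Levels([Stores], 0))) ON ROWS
FROM ASOsamp.Sample
WHERE ([Mar]);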

The Tests

1. Run the example as is, including the NONEMPTYTUPLE statement.
2. Remove the NONEMPTYTUPLE statement.
3. Change the NONEMPTYTUPLE ([Units],[MTD]) to NONEMPTYMEMBER [Units]
4. Change the NONEMPTYTUPLE ([Units],[MTD]) to NONEMPTYMEMBER [MTD]

The query time results are below. Each returned the exact same set.

The Results

1. 0.32s
2. 9.64s
3. 0.21s
4. 9.63s

I got the following warning message for the 4th test. This makes sense because the Sum() function is using [Units] as the value it is summing up.
Warning: Non empty member directive [MTD] for member [3 Month Units] will be ignored since it is from a different dimension.

So the "non-empty directive", as the Essbase application log puts it, does improve performance on this relatively complex MDX statement. It is interesting that NONEMPTYMEMBER is slightly faster (I tested it many times to make sure it wasn't an anomaly) than NONEMPTYTUPLE. I think next I'll try testing this same example in a member formula and see what the differences are.

Question of the day: What was the first version of Essbase you used?



Friday, September 18, 2015

Zen and the Art of Essbase Maintenance

Welcome to another laid back Friday. I’m eager to get the weekend started so let me get right into my ramblings for today.

My son began his formal education at what is known as a Glasser Quality School. I found it very interesting to have discussions about quality with a five year old child. Is this handwriting quality work? Was that a quality sentence you just read? It felt very strange to me considering the fact that I never had such a discussion using the word “quality” until I was in college. As he boarded the school bus in the morning I would yell to him, “do quality work today”. I still say it to both my children on a regular basis. They roll their eyes.

This idea of quality has kept many people up at night, it has even driven one person to have a nervous breakdown. His name is Robert Pirsig. He talks about his descent into mental illness in Zen and the Art of Motorcycle Maintenance. In the book he discusses the idea of quality at great length. The passage that races through my mind again and again even years after I’ve read the book is the one below. Pirsig is looking for a mechanic on his long drive…

Inside I see that Bill is a mechanic of the "photographic mind" school. Everything lying around everywhere. Wrenches, screwdrivers, old parts, old motorcycles, new parts, new motorcycles, sales literature, inner tubes, all scattered so thickly and clutteredly you can’t even see the workbenches under them. I couldn’t work in conditions like this but that’s just because I’m not a photographic-mind mechanic. Bill can probably turn around and put his hand on any tool in this mess without having to think about where it is. I’ve seen mechanics like that. Drive you crazy to watch them, but they get the job done just as well and sometimes faster. Move one tool three inches to the left though, and he’ll have to spend days looking for it.

Am I a developer of the “photographic mind”? Are my scripts scattered so thickly and clutteredly you can’t tell what they’re supposed to do? What if I get hit by a bus – what is the next developer going to be left to clean up? If I’m a consultant, leaving such a mess behind would be shameful. My fiduciary duty to my client is to do quality work. Am I one of those “genius consultants” who designs a system so complex that maintenance is only able to be performed by the one who built it?

In graduate school I was fortunate to be able to take an operations management class. We dissected, in great detail, the case of the Toyota Production System (TPS). There are several principles in this system but the one that has stuck with me over the years is kaizen. Roughly translated from Japanese the word means “improvement”. Within TPS it goes deeper than just improvement. Rather it means “continuous improvement”. You’re never done. There’s always something you can improve upon.

Maintenance is sometimes seen as a boring job. Many people want to work on projects on the cutting edge of technology. Those are exciting but if you don’t properly maintain the existing production systems their performance will eventually degrade and people will start speaking ill of the system you “designed poorly”.

Within the field of Essbase development, there are so many variables. Feeds come in from other systems with new hierarchies and more data than we might have planned on storing. Things can change rapidly during the course of a year. Leaving a cube alone for very long is usually a recipe for failure. We want to know about issues before we have an irate business sponsor on the phone with us. For us, maintenance is required.

I’ll leave you today with another tidbit I picked up in that operations management class: The Eight Dimensions of Quality as written by David Garvin. This puts quality into a more concrete form than Pirsig did. Keep these dimensions in mind as you work. They’re applicable to anything really, just as the discussion of quality is at the heart of anything.

1. Performance
2. Features
3. Reliability
4. Conformance
5. Durability
6. Serviceability
7. Aesthetics
8. Perceived Quality

Now go do quality work today.

Thursday, September 17, 2015

Measuring Performance with ASO Fragmentation

It's true that ASO fragmentation takes place when performing physical partial clears of data. For example, I loaded some data into ASOSamp and the ess00001.dat file for the database is 562,036,736 bytes. If I clear out Prev Year data then the ess00001.dat file increases to 847,249,408 bytes. A restructure of the outline shrinks it back down to 285,212,672 bytes. Note: Only certain restructures will clear up the fragmentation issue. For example, adding a member to a dynamic dimension will but adding a member to a stored dimension will not. Click here to see the various levels of ASO outline restructuring.
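For reference, a partial clear like the Prev Year one above looks roughly like this in MaxL (a sketch -- the region argument is just an MDX set expression, and the physical keyword is what actually removes the cells rather than writing offsetting values):

alter database ASOsamp.Sample clear data in region '{[Prev Year]}' physical;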

What I'd like to do today is measure the retrieval performance associated with fragmentation. I will use the data I mentioned above for four tests. The elapsed time is recorded next to each scenario in parentheses.

Test 1: Run simple query with no aggregate views. Record elapsed time for each scenario.

Scenario 1: Load just Curr Year data. (0.68s)
Scenario 2: Load data in Curr Year and Prev Year but then clear out Prev Year. (0.68s)

Test 2: Run zoom-in query with no aggregate views. Record elapsed time for each scenario.

Scenario 1: Load just Curr Year data. (3.548s)
Scenario 2: Load data in Curr Year and Prev Year but then clear out Prev Year. (3.56s)

Test 3: Create 50% aggregate views. Record elapsed time for each scenario.

Scenario 1: Load just Curr Year data. (33.71s)
Scenario 2: Load data in Curr Year and Prev Year but then clear out Prev Year. (32.13s)

Test 4: Export data.

Scenario 1: Load just Curr Year data. (19.632s)
Scenario 2: Load data in Curr Year and Prev Year but then clear out Prev Year. (19.714s)

Summary

Well that's not very exciting. But it is good news -- fragmentation doesn't seem to impede performance. I feel like I need to run these tests again with much larger ASO cubes to see what happens. Perhaps with a 20 or 50 GB cube it might make a difference.

Epilogue

The most interesting result is that in Test 3, 50 views were created which increased the ess00001.dat file to 411,041,792 bytes in Scenario 1, but in Scenario 2 the ess00001.dat file does not increase past 847,249,408 bytes. It seems to be using the empty space in the file to store the aggregate views. I don't think it's any coincidence that the size of the file increases by exactly the size of the Curr Year data when the Prev Year data is cleared. So I'm not sure I'd call ASO fragmentation a true fragmentation. Fragmentation implies that there are many empty spaces scattered all over the files, and it seems like data is still contiguous in the ess*.dat file. It would be interesting to hear from an Oracle developer how the clear command impacts the files on disk.

Tuesday, September 15, 2015

ASO Procedural Calcs -- Handling Shared Members

ASO allocations and procedural calculations do not allow you to include any dynamic members in your POV, and shared members cause problems as well. If you include a shared member you get the following warning:

WARNING - 1300005 - The argument [POV] ignores the duplicate member [100-20].

If the prototype of the shared member is not level 0, it will fail your calculation with the following error:

ERROR - 1300033 - Upper-level members, for example [East], are not allowed in argument [POV]. Select a level-0 member. 

In order to remove any shared members from your POV, you can sometimes use the Filter() function to remove them as illustrated below. I say sometimes because this works in some versions and not in others. I currently have an open SR with Oracle that they are investigating. Hopefully it will be resolved soon. What would be even better is if there were calculation options allowing you to ignore things like dynamic members.

In this example, I'm taking all level 0 members under the Market dimension. But this might include some upper level members that are shared in alternate hierarchies. One strategy would be to only take the descendants of the original hierarchy. Another would be to filter out all shared members as shown here.

 {Filter(Descendants([Market],Levels([Market],0)), NOT Market.CurrentMember.Shared_Flag)}  

Monday, September 14, 2015

MDX Monday -- Order()

Happy MDX Monday. I thought I'd take a few minutes to improve upon the documentation for the Order() function today.

The first issue I see with the documentation is that it's sorting on something not shown in the data set. It makes it very difficult to see if the function is working correctly. The document states that the "display of products is sorted based on ascending Actual Sales in Qtr1". Let's check if that's true by adding Actual Sales for Qtr1 to the query. I will do this by adding another CrossJoin to the column selection to add the Qtr1 and Qtr2 members. While I do that I need to drop the where clause which filtered on Qtr2. Finally, I will add Actual to the Scenario set.

SELECT
CrossJoin({[Year].[Qtr1],[Year].[Qtr2]},
  CrossJoin(
      {[Scenario].[Actual],[Scenario].[Budget]},
      {[Measures].[Marketing], [Measures].[Sales]}
    )
  )
ON COLUMNS,
Order(
    [Product].Levels(0).Members,
    ([Year].[Qtr1], [Scenario].[Actual])
   )
ON ROWS
FROM Sample.Basic;

This gives me the following output:




This is clearly not sorted on Qtr1 Actual Sales. My hunch is that they've left Sales out of the tuple in the second argument in the Order() function. If I add that in I have:
SELECT
CrossJoin({[Year].[Qtr1],[Year].[Qtr2]},
  CrossJoin(
      {[Scenario].[Actual],[Scenario].[Budget]},
      {[Measures].[Marketing], [Measures].[Sales]}
    )
  )
ON COLUMNS,
Order(
    [Product].Levels(0).Members,
    ([Year].[Qtr1], [Scenario].[Actual], [Measures].[Sales])
  )
ON ROWS
FROM Sample.Basic;

This gives me the following output:


Now we're getting somewhere. I'm guessing that originally the query was just using the member Measures to do the sorting. Let's try it:
SELECT
CrossJoin({[Year].[Qtr1],[Year].[Qtr2]},
  CrossJoin(
      {[Scenario].[Actual],[Scenario].[Budget]},
      {[Measures].[Marketing], [Measures].[Sales], [Measures]}
    )
  )
ON COLUMNS,
Order(
    [Product].Levels(0).Members,
    ([Year].[Qtr1], [Scenario].[Actual])
  )
ON ROWS
FROM Sample.Basic;


This gives me the following output:

It looks like my hypothesis was correct.

Now that we're talking about Order(), let me give you one bonus tip. You can sort a report based upon member names or aliases. You just need to use the CurrentMember function and the desired property.

This query will order my report based upon Product Member names.
SELECT
  {[Measures].[Sales]}
ON COLUMNS,
Order(
    [Product].Levels(0).Members,
    [Product].CurrentMember.[MEMBER_NAME]
  )
ON ROWS
FROM Sample.Basic
WHERE ([Year].[Jan],[Scenario].[Actual]);

The output:



This query will sort based upon Product Aliases.

SELECT
  {[Measures].[Sales]}
ON COLUMNS,
Order(
    [Product].Levels(0).Members,
    [Product].CurrentMember.[MEMBER_ALIAS]
  )
ON ROWS
FROM Sample.Basic
WHERE ([Year].[Jan],[Scenario].[Actual]);

The output:


You can also order based upon other properties such as level and generation number. Happy sorting!

Friday, September 11, 2015

Sharpen the Saw

Welcome to laid back Friday on the heels of a holiday shortened work week here in the US. Summer has come to a screeching halt in the Chicago area. Kids are back in school so I thought I'd write a bit about education today.

If you happen to be one of those EPM developers who is uncomfortable with Unix/Linux I'd like to encourage you to take the free codecademy course on using the command line. This is a great opportunity to improve your skill set in just a few hours. If you have a job opportunity in a Linux shop, this will certainly give you an advantage over someone with only Windows experience. The use of Linux in corporate environments continues to rise with no sign of slowing down. I'd also recommend learning the vi editor so you can easily edit files. And if you want to be a Linux ninja, learn Python or Perl. Happy training!

Thursday, September 10, 2015

CALCPARALLEL with ASO

If you perform aggregations on your ASO cubes and you have enough resources on your server to allocate more than one processor then, by all means, change the CALCPARALLEL setting in your essbase.cfg file to a number higher than the default (2). Be sure to heed the recommendation to increase your ASO cache as each thread you allocate to the aggregation requires its own amount of cache.
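For reference, a sketch of what the setting might look like in essbase.cfg -- the thread counts are just examples, and the optional application/database scoping on the second line is worth double-checking against the Tech Ref entry for CALCPARALLEL:

CALCPARALLEL 8
CALCPARALLEL ASOsamp Sample 16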

If you were to go by the DBA Guidelines for cache (32MB of cache for each 2GB of input data) you would need 64MB of cache if you wanted to use the default number of threads when aggregating a 2GB ASO database.

Let's see how this translates to the real world.

I have an ASO cube with 9,999,840 KB of input level data. That should require, well, let's do the math:

First figure out how many times larger my input data is than 2GB: 9,999,840/2,097,152 = 4.768...

Now just take the square root of 4.768 = 2.183...

Now multiply that by 32MB = 69.876...

Hang on, if you read further down the page you see:

To improve the performance of building aggregates, take the following steps.
  1. Increase the size of the aggregate storage cache to at least 512 MB or 20% of the input data size, whichever is smaller.
All that math for nothing. So I should set it to 512MB or 1,999MB (20% of my input size), whichever is smaller. That's easy. 512MB < 1,999MB.

<scratching head>So do I need to allocate 512MB per thread or is that in total? My guess is that's in total.

Once again, let's see how this translates to the real world.


Let's just test it out and see the performance based upon a bunch of different settings. My CALCPARALLEL is set to 16 in this case.

Here's the MaxL script I'll use:

alter database TEST.TEST clear aggregates;
alter application TEST set cache_size 32MB;
alter system unload application TEST;
execute aggregate process on database TEST.TEST stopping when total_size exceeds 1.1;


The Results

The table shows the cache size and respective aggregation process time. Each time 13 views were created. As you can see on the graph, there is a diminishing return after about 164 MB. That's well below the recommended 20% or 512MB guideline.

Cache Size (MB)    Aggregation Process Time (s)
32                 753.78
64                 383.53
70                 334.61
100                283.98
128                257.95
164                241.38
200                236.79
256                233.31
512                234.8
1024               235.01
2000               228.21
Graph: Time in seconds is on the y-axis, cache size in MB is on the x-axis.

Wednesday, September 9, 2015

ASO Allocations -- Performing a Simple Allocation

If you need to translate BSO calc scripts over to ASO procedural calcs/allocations, the Oracle documentation isn't a lot of help. There is no one-for-one translation table, and the Execute Allocation command has so many options that it's difficult to know where to begin. It takes a while to read through and understand. If you've struggled through it before, hopefully this post will help. I'm going to take a simple BSO allocation and translate it for you, explaining things along the way.

Let's assume we have an outline that looks like the one below. This is based off of Sample.Basic, with a few minor adjustments.

Assume we had an allocation from our old Sample Basic cube which was taking last year's Marketing spend and spreading it to this year's Marketing budget based upon last year's sales by market. You can see how the data looks below. In this case the $843 gets spread to the various markets in direct proportion to each market's sales.


The Old Way

The BSO calc script to do this could use the @ALLOCATE function as below.
FIX("Marketing")
   "Budget" = @ALLOCATE("PY Actual", @LEVMBRS("Market",0), "PY Actual"->"Sales",, share);
ENDFIX

Translating to ASO

The most basic difference between a BSO Calc Script and an ASO Allocation is that the former is a file that can be executed in a number of ways: via MaxL, through Smart View, through EAS, etc. and the latter is a MaxL statement. This is the process for creating my MaxL statement.

The first thing I need to do is to start my execute statement specifying the database name.
1:  Execute Allocation Process on Database ASample.Basic with  

Next I need to define my POV. I use the CrossJoin function to define the Range the allocation will be run on. In this case all Level 0 products and all 12 months.
2:  POV "CrossJoin([Jan]:[Dec],  
3:  {Descendants([Total Products],Levels([Product],0))})"  

Now I need to set the AMOUNT I am using in the allocation. In this case it is the Marketing dollars from last year for all Markets. That can be represented using a tuple.
4:  AMOUNT "([Market],[PY Actual],[Marketing])"  

Then I will set the BASIS by which the data will be spread. In this case the PY Actuals, Sales tuple.
5:  BASIS "([PY Actual],[Sales])"  

I will define my TARGET with a tuple pulling data from the intersection of the Budget Scenario and Marketing Measure.
6:  TARGET "([Budget],[Marketing])"  

Next I will define my RANGE as a set made up of the Level 0 members under Total Markets. This is the group of members to which the data will be allocated.
7:  RANGE "{Descendants([Total Markets],Levels([Market],0))}"

Finally I will specify the Allocation method. In this case I will use SHARE. You can use SPREAD if you want to allocate the data values evenly. If you do use SPREAD then make sure to drop the BASIS (simply remove line 5 from the statement) as it doesn't make sense to use a BASIS when allocating evenly.
8:  SHARE;  
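Putting the pieces together, the complete statement reads:

1:  Execute Allocation Process on Database ASample.Basic with  
2:  POV "CrossJoin([Jan]:[Dec],  
3:  {Descendants([Total Products],Levels([Product],0))})"  
4:  AMOUNT "([Market],[PY Actual],[Marketing])"  
5:  BASIS "([PY Actual],[Sales])"  
6:  TARGET "([Budget],[Marketing])"  
7:  RANGE "{Descendants([Total Markets],Levels([Market],0))}"  
8:  SHARE;  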

Hopefully when you execute, you'll see the following messages telling you the elapsed time and how many cells were created.

 OK/INFO - 1300006 - Essbase generated [5] cells.
 OK/INFO - 1013374 - The elapsed time of the allocation is [0.13] seconds.
 OK/INFO - 1241188 - ASO Allocation Completed on Database ['ASample'.'Basic'].

Just a Start

There are many other options available for use in more complex allocations. This example will get you started. I'll work on some posts involving more complicated allocations in the future.

Tuesday, September 8, 2015

ASO Restructure Performance Part 2

As I showed in Part 1, you don't have a lot of control over speeding up ASO restructures. As with BSO, the more data you have in the cube, the longer the restructure will take. One tip for faster ASO restructures is to clear aggregate data. Once the restructure is complete you'll need to aggregate the data again, but you can do this while the cube is live if you need to. Keeping this in mind, I thought I'd compare the performance of a few restructure options.

1.  With Data and Aggregations.
2.  With Input Data Only.
3.  Export, Restructure, Load.
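For reference, the clear-and-reaggregate approach in option 2 looks roughly like this in MaxL (a sketch; the restructure itself happens when the outline is saved with the new member):

/* before the restructure: drop the aggregate views but keep the input data */
alter database ASOsamp.Sample clear aggregates;

/* after the restructure: rebuild the views, which can be done while the cube is live */
execute aggregate process on database ASOsamp.Sample stopping when total_size exceeds 1.5;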


For my test model I'm using ASOsamp.Sample with 550MB of input data, and when I aggregate I'm using the command below. This will give me aggregate views approximately 50% the size of the input data. To cause the restructure, I will add a member to the Measures dimension.

Execute aggregate process on database ASOsamp.Sample stopping when total_size exceeds 1.5;

Now the results (all times in seconds):

1. With input data and aggregations.

Restructure Time: 51.85s
Total Elapsed Time: 51.85s

2. With input data only.

Clear Aggregate Data Time: ~1s
Restructure Time: 26.12s
Create Aggregate Views Time: 48.71s
Total Elapsed Time: 75.83s

3. Export/Restructure/Load

Export Data Time: 43s
Clear Data Time: ~1s
Restructure Time: 1.35s
Load Data Time: 13.41s + 14.88s
Create Aggregate Views Time: 52.29s
Total Elapsed Time: 125.93s

I do find it interesting that having aggregate views that are 50% the size of the input data causes a nearly 100% increase in restructure time (26.12s vs. 51.85s). I think I need to do some more experimentation on this aspect of restructures.

In this example I wasn't able to gain better performance by restructuring with less data. I did run a few tests turning on parallel loads, calcs and exports. The results were slightly better (method 2 clocked in at about 66s) but the restructure with data and aggregations still won the day. I think there is reason to believe that with some larger input sets and a very powerful machine that method 2 could beat method 1.

Friday, September 4, 2015

Classic Design -- The Essbase Excel Add-In

What's better than a laid back Friday just before a long holiday weekend? Maybe a philosophical discussion on the nature of design.

High quality designs tend to last a long time. I'm a fan of documentaries and a couple of design related films come to mind: Helvetica and Welcome to Macintosh. These films talk about two things that have incredible designs -- one a font and the other a machine. Both have lasted quite a while and are very popular. Macintosh sprang from design work done at Xerox PARC. They essentially invented the GUI but couldn't bring it to market at a low enough cost. Ideas are great but execution is the key to success.

Some designs are beautiful. I'm awestruck whenever I enter a Frank Lloyd Wright designed home. Each one I've seen speaks to me as a manifesto on quality design. Attention was paid to every detail from the view you have of the building as you approach all the way down to the lampshades and electrical sockets.

Some designs are ugly, or at least they seem ugly. Put someone in front of a Unix terminal showing the command line and they might cringe. Put me in front of a Unix terminal and I'm at ease. This operating system is older than I am and is essentially unchanged. Sure, it has been improved over time -- it handles 64-bit hardware now and you can put a GUI in front of it if you really need to -- but at its heart it is the same. A beautiful, long lasting design, held in high esteem by those willing to take the time to learn the power available. The speed of getting things accomplished is amazing when one doesn't need to ever lift their hands from the keyboard to fidget with a mouse.

This brings me to another lasting design -- the Essbase Excel Add-In. This interface is essentially unchanged since I began my Essbase journey. Yes, it has moved on to be called Smart View, but it uses the same principles.

I recall the first Essbase training class I ever taught. I had a lab with about 30 students in it. Essbase was completely new at this company and I'd be teaching people totally foreign to the product. I still remember the sound when people discovered that they could double left-click to zoom-in and double right-click to zoom-out. The room went nearly silent. All I could hear were hundreds of mouse clicks. There was no talking, just clicking. These people had just been given an incredible tool that would change the way they do their jobs. It would improve their productivity.

Here we are twenty years after it was designed and my users do not even want to try Smart View because it means they would have to part with their beloved, beautiful, simply designed Essbase Excel Add-In.



Thursday, September 3, 2015

ASO Test Data or: How I Learned to Stop Worrying and Create Hybrid

I wanted to fill ASOSamp.Sample with some data so I could run a few tests. I wanted to use the completely undocumented "filldb" command within ESSCMDQ in order to do so. Unfortunately, it only works on BSO. So I needed to turn ASOSamp into a BSO database first. There's no way to do that, right? Wrong. A very wise Essbase developer points out in this OTN post that one need only run the Aggregate Store Partition Wizard available within EAS in order to do so.

Let's Recap:



  1. Run the Aggregate Store Partition Wizard to make a BSO copy of your ASO cube. It may give you some warnings so I suggest first making a copy of that ASO cube and using the wizard on the copy. If you get warnings, go into the ASO outline and fix the issues. You'll need to remove things like solve order and other ASO specific items.
  2. If you only need to turn the BSO cube into a Hybrid cube, you can use Celvin Kattookaran's utility available here that changes all upper level sparse members to dynamic calc members. Stop here if you don't need a test bed of data.
    1. Don't forget to set ASODYNAMICAGGINBSO in your essbase.cfg file.
  3. Download and unzip ESSCMDQ. It's probably easiest to put this file in the same place where ESSCMD or essmsh already run so you don't need to fiddle with paths, libraries and such.
  4. Run ESSCMDQ. Login as you would ESSCMD (no semi-colons!):
    1. login hostname user password
  5. Select the database you need to fill with data:
    1. select appname dbname
  6. Run the filldb command. See the link for more information on this command. I would suggest starting out with a small size. I had a very small block size in my cube and it went way past the 1 GB mark even though that's what I set it to. With a normal size block it was close (within 10%) to 1 GB.
    1. filldb [size density distance]
  7. Export the data from the BSO cube.
  8. Import the data into the ASO cube (a MaxL sketch of steps 7 and 8 follows this list).
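Here's roughly what steps 7 and 8 might look like in MaxL. The BSO copy name (ASOsampB), file names, and error-file name are placeholders, and depending on how the export lines up with the ASO outline you may need a load rule on the import:

/* step 7 -- export level-0 data from the BSO copy */
export database ASOsampB.Sample level0 data to data_file 'filldb_export.txt';

/* step 8 -- load that export into the original ASO cube */
import database ASOsamp.Sample data from data_file 'filldb_export.txt' on error write to 'filldb_load.err';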
Now I can easily create a test bed of data for ASOSamp at whatever density I require. I'll be using this in future posts.

Wednesday, September 2, 2015

Which Is Faster -- a Substitution Variable or a Data Value?

The other day on Network 54 someone asked how they could get substitution variables updated in a member formula without unloading and reloading the database. Well, it's not possible. Updates to substitution variables are immediate for many things but member formulas in an outline are not one of them. DanP chimed in with another option that I had completely forgotten about: load the substitution variable value into a database intersection. Brilliant!

Which got me to thinking, which performs better in an ASO member formula -- a sub var or a data value?

I've Been Here Before

I've worked for several financial institutions over the years and they often have rigid SOX requirements. One common need is for the user to have easy access to some time stamp telling them how "fresh" the data is. This can be accomplished in a number of ways -- a member alias set to a time stamp, a sub var and a member formula or a data value loaded directly into the database.

The first one doesn't require any special tricks because a member alias is in plain text. The second two require setting the sub var or data value to a numeric time stamp -- for example, the Epoch (number of seconds elapsed since 1/1/1970) converted to an Excel serial date as shown in the Displaying Timestamps post. If you can get that number into Excel, you just need to apply a date format. You can even use the Smart View HsGetValue() function to grab the member value and format it as you desire in the sheet header or footer.

The catch is that this gets trickier to do if the data is updated during daily loads. You've got to stop and start the database (or run a restructure) in order to get that sub var refreshed. That's no good if you've got a heavily used system. Users don't like to get kicked out while they're reporting. So you only need to update the data value and have the member formula reference the intersection you're loading to. I would go into more detail on this solution now but this post really isn't about showing users a time stamp (let me know if there's interest and I'll do a full write-up on the subject). Anyway...
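To make the comparison concrete, the two ASO (MDX) member formula variants I'm timing look roughly like this -- the member and variable names are illustrative:

/* variant 1 -- formula on the Timestamp member referencing the substitution variable */
&_Timestamp

/* variant 2 -- formula referencing a value loaded to a fixed intersection */
([Timestamp_Load],[No_Product],[No_Store],[Jan],[Curr_Year])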

So Which is Faster?


If I'm designing purely for performance, which do I choose? Either. I tried performing very large queries that required the formula to be queried many times and they were nearly equivalent. In MDX on the server they were exactly equivalent over 10 runs. In large pulls through Smart View they were both in the same range -- between 2.5 and 3.1 seconds over the course of 10 queries each.

Tuesday, September 1, 2015

ASO Restructure Performance

According to the Essbase Tech Ref, the RESTRUCTURETHREADS setting does not apply to ASO. But if I watch the CPU usage of the ESSSVR process on Linux during ASO restructures it often goes above 100% -- I've seen it over 1000% which means that 10 cores should be working to capacity. So I'm going to test a simple question today. Does setting RESTRUCTURETHREADS above 1 improve ASO restructure performance on machines with more than one core?

This is a very straightforward test. I've got an ASO cube with about 1GB of input level data.

The Test


  1. Make a copy of the database.
  2. Validate that there is no RESTRUCTURETHREADS setting in the essbase.cfg file.
  3. Add a member to the accounts dimension and watch the CPU usage during the restructure.
  4. Record the elapsed time of the restructure.
  5. Validate that RESTRUCTURETHREADS is set to 4 in the essbase.cfg file.
  6. Restart the Essbase server.
  7. Add a member to the accounts dimension and watch the CPU usage during the restructure.
  8. Record the elapsed time of the restructure.

The Results

Elapsed time without RESTRUCTURETHREADS setting: 70.97s
Elapsed time with RESTRUCTURETHREADS 4 setting: 68.59s

That's close enough to call it even. I watched the CPU usage during the process and it was similar each time. It spiked to 200% for a good portion of the restructure. So what's going on? I'm not exactly sure but it looks like ASO restructures are multi-threaded and there's no setting to control how many threads to give the process. I can't find any documentation about it. If you do, let me know.