Wednesday, May 13, 2020

Federated Partitions in Essbase

In this post I'm going to try to explain what a Federated Partition is in the Essbase world, walk through creating one, and then throw in some speculation about them just for fun.

So what exactly is a Federated Partition? I can find no documentation on them (with the exception of the RealtimeCSV_Updates.xlsx file in the Cloud Gallery). They were briefly presented by Oracle at Kscope19. My only understanding comes from trying to set them up and speaking directly with Oracle product management.

Very simply put, they are transparent partitions to either relational databases or flat files. The data is not stored in Essbase; it remains in the source. "Realtime Partitions" is how they were presented at Kscope19.

This architecture has a number of ramifications, I'm sure. The word "fast" does not come to mind, but the jury is still out.

Now I'll walk through creating one in Essbase 19c. I tried getting this to work in OAC back when OAC meant Essbase Cloud but ran into many issues and was not successful. This is going to be a very simple example using a flat file source. When the data in that flat file changes, the data in Essbase will change -- like magic. Yeah, mind blown, I know, let's get started...


Create the Cube and Partition 

Essbase 19c comes with a gallery of applications well beyond Sample Basic. The one designed to demonstrate Federated Partitions is called RealtimeCSV.


Import the cube from the Gallery.

Select the RealtimeCSV_Updates.xlsx file which contains everything Essbase needs to build the cube.

Now I can see the cube in my list of applications.

Under Sources, I selected Create Connection and then File.

I named my connection RealTimeCSV_Conn and selected the Realtime_DS.csv file from the shared folder on the cloud. Since you cannot update the files in the Gallery, I made a copy of the Realtime_DS.csv file which will allow me to update it.

Next I selected Datasources and Create Datasource. Then I chose the Connection I just created.

I named the Datasource RealtimeCSV_Conn and clicked Next.

I changed the Units and Price columns to the Double type, then clicked Next.
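For context, the source is just a plain CSV with a header row: dimension member columns followed by the numeric columns. The rows below are a made-up sketch (the member names are hypothetical, not the actual contents of Realtime_DS.csv), but the shape is the same:

 Market,Product,Month,Units,Price
 East,Cola,Jan,120,1.95
 West,Root Beer,Jan,85,2.10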

Finally I was prompted with a Preview of my data and I clicked Create.

Next I clicked on the Areas tab. From there I clicked the Add Area button and entered my Target Area using Essbase functions and member names.
I then clicked the Validate button followed by the Save and Close button.
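An area definition is just a comma-separated list of member set expressions, one per dimension, built from the usual Essbase member set functions. Purely as an illustration (the dimension names here are hypothetical, not the actual RealtimeCSV outline), a level-0 area for a small cube might look something like:

 @LEVMBRS("Measures",0), @LEVMBRS("Time",0), @LEVMBRS("Market",0)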



I then pulled in some data to test the cube. The data comes into the cube in real time, straight from the source file.

Finally, I went back and updated the csv file with some different numbers. I refreshed my data and the changes were reflected.


Just to prove what's going on here, I'll show you the database storage statistics. You can see that there are no blocks and no .pag or .ind files.
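If you'd rather check from MaxL than the web UI, the standard statistics query should show the same thing. The application and database names below are a guess; substitute your own:

 query database RealtimeCSV.RealtimeCSV get dbstats data_block;   /* app.db names assumed */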


Speculating Wildly

So what does this all mean? Why would Oracle be spending time on a feature like this? All of this is speculation influenced by comments I've heard from product managers over the past year...

Oracle is a database company. They make a really powerful database that runs a significant portion of the world. Wouldn't it be really cool if they could scrap page and index files altogether and just have their relational database store all of the level 0 Essbase data? Well that's what this is. Right now. Slap hybrid calculations on top and you have real-time updates to your cube. No loads, no calcs.

It always comes down to performance. This cube has three dimensions. It's got training wheels and makes Sample Basic look complicated. I'll remain skeptical, outside of small use cases, until I see large cubes using Federated Partitions with sizable data sources. In the meantime it's another tool in the toolbox. Drop me a line if you start using it in a production environment.


Disclaimer

I urge extreme caution in trying to use Federated Partitions. Keep in mind that Oracle has probably not documented them for a reason. For now I would recommend using traditional techniques for loading data into a cube.

Thursday, January 9, 2020

AUTOMERGE

I started writing this post four and a half years ago. I'm finally getting around to finishing it because I had to use it in real life this week.

I ran into what I thought was strange behavior in ASO the other day. I'd submit data to a cell, then submit a zero. I'd see a zero. Then I'd submit data to a cell, then submit a zero. I'd see a #MISSING. I'd repeat this over and over again. The first time I'd get a zero, the second time I'd get a #MISSING. At least that was consistent!

I had a hunch it had to do with ASO data slices and the merge process. In 11.1.2.4 the default ASO behavior is for slices to be automatically merged after some number of slices have been created. The two essbase.cfg settings that control this are AUTOMERGE and AUTOMERGEMAXSLICENUMBER.

So the first send of a number created slice 1, the send of the zero created slice 2, the send of the next number created slice 3, the send of the next zero created slice 4. Four happens to be the default value for the AUTOMERGEMAXSLICENUMBER setting. ALWAYS happens to be the default value for the AUTOMERGE setting. So when that fourth slice was created, it triggered Essbase to automatically merge the slices back together.
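For reference, the defaults described above correspond to these essbase.cfg lines. Setting AUTOMERGE to NEVER (and merging yourself with MaxL) is one way to take back control of when the merge, and the zero removal, happens:

 AUTOMERGE ALWAYS
 AUTOMERGEMAXSLICENUMBER 4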

The documentation states that the merge happens after the slice number is exceeded, so it should happen on the fifth load of data. I observed that it was happening on the exact number, so on the fourth load of data.

The one thing you can't control here is that when the merge happens, it will turn your zeros into #MISSINGs. In most cases this is fine, but in certain circumstances where you need an actual zero, it could be a problem. Merging the data slices using MaxL allows you to control whether or not the zeros are removed.

 alter database ASOsamp.Sample merge all data remove_zero_cells;

I'm curious to know if anyone has run into performance problems caused by this setting. Depending upon how many users are sending data to the cube, I would imagine running so many merges could be an issue. Then again, it might help, since you reduce query cost after merging slices.

Friday, September 13, 2019

MDX Insert Intermission

I have not created any posts about MDX Insert lately as I have hit a snag with my research on the CUSTOMCALCANDALLOCTHRUINSERT Essbase config file setting. This setting, in theory, will take an existing ASO procedural calc and automatically run it as an MDX Insert. This would be great for customers who have a large number of ASO procedural calcs. I get the impression that once Essbase 19c comes out, this will work. So I'll wait until I have better information before doing a full write-up on the topic.
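If you want to experiment with it anyway, my assumption, based on how other application-scoped config settings work, is that the essbase.cfg entry looks roughly like the line below. I have not been able to confirm the behavior myself yet, so treat it purely as a sketch:

 ; hypothetical application name, syntax unverified
 CUSTOMCALCANDALLOCTHRUINSERT Sample TRUE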

In the meantime I can report some really great news about MDX Insert and my hope for it becoming the next "big thing" for Essbase.

I was working with a colleague on an ASO procedural calc. He was using 11.1.2.3.500 and ran into the "query exceeds 2^64" error which kills the calc. This was due to the large number of combinations in the POV.

The same calc in 11.1.2.4.031 doesn't give an error but it also never finishes. I think a lot of people were turned off to ASO procedural calcs due to this issue.

The great news is that the same calc that died previously, when translated to MDX Insert, took a grand total of two seconds to complete in OAC. Two frikkin' seconds!

I'm seeking other outlines to try this on. If you've got one I can use for testing, drop me a line.

Also, I'll be at Oracle Open World 19 in San Francisco next week. I hope to see some of you there.

Tuesday, August 6, 2019

MDX Insert Part Three



In Part Two of this series I showed how to perform a simple MDX Insert, copying a single cell of data in an ASO cube. In this post I will show how to create a slightly more complicated calculation.

Keep in mind that you need to be using Essbase in OAC for this to work. It will not work in Essbase 11.1.2.4 or prior.

WITH

I'm going to use the MDX 'With' statement in order to create a calculated member. The With statement comes first in an MDX script. You can use it to define an Essbase member on the fly. You can then use that member in other calculations or in the output of an MDX query. To keep things simple and easy to reproduce for those of you following along at home, I will recreate the Variance member in ASOSamp.Basic.

First, I added a member called Stored Variance to the outline.



Next I created my MDX statement. The 'With' statement comes first. In this case I called my member on the fly _Variance. I put an underscore in front of members on the fly so they won't conflict with any Essbase members and so that I can quickly see what's being created dynamically in MDX. Of course, if you happen to use underscores to begin member names in your Essbase outline (heaven help you), find a different method.

The syntax here is hopefully straightforward. You define the member name AS some formula which goes in single quotes. Here I am subtracting [Prev Year] from [Curr Year].
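Here is that portion of the script (the full script is in the appendix below):

 WITH
 MEMBER Years.[_Variance] AS
   '[Curr Year]-[Prev Year]'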





Next comes the INSERT statement. Remember from part two that the syntax is "(source tuple)" TO "(target tuple)".
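In this script that comes out as (the FROM clause is completed by the SELECT that follows):

 INSERT
   "(Years.[_Variance])" TO "(Years.[Stored Variance])"
 INTO ASOSamp.Basic FROM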



Then the SELECT statement is added, surrounded by parentheses. In this case I'm putting the _Variance member on the columns and then the Crossjoin of all the level zero members from the remaining dimensions on the rows. Notice that I don't need to bother leaving out shared or dynamic members.
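And the SELECT itself, exactly as it appears in the full script in the appendix:

 (
   SELECT
    {Years.[_Variance]} ON COLUMNS,
    {CROSSJOIN([Measures].LEVELS(0).MEMBERS,
     CROSSJOIN([Time].LEVELS(0).MEMBERS,
     CROSSJOIN([Transaction Type].LEVELS(0).MEMBERS,
     CROSSJOIN([Payment Type].LEVELS(0).MEMBERS,
     CROSSJOIN([Promotions].LEVELS(0).MEMBERS,
     CROSSJOIN([Age].LEVELS(0).MEMBERS,
     CROSSJOIN([Income Level].LEVELS(0).MEMBERS,
     CROSSJOIN([Products].LEVELS(0).MEMBERS,
     CROSSJOIN([Stores].LEVELS(0).MEMBERS,[Geography].LEVELS(0).MEMBERS)))))))))} ON ROWS
   FROM ASOSamp.Basic
 )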



Once I validate and save the script, I can run the job. To do that I go to the jobs screen in the Jet UI.




Then select New Job -> Run MDX.




Finally, I select the MDX script I want to run.



This calculation took 6.9 seconds to run with the standard ASOSamp.Basic data.


Why?

Why might I want to run a calculation like this? In this case it doesn't make a lot of sense. ASO should be able to handle a simple variance calculation with little effort. But what about a rate times volume calculation? Or a foreign currency calculation? Those get much trickier -- see why here.

The Old Way

You could run this calculation with the Execute Allocation MaxL statement. Here's the statement:
 execute allocation process on database ASOSamp.Basic with  
 pov "CROSSJOIN(Except([Measures].LEVELS(0).MEMBERS,{[Avg Units/Transaction],[% of Total]}),  
     CROSSJOIN(Descendants([MTD],LEVELS([Time],0)),    
     CROSSJOIN([Transaction Type].LEVELS(0).MEMBERS,   
     CROSSJOIN([Payment Type].LEVELS(0).MEMBERS,   
     CROSSJOIN([Promotions].LEVELS(0).MEMBERS,   
     CROSSJOIN([Age].LEVELS(0).MEMBERS,   
     CROSSJOIN([Income Level].LEVELS(0).MEMBERS,   
     CROSSJOIN([Products].LEVELS(0).MEMBERS,   
     CROSSJOIN([Stores].LEVELS(0).MEMBERS,[Geography].LEVELS(0).MEMBERS)))))))))"  
 amount "([Curr Year]-[Prev Year])"  
 target ""  
 range "{[Stored Variance]}"  
 spread;  

Notice that I had to add in an Except() statement to remove members with formulas. I also had to change the way I was pulling the Time dimension members because they too had some formulas. When I ran the script I got these warnings:

 [DBNAME: Basic] The argument [POV] ignores the duplicate member [Flat Panel]  
 [DBNAME: Basic] The argument [POV] ignores the duplicate member [HDTV]  
 [DBNAME: Basic] The argument [POV] ignores the duplicate member [Digital Recorders]  
 [DBNAME: Basic] The argument [POV] ignores the duplicate member [Notebooks]  

So I probably should have removed those shared members from the POV as well. What a pain...

This script took 7.9 seconds to run. I understand that 1 second is not a lot of time but in this case the MDX Insert was about 15% faster than the Execute Allocation. I've always seen performance improvement for MDX Insert when comparing the two methods. Over the course of many longer calculations, the performance improvement can be significant.



NON EMPTY

In case you're wondering, I did try adding a NON EMPTY directive before the rows set and it yielded no performance improvement.
 

Appendix

MDX Script for easy copy and paste:
 WITH  
 MEMBER Years.[_Variance] AS  
   '[Curr Year]-[Prev Year]'  
 INSERT  
   "(Years.[_Variance])" TO "(Years.[Stored Variance])"  
 INTO ASOSamp.Basic FROM   
 (  
   SELECT  
    {Years.[_Variance]} ON COLUMNS,  
    {CROSSJOIN([Measures].LEVELS(0).MEMBERS,  
     CROSSJOIN([Time].LEVELS(0).MEMBERS,    
     CROSSJOIN([Transaction Type].LEVELS(0).MEMBERS,   
     CROSSJOIN([Payment Type].LEVELS(0).MEMBERS,   
     CROSSJOIN([Promotions].LEVELS(0).MEMBERS,   
     CROSSJOIN([Age].LEVELS(0).MEMBERS,   
     CROSSJOIN([Income Level].LEVELS(0).MEMBERS,   
     CROSSJOIN([Products].LEVELS(0).MEMBERS,   
     CROSSJOIN([Stores].LEVELS(0).MEMBERS,[Geography].LEVELS(0).MEMBERS)))))))))} ON ROWS  
   FROM ASOSamp.Basic  
 )