Monday, August 17, 2015

Moving data from ASO to BSO

Moving data from ASO to BSO seems like taking a step backward to me. But as part of testing Hybrid Essbase I want to compare it with ASO, and to do that I need to create a BSO cube and move data into it from the ASO cube I'm comparing against. This seems really simple -- just export out of the ASO cube and import into the BSO cube. Unfortunately there's more to the story.

I'd love to just set up a replicated partition from ASO to BSO, but that's not possible. The tests:

1. Baseline -- Load data into BSO using a BSO-generated export
First I loaded data into my BSO cube and exported it to a file. Then I cleared the cube and reloaded that file. Theoretically this should produce a completely fragmentation-free cube and should be the fastest load possible. (A minimal MaxL sketch of the export/clear/reload cycle follows the timings below.)
Elapsed Time: 16.19 seconds
Page file size: 25,215,057 bytes
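
For reference, the whole baseline cycle is plain MaxL. A minimal sketch, where the app/database names, credentials and file names (Sample.Basic, admin/password, bso_export.txt) are all hypothetical placeholders:

    login admin password on localhost;

    /* export level0 data from the BSO cube */
    export database Sample.Basic level0 data to data_file 'bso_export.txt';

    /* clear the data, then reload the file we just exported */
    alter database Sample.Basic reset data;
    import database Sample.Basic data from data_file 'bso_export.txt'
        on error write to 'bso_load.err';

    logout;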

2. Load data into BSO using an ASO-generated export (no compression dimension)
Next I exported the data from the ASO cube and imported it into the BSO cube (a MaxL sketch follows the timings below). This is dog slow -- it took almost 15x longer and used about 5x the disk space. This is probably the least efficient way to load data into BSO, since a block gets pulled into memory and written back to disk for just about every value that is loaded.
Elapsed Time: 238.84 seconds
Page file size: 127,402,065 bytes
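
The export/import pair for this test is the same MaxL pattern, just exporting from the ASO database instead. A minimal sketch with hypothetical app/database names, and assuming the two outlines share the same dimension and member names so the export file loads without a rules file:

    /* a level0 export works the same way against an ASO database */
    export database ASOApp.ASODb level0 data to data_file 'aso_export.txt';

    /* load the ASO export straight into the BSO cube -- this is the slow path */
    import database BSOApp.BSODb data from data_file 'aso_export.txt'
        on error write to 'aso_to_bso.err';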

3. Load data into BSO using an ASO-generated export (compression dimension set to Time)
Next I set the compression dimension for the ASO cube to Time. This changes the way the data export is written to disk -- the compression dimension members become the columns. This helped quite a bit: since Time is dense in the BSO cube, all of the Time members for a given block are now loaded and written at the same time.
Elapsed Time: 58.72 seconds
Page file size: 66,355,281 bytes

4. Load data into BSO using an ASO MDX export
Finally, I exported the ASO data using an MDX script. This lets you put the columns in the right order (largest sparse to smallest sparse, then dense) so that each block is written only once in the target BSO cube. This test is a bit more complex. First of all, the export didn't work in version 11.1.2.2 -- the CrossJoin() was too large:

ERROR - 1200712 - Internal error: Query is allocating too large memory ( > 4GB) and cannot be executed. Query allocation exceeds allocation limits.

The good news is that it worked like a champ in 11.1.2.4.002. This is interesting because the MDX limits listed in the 11.1.2.2, 11.1.2.3 and 11.1.2.4 Tech Refs all appear to be the same. The bad news is that the MDX output is still written in a very messy format, so I wrote a Perl script to reformat the data -- more about that here. Once the file is cleaned up, we're back to essentially the same results as the pure BSO export. (A sketch of the MDX export script follows the timings below.)
Note: the MDX export itself was about 8 seconds slower than the native ASO export, and the Perl file conversion took about 9 seconds.
Elapsed Time: 16.38 seconds (plus 8 seconds for the export and 9 seconds for conversion of the export file)
Page file size: 25,215,057 bytes
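
Here's a minimal sketch of the MDX export, run from the MaxL shell with spooled output. Every app, dimension and member name below is a hypothetical placeholder; the point is simply to put the dense members on COLUMNS and order the ROWS CrossJoin from the largest sparse dimension to the smallest:

    /* after logging in: widen the columns and capture output to a file */
    set column_width 80;
    spool on to 'aso_mdx_export.txt';

    SELECT
      /* dense members of the target BSO cube go across the columns */
      CrossJoin([Year].Levels(0).Members, {[Measures].[Sales]}) ON COLUMNS,
      /* sparse dimensions, largest first, so each BSO block is touched once */
      NON EMPTY
        CrossJoin([Product].Levels(0).Members,
                  [Market].Levels(0).Members) ON ROWS
    FROM [ASOApp].[ASODb]
    WHERE ([Scenario].[Actual]);

    spool off;

The spooled file still comes out messy (headers, padding, row-member formatting), which is why the Perl cleanup step mentioned above is needed before the file can be loaded into the BSO cube.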

Conclusion:
Don't blindly export an ASO cube and import the file into a BSO cube. At the very least, set the ASO compression dimension to match the densest dimension in the target BSO cube. If you can, export using MDX with the columns ordered to match the BSO cube's sparse/dense layout.
