The wonderful processing options of Analysis Services 2005

Today I deeply tested processing options of Analysis Services 2005. A complete discussion of it is available on this paper by T. K. Anand, so I will not describe what is already well written but I’d like to share my today experience.

What is more interesting to me is the handling of changing dimensions. Imagine you have a Customer dimension with a changing attribute of type I (new values of attribute overwrite the previous in the same record); this attribute can have a key and a name, let’s say they are CategoryCode and CategoryName. I’ll use the format [CustomerName CategoryCode CategoryName] to make samples of customer records.

First scenario: the CategoryName for a particular CategoryCode changes (for example, [Mark PRG Programmer] becomes [Mark PRG Developer]), so all Customer records change according to this. In this case, the Process Update of Customer dimension will change the Category in the dimension and you don’t need to update cube to see correct values (only the displayed name changed).

Second scenario: the CategoryCode of a Category changes for a particular Customer and related CategoryName changes accordingly to it (for example, [Mark PRG Programmer] becomes [Mark MUS Musician]). In this case, the Process Update of Customer dimension will change the Category for Mark dimension member and data in cubes are updated even if you don’t process the cube! If you don’t have aggregation in the cube, it is not a surprise, but if you have aggregations on Category attribute than It’s a kind of magic (especially if you are used to Analysis Services 2000) because it works in both cases! What’s happening? Well, if there is an aggregation on an attribute that is changed during a Process Update, that aggregation is temporarily removed from the cube, while it’s still present in the cube schema. At this point cube queries give correct results and aggregation not affected by dimension process update are still in use. If you want to rebuild the lost aggregation you don’t have to reprocess the whole cube/measure group/partition but you only need to do a Process Index on it. Only the necessary missing indexes are generated, already existing indexes are maintained. If you are not sure what kind of Process you need on a cube, you can use the Process Default, that use the less expensive process type to give you the right data at the minimum cost.

I stressed this scenario even with related dimensions and it works very well. I really like this approach, even if I’d like to know in advance what is the necessary process step for a “process default” (but may be I missed the right documentaton page… comments are welcome!).