Surrogate key issues with Analysis Services

Usually I don’t post something just to raise a question – but every rule has its exceptions.

As you know, using surrogate keys is a best practice for a lot of reasons. Everything works fine with SSAS when you use surrogate keys with a plain star schema. If you have an incremental process of the dimensions in the Data Warehouse, you are also granted that surrogate keys don’t change their internal meaning during time. But, are you sure you want to use surrogate keys in end user reporting tool?

The problem arises when you have a daily reprocess of the whole OLAP database and of the whole relational data mart (or data warehouse, if you prefer). A complete reprocess is not so strange if you have less than 10 GB of data. I’ve seen the same technique applied to data mart with tens of millions of rows in the fact table and 20/30 GB of data (the process can be 3/4 hours each night).

Each dimension that makes the granularity attribute visible to end users, also exposes the surrogate key. Even if you change the Name property of the attribute, its Key still maintains the surrogate key. When the end user creates a pivot table with Excel 2007 or a report with ProClarity, he never see the surrogate key, but he save a document that contains that key and that will use this key to query the cube next time.

The user might open a query saved some days/months ago and… He doesn’t see the same members, just because the same surrogate keys got a different meaning. If surrogate key are generated each night, this happens daily. Even with an incremental data mart you might have the need to reprocess (on the relational side) one or more dimensions in the future.

Now, the issue is both practical and philosophical. From the practical point of view, the user rants he lost his reports. From the philosophicalside, when you save a surrogate key in a report, you are giving a semantic value to something that probably shouldn’t have one.

With SSAS and pivot tables this is a practical issue that may be approached in several ways. I’m studying what is the better balance between usability, maintainability and performance. But I’d like to get some comments about this issue that probably is very common. I did some search, but I haven’t found a good discussion focused on the issues related to Analysis Services client tools.

Give me your feedback!