Optimize memory in #powerpivot and #ssas #tabular

Microsoft published an interesting article about how to optimize memory consumption of a PowerPivot Data Model in Excel 2013. All these suggestions are also valid for SSAS Tabular. I also wrote an article Checklist for Memory Optimizations in PowerPivot and Tabular Models with a summary of the best practices.

The short list of things to do is very valuable:

  • Removing columns non necessary for analysis
    • Identity column (PK) of a fact table
    • Timestamps, guid and other info useful for auditing and replication, but with no data for analysis
  • If a column has too many distinct value and cannot be removed (i.e. transaction ID in a fact table for drillthrough), consider splitting the column into multiple distinct parts.
    • Each one of the parts will have a small number of unique values, and the combined total will be smaller than the original unified column.
    • Always separate date and time in two columns, instead of the original datetime.
    • In many cases, you also need the distinct parts to use as slicers in your reports. When appropriate, you can create hierarchies from parts like Hours, Minutes, and Seconds.
    • Keep only the granularity you really need.
  • Normalize columns keeping only those with the lower number of distinct values
    • For example, if you have quantity, price and total line amount, import quantity and price and calculate total line amount as SUMX( Sales, Sales[quantity] * Sales[price] ) instead of SUM( Sales[line amount] ) importing line amount.
  • Reduce precision of number to reduce distinct values (i.e. round to integer if decimal values are not relevant).

The reason is that VertiPaq compress data at column level, creating a dictionary for each column and storing for each row only the number of bits required to store the index to the dictionary. More details in the article Optimizing High Cardinality Columns in VertiPaq I wrote a few months ago and on the SSAS 2012 Tabular book.

A useful macro to analyze memory consumption and quickly identify the most expensive tables and columns in a PowerPivot workbook is available on Kasper De Jonge blog What is eating up my memory the PowerPivot / Excel edition. There is also a version for a Tabular database in his What is using all that memory on my Analysis server instance post.