In the last months I and Alberto Ferrari worked on many consulting engagements about designing Tabular models or optimizing existing ones. One of the strength of the Tabular model compared to Multidimensional is the performance of distinct count measures. Now, being faster doesn’t necessarily mean getting an immediate answer, and several times we had to work on performance improvements for this type of measure. The matter is very complex and we are still not ready to publish a set of guidelines about how to optimize distinct count measures in Tabular.
However, during our tests we made interesting discoveries about different query plans you can obtain for distinct count calculations, which can be made using different techniques in DAX. First, I want to warn you that there are no silver bullets. Each technique might be faster in certain conditions but slower in other, so a big mistake is thinking that you found a better solution than a DISTINCTCOUNT function call in DAX and you apply the alternative way to any measure in any model. As I said, don’t do that.
Now, if you want to know more and want to discover how to read DAX query plans and how to measure the best option for your specific model, then this new paper written by Alberto Ferrari is for you:
We already made much more discoveries, but we need to investigate more in order to provide common guidelines. It would be very wrong publishing some best practices that only work in certain conditions, and worsen performance in many other. But if you have queries running in too many seconds (or minutes), please contact me: I’d like to gather more use cases on this topic!
Counts the number of distinct values in a column.
DISTINCTCOUNT ( <ColumnName> )