SUMMARIZECOLUMNS is the most widely used function in Power BI queries, which can also be used in other DAX expressions, like DAX measures.
An important and unique feature of SUMMARIZECOLUMNS is that it determines automatically how to scan the model to produce its result. Indeed, when using SUMMARIZE, GROUPBY, ADDCOLUMNS, or any of the more basic querying functions, developers must declare the source table to perform the grouping, as well as the group-by columns and the measures to add to the result. On the other hand, SUMMARIZECOLUMNS requires only the group-by columns; there is no need to provide the source table, which is the primary ingredient of any query. SUMMARIZECOLUMNS determines the structure of the result independently, utilizing a sophisticated algorithm that requires some understanding, as it is based on features and concepts such as auto-exists (also known as clustering), Non-empty, and value filter.
A skilled DAX developer should be familiar with the details of the three features that enable them to reshape their measures, models, or queries to obtain accurate results.
These features are heavily interconnected. Understanding the behavior of the value filter, or the subtleties of NONVISUAL, without a deep understanding of auto-exists results in a frustrating experience. The goal of this paper is to explain how SUMMARIZECOLUMNS works and to provide best practices for using it in queries and DAX expressions.
This paper is not an introduction to SUMMARIZECOLUMNS. We assume our readers are familiar with the syntax of SUMMARIZECOLUMNS, evaluation contexts, context transition, and all the fundamental concepts of DAX.
The content has the following modules:
- Goals and prerequisites
- Why is it so crucial to understand SUMMARIZECOLUMNS?
- Introducing SUMMARIZECOLUMNS results
- Introducing group-by tuples
- Understanding non-empty
- Understanding NONVISUAL and how filters are applied
- Understanding CROSSJOIN versus SUMMARIZE
- Understanding clustering and auto-exists
- Understanding filters spanning multiple tables
- Understanding value filter behavior
- Putting things together
- Execution steps of SUMMARIZECOLUMNS
- Examples with subtotals
- Moving filters outside of SUMMARIZECOLUMNS
- Understanding ALLSELECTED in SUMMARIZECOLUMNS
- SUMMARIZECOLUMNS and external filters
- Internal filters overriding outer filters
- Group-by columns keeping outer filters on the same table only
- SUMMARIZECOLUMNS and data lineage
- SUMMARIZECOLUMNS best practices
- Basic grouping, with no measures
- Basic grouping, with measures
- Complex grouping
- Grouping and filtering
- Conclusions
Additional content: Sample file used for the demos in the whitepaper.