I keep on telling my students how important is – with DAX – to spend time learning the theory because it is nearly impossible to understand the subtle difference between a correct measure and a wrong one, if you don’t pay attention to small details. In this article, I want to share with you some considerations about an example that came out during a class about how to optimize the code.
The requirement was to compute the average sales amount for only the sales where quantity is greater than three and then slice it by color. Don’t waste your time: it is a completely useless measure, the goal was to optimize the behavior of the query, not that of finding any insight from it.
While trying different versions of the measure, one student came with this version:
DEFINE MEASURE Sales[Average Sales Amount] = AVERAGEX ( Sales, 'Sales'[Quantity] * 'Sales'[Net Price] ) EVALUATE ADDCOLUMNS ( VALUES ( Product[Color] ), "Sales", CALCULATE( [Average Sales Amount], FILTER ( Sales, Sales[Quantity] > 3 ) ) )
Now, before reading further, spend some time to look back at the query and try to find it out yourself. It contains a couple of small glitches that – together – create the problem. Believe me, finding it requires some DAX skills and it is totally worth the time. If you just read the article without trying it by yourself, it is likely that you will not appreciate those small details.
Where are the issues in this code? Why does it always return the same value? There are two very common errors, worth pointing out:
- Tables in DAX are always expanded, if you filter a table using CALCULATE, you filter also all of the related ones, following a many-to-one pattern.
- CALCULATE executes context transition, but its filter parameters get evaluated in the original filter context, not in the one modified by CALCULATE.
Table Expansion at work
Let us start from the first problem. The filter parameter of CALCULATE is FILTER (Sales, Sales[Quantity] > 3). There is nothing strange with this code, but you need to notice that FILTER returns the whole Sales table. It looks like an innocent filter on the Sales[Quantity] column, but – indeed – it is the first source of the problem.
In fact, as you might know, in DAX tables are expanded following many-to-one relationships, so that Sales includes, among its expanded columns, all of the columns of Product. As a result, when you place a filter on Sales, it results in applying the filter not only on the Sales table, but also on the Product table (and all the tables of the model, even if this is not relevant to this example).
Thus, the filter context created by CALCULATE, right before calling [Average Sales Amount] includes a filter on Product too, replacing any previously existing filter on all of its column. What are the products filtered? All of the products that are referenced from Sales.
Alone, this is not still enough to produce wrong results. In fact, in order to understand the whole scenario, we need to elaborate on the second issue, that is the context transition generated by CALCULATE during the iteration of ADDCOLUMNS.
How CALCULATE evaluates filter parameters
CALCULATE does the following:
- Evaluates its filter parameters. In our case, the FILTER over Sales where Quantity is greater than 3.
- Generates a new filter context, overwriting previously existing filters on the same columns it is about to filter.
This is all well-known, but why is it relevant to our scenario? Because ADDCOLUMNS iterates on the Product[Color] column, then it calls CALCULATE to force context transition and evaluate the measure in a filter context containing the product color. Nevertheless, CALCULATE evaluates its filter parameters before generating the new filter context. In other words, the filter parameters of CALCULATE are evaluated in the original filter context, not in the one that CALCULATE is about to produce.
When you write FILTER ( Sales ) you are referring to all of the sales, not only to the red ones. The filter on color (say we are filtering Red) is applied only later.
Mixing the effects together
Now that we elaborated on these two points, let us recap them and mix their effects. So far, we discovered that the result of FILTER:
- is the list of ALL the rows in sales with a quantity greater than three, no matter what their colors are. In fact, Product[Color] is NOT part of the filter context when FILTER is iterating through Sales;
- contains all the columns of the expanded Sales table. Among them, also the columns of Product, noticeably, Product[Color].
When context transition happens, CALCULATE needs to generate the new filter context based on the actual row context (due to context transition) and to its filter parameter. The question is: which color will be selected in this new filter context? Unfortunately, it will not be the color currently iterated by ADDCOLUMNS, it is a completely different one.
In fact, CALCULATE will need to merge two filter contexts:
- The first one, containing the currently iterated color, generated by context transition.
- The second one, containing all columns of the expanded table of Sales, that contains all the product that sold more than three items at once.
There is very clear order of precedence in the merging of the two contexts: the one induced by context transition is applied first and, then, the filter arguments are applied later, possibly overwriting the context generated by context transition.
At the end, this is what happens:
- Context transition requires to filter on a specific color, let say it is Red.
- The filter argument (which has the precedence) requires the color to be “the color of any product that sold more than three”.
Since the latter filter is applied later, it overwrites the selection of red products, and this obviously happens for all the rows during the iteration. As a result, the formula returns the same value for all the rows.
Solving the issue
The correct formulation of the formula is a very simple one but, before looking at it, is worth looking at the solution that the student proposed. In fact, this was the solution:
DEFINE MEASURE Sales[Average Sales Amount] = AVERAGEX ( Sales, 'Sales'[Quantity] * 'Sales'[Net Price] ) EVALUATE ADDCOLUMNS ( VALUES ( Product[Color] ), "Sales", CALCULATE ( CALCULATE ( [Average Sales Amount], FILTER ( Sales, Sales[Quantity] > 3 ) ) ) )
By adding a CALCULATE outside of the inner one, the developer obtained the effect of evaluating the FILTER function inside a filter context containing only red products, because the outer CALCULATE already consolidated the row context of ADDCOLUMNS into a filter context. The solution works, but it is just overly complicated, since the correct formulation of the code is a much easier one:
DEFINE MEASURE Sales[Average Sales Amount] = AVERAGEX ( Sales, 'Sales'[Quantity] * 'Sales'[Net Price] ) EVALUATE ADDCOLUMNS ( VALUES ( Product[Color] ), "Sales", CALCULATE ( [Average Sales Amount], Sales[Quantity] > 3 ) )
This last version of the formula does not suffer from the issues of the previous one because, now, the filter parameter of CALCULATE is using only the Sales[Quantity] column. Thus, it will not spread its effects to the whole Product table, as a filter on Sales did.
There are few golden rules in the DAX world, but one for sure is: never filter a table when you can filter a column. In fact, filtering a table requires you to understand expanded tables and their behavior when you use them as a filter parameter in CALCULATE.
Evaluates an expression in a context modified by filters.
CALCULATE ( <Expression> [, <Filter> [, <Filter> [, … ] ] ] )
Returns a table that has been filtered.
FILTER ( <Table>, <FilterExpression> )
Returns a table with new columns specified by the DAX expressions.
ADDCOLUMNS ( <Table>, <Name>, <Expression> [, <Name>, <Expression> [, … ] ] )
Returns all the rows in a table, or all the values in a column, ignoring any filters that might have been applied.
ALL ( [<TableNameOrColumnName>] [, <ColumnName> [, <ColumnName> [, … ] ] ] )