Filter Arguments in CALCULATE

A filter argument in CALCULATE is always an iterator. Finding the right granularity for it is important to control the result and the performance. This article describes the options available to create complex filters in DAX.

NOTE: This article is about table filter arguments, and does not consider directive arguments (such as ALL, USERELATIONSHIP, CROSSFILTER, …) that alter the filter context without applying a list of values as a new filter.

CALCULATE Filters Fundamentals

When you write a CALCULATE statement, all the filter arguments are table expressions, such as a list of values for one or more columns, or for an entire table. For example, when you write:

CALCULATE ( 
    <expression>, 
    table[column] = <value> 
)

In reality the filter expression you wrote is transformed in:

CALCULATE ( 
    <expression>, 
    FILTER ( 
        ALL ( table[column] ),
        table[column] = <value> 
    )
)

This behavior is identical for all the filter arguments of CALCULATE and CALCULATETABLE. From hereinafter, we will describe the syntax of the filter arguments in these functions, identified by in the general syntax:

CALCULATE ( , , , …, )

CALCULATETABLE (

, , , …, )

A filter function can be a logical expression or a table expression:

table[column] =

FILTER ( … )

ALL ( … )

… (any other table expression is allowed in a filter argument).

Complex Filter Arguments

The DAX syntax of the automatic FILTER function generated by DAX in place of a logical expression requires that you express a single column in the filter expression. For this reason, you can write:

[Sales Red or Blue] :=
CALCULATE (
    [Sales Amount],
    Product[Color] = "Red" || Product[Color] = "Blue"
)

The syntax above is internally transformed in the following one, which you might write in an explicit way obtaining the same behavior from your DAX measure.

[Sales Red or Blue] :=
CALCULATE (
    [Sales Amount],
    FILTER ( 
        ALL ( Product[Color] ),
        Product[Color] = "Red" || Product[Color] = "Blue"
    )
)

However, you cannot write a single filter argument referencing two different columns

[Sales Red or Contoso - invalid] :=
CALCULATE ( 
    [Sales Amount], 
    'Product'[Color] = "Red" || 'Product'[Brand] = "Contoso" 
)

In this case you have to write an explicit table expression instead of relying on automatic conversion of a logical expression in a table expression made by CALCULATE and CALCULATETABLE when you reference a single column. You have several options available, producing different results and potentially with different performance.

Table Filter

You can write a filter over two columns using a filter over the entire table that contains both columns.

[Sales Red or Contoso - table filter] :=
CALCULATE ( 
    [Sales Amount], 
    FILTER ( 
        'Product', 
        'Product'[Color] = "Red" || 'Product'[Brand] = "Contoso" 
    )
)

Using a table filter, you inherit the filter argument existing for the Product table, so you will not include a product Red or of the Contoso brand if it was not present in the existing filter. What is more important, you will not override the existing filter on such a column. Thus, if you have a slicer filtering the brand Proseware, you will see the sales amount of only the products Red belonging to Proseware brand, ignoring any product of the Contoso brand.

ALL Columns Filter

You can write a filter over two columns by creating a special table having only the columns you need. By using the ALL function, you get a table having all the unique combinations of values existing in the underlying table for the referenced columns.

[Sales Red or Contoso - ALL columns filter] :=
CALCULATE ( 
    [Sales Amount], 
    FILTER ( 
        ALL ( 'Product'[Color], 'Product'[Brand] ),
        'Product'[Color] = "Red" || 'Product'[Brand] = "Contoso" 
    )
)

In this case, the cardinality of the filter can be lower than the Cartesian product of the values you have in the referenced columns. In order to obtain such a list, the engine has to execute a table scan. This could be expensive for low cardinality columns in a large table.

The result of this filter will override any existing filter over the specified columns. For example, if you have a slicer filtering the brand Proseware, you will see the sales amount of the products Red belonging to Proseware brand, summed to the sales of the entire Contoso brand, regardless of the color.

CROSSJOIN Columns Filter

An alternative approach to the ALL filter described in the previous section is using a CROSSJOIN over all the values of the two columns. Using CROSSJOIN, you obtain all the possible combination of the values you have in the columns referenced, regardless of the fact that the combination exists in the underlying table.

[Sales Red or Contoso - CROSSJOIN columns filter] :=
CALCULATE ( 
    [Sales Amount], 
    FILTER ( 
        CROSSJOIN ( 
            ALL ( 'Product'[Color] ), 
            ALL ( 'Product'[Brand] )
        ),
        'Product'[Color] = "Red" || 'Product'[Brand] = "Contoso" 
    )
)

In this case, the cardinality of the filter is identical to the Cartesian product of the values you have in the referenced columns. In order to obtain such a list, the engine does not perform a table scan, but only uses the list of values available in the two columns. This could be expensive for high cardinality columns that have a high correlation, so that the number of existing combinations in the table is much lower than all the possible combinations.

The result of this filter is identical to the ALL columns filter, you might just observe different performance in the two approaches. The best one depends on the cardinality of the table and of the columns involved in the filter. However, you can use CROSSJOIN to combine columns of different tables, which is not possible using the ALL syntax.

SUMMARIZE Columns Filter

If you do not want the filter replacement behavior you have using ALL and CROSSJOIN, but you want to keep the existing filter as you have using the table filter, you can use KEEPFILTERS wrapping the ALL/CROSSJOIN filter, or you can use SUMMARIZE. This is the syntax using KEEPFILTERS:

[Sales Red or Contoso - ALL KEEPFILTERS] :=
CALCULATE ( 
    [Sales Amount], 
    KEEPFILTERS (
        FILTER ( 
            ALL ( 'Product'[Color], 'Product'[Brand] ),
            'Product'[Color] = "Red" || 'Product'[Brand] = "Contoso" 
        )
    )
)
[Sales Red or Contoso - CROSSJOIN KEEPFILTERS] :=
CALCULATE ( 
    [Sales Amount], 
    KEEPFILTERS ( 
        FILTER ( 
            CROSSJOIN ( 
                ALL ( 'Product'[Color] ), 
                ALL ( 'Product'[Brand] )
            ),
            'Product'[Color] = "Red" || 'Product'[Brand] = "Contoso" 
        )
    )
)

The SUMMARIZE function generates a list of the existing combinations between two or more columns, and can be used with columns belonging to different tables if they are connected in a many-to-one relationships chain.

[Sales Red or Contoso - SUMMARIZE filter] :=
CALCULATE ( 
    [Sales Amount], 
    FILTER ( 
        SUMMARIZE ( 
            'Product',
            'Product'[Color], 
            'Product'[Brand]
        ),
        'Product'[Color] = "Red" || 'Product'[Brand] = "Contoso" 
    )
)

In this case, the cardinality of the filter is reduced compared to ALL/CROSSJOIN, but you pay the cost of a table scan to obtain the existing combinations of the columns specified in SUMMARIZE. This could be expensive for low cardinality columns in a large table. It could be potentially faster than the table scan for a complex filter condition, but in terms of performance you have to consider whether alternative KEEPFILTERS syntax could be better, depending on data distribution.

Conclusions

You have a number of options to specify a complex filter in a CALCULATE statement. The result of a filter argument is always a table with one or more columns, and the “cost” of the filter is the number of rows you have in such a table. The filter table is usually the “easy” way to write a valid complex filter expression, but it could have a large granularity for the FILTER iterator and a higher cost for the filter itself in CALCULATE, considering the related cost of an expanded table in a filter argument.

You can find more details about the internal behavior and the related performance in The Definitive Guide to DAX. For a complete understanding of the differences between a table filter and a column filter in CALCULATE and their implications, we suggest reading the Chapter 10, Advanced evaluation context.

In order to make practice with the different syntaxes, you can download an Excel workbook with the measures described in this article applied to a pivot table with different filters and slicers, comparing the different results. You can appreciate different performance only on larger data models.

Download

Download Demo (ZIP)