Rolling 12 Months Average in DAX

Rolling Average 12 Months 01

Computing the rolling 12-month average in DAX looks like a simple task, but it hides some complexity. This article explains how to write the best formula avoiding common pitfalls using time intelligence functions.

We start with the usual AdventureWorks data model, with Products, Sales and Calendar table. The Calendar has been marked as a calendar table (it is necessary to work with any time intelligence function) and we built a simple hierarchy year-month-date. With this set up, it is very easy to create a first PivotTable showing sales over time:

Rolling Average 12 Months 01

When doing trend analysis, if sales are subject to seasonality or, more generally, if you want to remove the effect of peaks and drops in sales, a common technique is that of computing the value over a given period, usually 12 months, and average it. The rolling average over 12 months provides a smooth indicator of the trend and it is very useful in charts.

Given a date, we can compute the 12-month rolling average with this formula, which still has some problems that we will solve later:

Sales12M := CALCULATE (
    [Sales],
    DATESBETWEEN (
        Calendar[FullDate],
        SAMEPERIODLASTYEAR ( NEXTDAY ( LASTDATE ( Calendar[FullDate] ) ) ),
        LASTDATE ( Calendar[FullDate] )
    )
)

The behavior of the formula is simple: it computes the value of [Sales] after creating a filter on the calendar that shows exactly one full year of data. The core of the formula is the DATESBETWEEN, which returns an inclusive set of dates between the two boundaries. The lower one is:

SAMEPERIODLASTYEAR ( NEXTDAY ( LASTDATE ( Calendar[FullDate] ) ) )

Reading it from the innermost: if we are showing data for a month, say July 2007, we take the last visible date using LASTDATE, which returns the last day in July 2007. Then we use NEXTDAY to take the 1st of August 2007 and we finally use SAMEPERIODLASTYEAR to shift it back one year, yielding 1st of August 2006. The upper boundary is simply LASTDATE, i.e. end of July 2007.

If we use this formula in a PivotTable, the result looks fine, but we have a problem for the last date:

Rolling Average 12 Months 02

In fact, as you can see in the figure, the value is correctly computed until 2008. Then, there is no value in 2009 (which is correct, we don’t have sales in 2009) but there is a surprising value on December 2010, where our formula shows the grand total instead of a blank value, as we would expect.

In fact, on December, LASTDATE returns the last day of the year and NEXTDAY should return the 1st of January 2011. But NEXTDAY is a time intelligence function and it is expected to return sets of existing dates. This fact is not very evident and it is worth a few words more.

Time intelligence functions do not perform math on dates. If you want to take the day after a given date, you can simply add 1 to any date column, and the result will be the next day. Instead, time intelligence functions shift sets of date back and forth over time. Thus, NEXTDAY takes its input (in our case a single-row table with the 31 of December 2010) and shifts it one day later. The problem is that the result should be 1st of January 2011 but, because the Calendar table does not contain that date, the result is BLANK.

Thus, our expression computes [Sales] with a blank lower boundary, which means the beginning of time, yielding as the result the grand total of sales. To correct the formula it is enough to change the evaluation order of the lower boundary:

Sales12M := CALCULATE (
    [Sales],
    DATESBETWEEN (
        Calendar[FullDate],
        NEXTDAY ( SAMEPERIODLASTYEAR ( LASTDATE ( Calendar[FullDate] ) ) ),
        LASTDATE ( Calendar[FullDate] )
    )
)

As you can see, now NEXTDAY is called after the shift of one-year back. In this way, we take 31 of December 2010, move it to 31 December 2009 and take the next day, which is 1st of January 2010: an existing date in the calendar table.

The result is now the expected one:

Rolling Average 12 Months 03

At this point, we need only to divide that number by 12 to obtain the rolling average. But, as you can easily imagine, we cannot always divide it by 12. In fact, at the beginning of the period there are not 12 months to aggregate, but a lower number. We need to compute the number of months for which there are sales. This can be accomplished using cross filtering of the calendar table with the sales table after we applied the new 12 months context. We define a new measure that computes the number of existing months in the 12 months period:

Months12M := CALCULATE (
    CALCULATE ( COUNTROWS ( VALUES ( Calendar[MonthName] ) ), Sales ),
    DATESBETWEEN (
        Calendar[FullDate],
        NEXTDAY ( SAMEPERIODLASTYEAR (LASTDATE ( Calendar[FullDate] ) ) ),
        LASTDATE ( Calendar[FullDate] )
    )
)

You can see in the next figure that the Months12M measure computes a correct value:

Rolling Average 12 Months 04

It is worth to note that the formula does not work if you choose a period longer than 12 months, because the Calendar[MonthName] has only 12 values. If you need longer periods, you will need to use an YYYYMM column to be able to count more than 12.

The interesting part of this formula that uses cross filtering is the fact that it computes the number of available months even when you filter using other attributes. If, for example, you select the Blue color using a slicer, then sales start in July 2007 (not in 2005, as it happens for many other colors). Using the cross filter on Sales, the formula correctly computes that in July 2007 there is a single month of available sales for Blue:

Rolling Average 12 Months 05

At this point, the rolling average is just a DIVIDE away:

Avg12M := DIVIDE ( [Sales12M], [Months12M] )

When we use it in a Pivot Table, we still have a small issue: in fact, the value is computed also for months for which there are no sales (i.e. future months):

Rolling Average 12 Months 06

This can be solved using an IF statement to prevent the formula from showing values when there are no sales. I have nothing against IF but, for the performance-addicted among you, it is always worth remembering that IF might be a performance killer, because it could force DAX formula engine to kick in. In this specific case, the difference is negligible, but, as a general rule, the best way to remove the value when there are no sales is to rely on pure storage engine formulas like this one:

Avg12M := DIVIDE ( [Sales], [Sales] ) * DIVIDE ( [Sales12M], [Months12M] )

Comparing a chart using the Avg12M with another one that shows Sales you can easily appreciate how the rolling average outlines trends in a much cleaner way:

Rolling Average 12 Months 07

Download


Download Demo (ZIP)