Articles   \  

Separate date and time in PowerPivot (and BISM tabular)



In PowerPivot you can import data from several sources and it is often the case that you import a DateTime column from a database. From an analytical point of view, you usually make analysis over dates (day/month/year/quarter/week) and over time (hour/minute) but to do that it is often useful to have two independent set of attributes: one for the date (the Calendar table) and one for time (for those businesses where this element is relevant). When your table is not too big, you might import the table with the original DateTime column and only then you can create derived calculated columns in DAX. However, when your table is big (i.e. millions of rows, or more), then it is a good idea to make the calculation outside PowerPivot.

For example: if you have 15 millions rows for one month, assuming that the datetime has a granularity of one second and you have a linear distribution over 24 hours, you will have a dictionary of about 2.5 millions of unique values for just this column. This makes that column pretty expensive. Now, when you split the column in SQL, you might save a lot of space. If you have SQL Server 2008 R2, you can use the following syntax in a SQL query (or a SQL view) in order to make the split in a smart way.

This if for the date:

CAST( DateColumnName AS DATE)

PowerPivot still reads the column as a DateTime, but hour/minute/seconds disappears and the number of unique values is reduced to the number of distinct days in your data. Making it easier to join with a Calendar table, of course!

This is for the time rounding to seconds:

CAST( CAST( DateColumnName AS SMALLDATETIME ) AS TIME(0) )

Again, PowerPivot read this as a DateTime and put a date anyway (the current day when you import data by using the formula above) – however, you probably will derive hour and minute in calculated columns (but doing that in SQL is a good idea, too).

In case you want to round the time to the minute instead of seconds, you can leverage on SMALLDATETIME data type by using this cast:

CAST( CAST( DateColumnName AS SMALLDATETIME ) AS TIME )

Using this technique I saved a 15 million rows table into an Excel workbook of 66 Mb, using no more than 400Mb or RAM once loaded in memory. Remember, columns with unique values for each row are the most expensive ones in PowerPivot!


Article originally appeared on source










 
Want to read more?