Plenty to read!

Plenty to read!

Track Changes in Power BI: Part 1 - Introduction

Track Changes in Power BI: Part 1 - Introduction


SET UP SOURCE CONTROL / VERSION CONTROL IN POWER BI

…using Tabular Editor to deserialize & scale with model metadata


 

[Click to enlarge] - Track changes in individual objects of your Power BI model, like measures, tables etc



1. Introduction: Source Control made easier with the Tabular Editor Folder Structure

What is Source Control & why is it important?

Life without source control can be… messy

It’s no big secret that collaborative, sustainable development with source control can be challenging in Power BI.
It’s important to keep a running history of what has changed in your data model & reports, and to manage how these changes are implemented. Without this tracking or control, changes are ad hoc, creating a chaotic, unsustainable process in your workflow. What’s worse, this ad hoc way of working increases the risk of mistakes that result in the business seeing incorrect data or broken features in your report.

Tracking and managing these changes across versions of a report or dataset is known as Version Control. This is often interchangeably used with Source Control, when referring to changes in source code of a dataset, such as your DAX, M/Power Query & model metadata. When hearing about Source Control Management (SCM), you may often hear it used near the termCI/CD’, or Continuous Integration/Continuous Deployment’. CI/CD goes a step beyond Source Control to integrate and automate processes between development and operations. This assures, for example, that committed changes made by different people can be merged, automatically tested, then deployed for use, with minimal or no manual intervention.

 

How can we do Source Control in Power BI?

Overview of some commonly seen methods of Source Control in Power BI (Jan 2022)

Note that this article does not yet cover the newly released PBI-Tools for Source Control.

Knowing what changes a colleague has made, or documenting & restoring specific versions requires a lot of manual effort. Outside of Deployment Pipelines - a Premium feature in Power BI service - there are no real native tools making version control easy or even possible. Many business data teams might only use SharePoint, OneDrive or another cloud storage service to store, back-up & organize the .pbix files, if source/version control is done at all. IT teams with existing SCM processes may do the same but with .pbix files in a Git Repository. While the latter has some advantages - for example, integration & automation with Azure DevOps - tracking changes at a file level is not helpful, since you have no view on how the source code in your model is changing. You need a more granular view on what’s changing in your model or reports. Imagine the below:

  • Track changes in a complex DAX or M expression & restore a previous version without disrupting the rest of your model.

  • Compare (diff.) objects between two datasets to selectively merge changes, without manually opening & manipulating the .pbix files.

  • Link automated testing processes to target specific model objects when they are changed, to ensure changing or optimizing code does not result in regression / broken visuals.

  • Use a branching strategy to collaboratively work on the same Power BI dataset as other colleagues without overwriting changes

TE Folder structure

Thankfully, however, there are external tools which help make this easier. One example is Tabular Editor (TE), a tool which helps you develop better tabular data models more effectively. Of the many features in TE, you can save a model in a serialized folder structure for future development, which splits up each object into it’s own .json file. You then connect to this folder in TE when developing your model, and can easily deploy it to the service, or - if you have the appropriate license - can use TE3’s Workspace Mode to make this process more efficient, something we will examine in Part 3 of this series on Source Control.

Bottom contents of tables > Date

This folder structure aides source control in SCM tools like Git, to track compare file changes in an intuitive user interface. Instead of tracking changes in the entire file, each object - such as a measure, a table, a relationship - will have its own, independent lineage. For a more detailed overview of Tabular Editor, its licenses and features, see this article I have previously written.


 

Good tools make development easier

 

Why is Tabular Editor’s “Save to Folder” format helpful for Source Control?

While connected to a Power BI dataset in TE, you can save the dataset as different formats. These files do not contain any of your data, but only the metadata which describes your dataset; the DAX code, Power Query, properties, relationships, and so forth. These metadata files can be used in the source control process; a change in metadata (i.e. a change in a DAX measure) will be evident in this file. To save as these files in both TE2 and TE3, you can save as a .bim format or “Save to Folder…”. The former (.bim) creates a single, flattened .json file for your model, whereas the latter (Save to Folder) will create the serialized folder structure.

As mentioned above, the main difference is the folder structure separates your model where each object has its own .json file, contained in a semantically logical directory (like ‘tables’ > ‘Date’). A comparison of .bim files vs. folder structure, and a description of the each object in the folder structure can be found below:

Example of a measure .json - LatestYear.json

  • A .bim file is a .json file containing descriptive information about your tabular model. This Analysis Services model metadata has everything about your model - from preferences, source code & object properties - but contains no data.

    The .bim file is a separate save format from the TE folder structure, but essentially is this folder structure flattened into a single file. The main difference is thus in the folder structure, the objects are divided into separate .json files so it is easier to track objects separately. In a .bim file, changes are always managed in the one file; it’s more effort to track & manage changes for specific objects.

    A .bim file can be opened or edited in other tools, and deployed from those same tools to the Power BI Service. These tools include Tabular Editor, ALM Toolkit & PBI Tools.

    The folder structure, however, can only be opened and deployed by Tabular Editor.

    In general, it’s recommended to use the Folder Structure* when developing a Power BI data model. This goes double if you have a TE3 Business or TE3 Enterprise license, which allows Workspace Mode. In Workspace Mode, simultaneous connection between the folder structure & service can streamline productivity.

    *An exception is if you do not have Power BI Premium, as Power BI Pro does not allow XMLA Read/Write for Tabular Editor to deploy a model or write changes to it. If the latter case is true, you may seek to deploy using the Power BI APIs & custom PowerShell solutions.

  • A .json file containing descriptive information of the model, such as:

    • Name: Model name

    • ID: Model guid

    • Compatibility level: 4-digit number describing the compatibility

    • Power Query groups* (folders), orders & names of queries therein.

    • Other model annotations

    • When & how it was saved to the folder structure from Tabular Editor (and by which user)

      *NOTE that queryGroups are not viewable or editable from the Tabular Editor UI in TE2 or TE3. Thus if you copy an expression from one model to another in TE, and that query is located in a query group, it will produce an error when attempting evaluation, which you can’t resolve from the TE UI itself, to my knowledge. You will need to edit this database.json to remove the query group before copying the query, or add it in to the new model’s definition.

  • User settings and Workspace Mode information (see part 3 about Workspace Mode).

    This file will be empty if no specific settings are active, or Workspace Mode was declined when connecting to the folder structure from Tabular Editor 3 for the first time.

  • The folder structure creates folders for each object group (i.e. tables, expressions) with each individual object therein as a .json file. For example, measures will be located inside the folder ‘Tables’ under the respective name of their ‘Home Table’. Individual objects have their properties all inside the definition.

Example of database.json file contents. Note the “queryGroups” key, which specifies an array of the folders where queries are organized


What is a lineageTag?
You may have seen in some of the object definitions a property called “lineageTag”. This is a GUID to identify the object in the Tabular Object Model. If you duplicate objects, make sure the lineageTag is unique, otherwise you will get errors when writing changes to the dataset


Altering the folder structure

It is recommended to use the default settings for the ‘Save to Folder’ format in Tabular Editor. However, like with many features in TE, there is flexibility here. In Tabular Editor 3, you can adjust these settings to your own preference, for example if you prefer more or less granular object serialization in the model. This might be necessary, for example, if you are adopting Tabular Editor into an environment where you already have a Power BI source control system in place.

Adjust the Save-to-folder settings in Tabular Editor to your needs

Developing with metadata files instead of a .pbix

While useful for documentation, the .bim file and folder structure formats serve a functional purpose. Tabular Editor can also read and write these file formats. This is in fact the essence of the source control process; instead of working on the .pbix or data model in the service, you work with the .bim or folder structure, which is then deployed to the service when you want to test your changes. The only difference is after deploying, you must first refresh your dataset before querying data, unless you are working in TE3’s Workspace Mode, which we will see in part 3.

You work from these files because you want to follow a workflow, illustrated conceptually below:

Scenario A (Preferred)

  1. Working changes are made to the metadata file

  2. The changes are saved (written to the file)

  3. These changes are typically:

    • Staged in logical groups

    • Committed with a note

    • Pushed to a remote repository, which tracks the changes & versions of the file

  4. Once approved, the changes are deployed to the service for testing

This is in principal not so different from the below scenario, which might be more familiar to a non-IT Power BI developer:

Scenario B (Not Preferred*)

  1. Working changes are made to the .pbix

  2. Changes are saved & synchronized to a OneDrive or SharePoint folder

  3. The .pbix is published to the service for testing and/or deployment

* Of course, one should never overgeneralize, but typically this is the least desired scenario.

Despite conceptual similarities, one may appreciate the advantages of Scenario A over B:

  • Scenario A - changes can be versioned, tracked & compared per model object

    Scenario B - no changes are visible except for file size, unless the file is opened

  • Scenario A - there is a framework for automation (to test, deploy, etc.)

    Scenario B - only minor automation is possible with duct tape & dependencies

  • Scenario A is robust & can mature, rigid by design to enforce governance & development hygiene which enables more scaleable & sustainable environments

    Scenario B is flexible but ad hoc, vulnerable to the human chaos, where collaboration is troublesome, at best.


A simple, high-level overview of different scenarios, visualized


What about the folder structure makes it helpful for source control?

Life’s better with source control in place!

As mentioned already, if each object is self-contained in its own .json, it can have its own independent lineage in your SCM process. This makes it easier to document or compare changes for specific objects in your model - like complex measures or M code in Power Query. For example, if a complex DAX measure has evolved through a dozen or so iterations, it is easy to see what those changes were and - with meaningful commit messages - why they were made.

The folder structure itself is not doing any source control; Tabular Editor has no specific source control functionality (although Git integration is on the roadmap). Rather, this format simply makes it easier for you to use it as an input for your source control processes with other tools. In the next article, we’ll work step-by-step to set up a source control process with this folder structure, so you can leverage this format to easily track changes in your Power BI data model.


We’ve now learned

  • What is Source Control and why is it important

  • What is the TE Folder Structure / .bim save formats, and how to use them

  • How the TE Folder Structure save format enables better Power BI source control

Next up: Walk step-by-step through a tutorial to set up your own Power BI source control, using Tabular Editor 3, Visual Studio and an Azure DevOps repository.

Create your own Source Control process using the TE folder structure and a DevOps Repo


Track Changes in Power BI: Part 2 - Tutorial

Track Changes in Power BI: Part 2 - Tutorial

A Beginner's Guide to Tabular Editor 3: UI Customization & Code Assist Features

A Beginner's Guide to Tabular Editor 3: UI Customization & Code Assist Features

0