Fabric Billing Part 4- Implications of pause and restart



Pause and restart is a feature of F SKUs that was not available with the earlier P SKUs. It allows you to pause a capacity and thereby stop charges. However, pausing also incurs an immediate charge equal to the Pay As You Go (PAYG) price of all the CU debt you have accumulated in future timepoints (meaning up to 24 hours of costs). Pausing also cancels any active jobs (e.g. refreshes).

Note: If you are planning to cancel a SKU, you can consider this as an indefinite pause for the purposes of this article.


Why Pause and Restart?

There are two main reasons why you might want to pause a capacity:

  1. To save costs while you are not using it

  2. To clear the capacity debt


Considerations for pausing as a method of saving money

1) The pause needs to last long enough to offset the one-off cost of clearing the future capacity debt

In extreme scenarios, a pause will only save money if the restart occurs more than 24 hours after the pause. The reason is that capacity debt can reach 100% of the next 24 hour period, meaning you would have to pay back one day's worth of capacity costs at the PAYG rate upon pausing.

Let’s imagine you have an F2 which has used 100% of future capacity for the next 24 hours and you want to shut it down over the weekend to save money. When you click pause you have to pay off that debt at pay as you go prices. This would cost you approximately $9.12 (an F2 at PAYG price for 1 day).

In other words, if you are hoping for the best of both worlds, keeping the flexibility of Pay As You Go with costs nearer those of an RI, be aware that pausing for 2/7ths of the week (i.e. the weekend) may only generate a 1/7th saving: roughly 14% rather than 29%. Neither of these comes close to the ~40% saving of an RI.
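The weekend arithmetic above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: one pause per week, with the debt settled on pausing expressed as a fraction of one day's PAYG cost. The function and parameter names are hypothetical and not part of any Fabric tooling.

```python
def net_weekly_saving_fraction(paused_days: float, debt_fraction: float) -> float:
    """Fraction of a 7-day PAYG bill saved by one pause per week.

    paused_days:   length of the pause in days (2 for a weekend).
    debt_fraction: share of the next 24 hours already consumed as
                   capacity debt when the pause happens (0.0 to 1.0).
    """
    if not 0.0 <= debt_fraction <= 1.0:
        raise ValueError("debt_fraction must be between 0 and 1")
    # Pausing settles the debt immediately, which costs
    # debt_fraction days of PAYG and eats into the saving.
    return (paused_days - debt_fraction) / 7


# Worst case from the text: a full 24 hours of debt at the moment of pausing.
print(f"{net_weekly_saving_fraction(2, 1.0):.1%}")  # 14.3%
# Best case: no accumulated debt at all.
print(f"{net_weekly_saving_fraction(2, 0.0):.1%}")  # 28.6%
```

A real capacity will usually land somewhere between these two figures, depending on how much background work has been smoothed into the next 24 hours.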

If you are wondering why you are charged PAYG prices for a restart even if you have a reserved capacity, the reason is that the reservation grants you a fixed amount of capacity 24 hours a day, 365 days a year, split into 30 second time windows. If you pause and restart, you clear your capacity debt, which means paying off a large amount of accumulated debt in the next 30 second window. In short, you are paying off more debt than the reservation covers in that window.

Approximate cost of a pause when debt reaches 100% for 24 hours on an F2

Estimating the accumulated debt for the upcoming 24 hours isn't straightforward. Every single activity carried out on a capacity is smoothed over a window (normally 5 minutes for interactive operations and 24 hours for background). The Capacity Metrics App doesn't provide a headline figure of average accumulated debt for the upcoming 24 hours and I don't currently have a reliable method to share for doing so. However, generating a perfect estimate isn't usually needed. Knowing that on a busy capacity a pause will cost somewhere between 50% and 100% of your daily PAYG rate is normally good enough for estimating whether that pause is worthwhile.

Once a pause occurs though, it is possible to make these calculations. Upon pausing you will see a huge spike in the "utilisation" pane of your metrics app. This is because all future debt is aggregated into the next 30 second window. It is not uncommon to see spikes of 200,000%+ utilisation. From this, you can calculate how long a pause must continue to save money.

The methodology for doing this is fairly straightforward. As all debt for the next 24 hours has been aggregated into a single 30 second timepoint, we just have to reverse this process and get an average utilisation for a single 30 second timepoint over the 24 hour window. We do this by dividing the utilisation (200,000% in the above example) by 2,880 (the number of 30 second time slots in 24 hours). We can then multiply this percentage by 24 hours to work out how many hours you need to pause to break even.
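As a rough sketch of that calculation in Python (the function name is my own, and the input is the spike percentage read by eye from the utilisation chart after the pause):

```python
SLOTS_PER_DAY = 24 * 60 * 2  # 2,880 thirty-second timepoints in 24 hours


def break_even_pause_hours(spike_percent: float) -> float:
    """Hours a pause must last before it starts to save money.

    spike_percent: utilisation spike seen after pausing, e.g. 200_000
                   for a 200,000% spike.
    """
    # Spread the spike back over the 24 hour window to recover the
    # average utilisation of a single 30 second timepoint.
    average_utilisation = (spike_percent / 100) / SLOTS_PER_DAY
    # That share of one day is the pause length needed to break even.
    return average_utilisation * 24


print(f"{break_even_pause_hours(200_000):.1f} hours")  # 16.7 hours
```

A spike of 288,000% corresponds to a full 24 hours of debt, which is the extreme case described earlier.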

We can also use the same percentage and multiply it by the daily PAYG rate of our capacity to see the monetary cost of the pause. This should match, or be close to, the cost you see on your bill. Below I have included a chart to show the length of pause required to break even (applies to all SKUs) along with the approximate cost of a pause for a European F64. Thanks to my colleague Ross Couldrey, MMA who used a different methodology to reach the same results and gave me the idea.

Table showing minimum pause durations to save money based on utilisation spike after pause.
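If you want to build a similar table for your own capacity, the following Python sketch applies the method described above to a list of spike values. The daily rate used here is a deliberately made-up placeholder, not a real Fabric price, and the spike values are arbitrary examples. Substitute the daily PAYG rate for your own SKU and region.

```python
SLOTS_PER_DAY = 2_880  # thirty-second timepoints in 24 hours


def pause_table(spike_percents, daily_payg_rate):
    """Return (spike %, break-even hours, approximate pause cost) rows."""
    rows = []
    for spike in spike_percents:
        # Share of one day's capacity that was held as debt when pausing.
        debt_fraction = spike / 100 / SLOTS_PER_DAY
        rows.append((spike, debt_fraction * 24, debt_fraction * daily_payg_rate))
    return rows


# Placeholder only: replace with the real daily PAYG price of your capacity.
HYPOTHETICAL_DAILY_RATE = 100.0

for spike, hours, cost in pause_table(
    [50_000, 100_000, 200_000, 288_000], HYPOTHETICAL_DAILY_RATE
):
    print(f"{spike:>8,}%  {hours:5.1f} h  {cost:7.2f}")
```

The break-even hours column is the same for every SKU, because it depends only on the spike. Only the cost column changes with the rate you pass in.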

2) You have more active SKU units than reserved units

Remember that reservations are year-long commitments. Pausing a capacity never stops that commitment. This means that pausing is only ever worth doing as a method to save money if you have other active SKUs that in total match or exceed the reservation for that scope (account / subscription / resource group / region, see part 3).

For instance, if you have an F4 SKU and an F4 reservation you would gain no monetary benefit from a pause because you continue to pay the reservation. However, if you had an F2 and F4 SKU and a reservation of F4, you could pause the F2 as a way of saving money (taking into account the cost of the initial pause).

Review reservations before pausing as a method of saving money. In the second example above, if you paused the F4 instead of the F2, you would continue paying for the F4 reservation even though you only have 2 active SKU units.
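The examples above can be expressed as a small Python sketch. It rests on a simplified model that is my assumption rather than a statement of how Azure billing is implemented: all capacities sit in the same reservation scope, the reservation covers active units up to the reserved amount, and anything above that is billed at PAYG. The function names are hypothetical, and the one-off cost of the pause itself is ignored.

```python
def payg_units(active_units: int, reserved_units: int) -> int:
    """Units billed at PAYG: whatever the reservation does not cover."""
    return max(0, active_units - reserved_units)


def unused_reserved_units(active_units: int, reserved_units: int) -> int:
    """Reserved units you keep paying for with nothing running on them."""
    return max(0, reserved_units - active_units)


def units_saved_by_pausing(active_skus, paused_sku, reserved_units):
    """PAYG units no longer billed after pausing one capacity."""
    total = sum(active_skus)
    before = payg_units(total, reserved_units)
    after = payg_units(total - paused_sku, reserved_units)
    return before - after


# An F4 SKU with an F4 reservation: pausing saves nothing.
print(units_saved_by_pausing([4], paused_sku=4, reserved_units=4))     # 0
print(unused_reserved_units(active_units=0, reserved_units=4))         # 4

# An F2 and an F4 with an F4 reservation: pausing the F2 saves 2 units.
print(units_saved_by_pausing([2, 4], paused_sku=2, reserved_units=4))  # 2
```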

Considerations for pausing as a method of clearing capacity debt

Pausing a capacity is an excellent short-term mitigation for capacity overload. Imagine you have a small capacity and you accidentally refresh a semantic model without applying a filter and import 1 billion rows instead of 1 million. Your capacity has smoothed out the CU over the next 24 hours and now your capacity is so busy users can’t query Power BI reports. It’s month end and everyone’s rather upset. In this scenario waiting 24 hours for the debt to clear is unacceptable.

A pause and restart allows you to clear that accumulated debt immediately, at the cost of paying for that debt at the Pay As You Go rate. As a one-off this can be a fantastic tool to have at your disposal, and financially a much more cost-effective option than autoscale in the older P SKUs.

However, if your capacity is regularly paused and restarted because of consistent overloads, the higher tariff associated with Pay As You Go plus the inconvenience of jobs being cancelled during the pause may justify a review of the operations taking place on the capacity and, if necessary, the purchase of a larger capacity under an RI.


What’s Next?

The final part of this series looks at scaling up and scaling down capacities. This is a great feature for ensuring your available compute matches your needs and, in some cases, for saving costs. However, this can become complex to manage, and it is important to have a solid understanding of the concepts discussed in the first four parts of this series to fully understand the implications of scaling up and down.

Lutz Bendlin

Analytics and Insights North America at Hewlett Packard Enterprise

3mo

What about pausing a SKU as a means to reset it because it is misbehaving?

Greg Deckler

DAX is easy, CALCULATE makes DAX hard...

3mo

So, if I understand this correctly, you can never get ahead, you can only ever go into debt. And, since you are constantly kept in debt, when you restart or pause a reserved instance, you not only continue to pay for the reservation (its a reservation after all) but then on top of that you are up charged an additional 170% to pay for the PAYG. So, any debt effectively costs you 270%. Did I get that correct?


Matthew Farrow Thanks for an awesome articles series! As I understand it, background rejection stops new jobs from being run, but doesn't stop already running jobs. Let's say our capacity is already at a stable plateau at 90% utilization. Then someone on the capacity introduces a poorly designed job that uses bursting (let's say 3x the capacity) and runs for 24 hours. Is it possible that such a job will cause more than 24 hours overhead on a capacity? Thanks

Dat TRIEU

Data enthusiast | Microsoft Certified Associate: (1) Fabric Analytics Engineer, (2) Azure Enterprise Data Analyst, (3) Power BI Data Analyst | Blogger | MSc - Quantitative Economics

3mo

Thanks Matthew Farrow for this fantastic series. If I understand you correctly, when our capacity (F64 RI price) is overloaded (not so often), we pause and resume it immediately, we pay an extra maximum of 1 day of F64 (with PAYG price) right? And to be more precise, the extra payment is somewhere between 0 and 1 day of F64 PAYG, depending on the debt that we incurred while being overloaded?

Wojciech Bukowski

➥ 🅳🅐🅣🅐 🅿🅛🅐🅣🅕🅞🅡🅜 🅰🅡🅒🅗🅘🅣🅔🅒🅣

3mo

Does that only apply for "dept" - bursting amount ? Or if I have let's say 1 pipeline that run for 1 hour and I stop capacity after the job is finished (and let's assume this job did not burst my capacity) . Will I pay anything extra after pausing ?
