Monday, April 24, 2017

Azure Cost Optimizations: Azure VMs

Compute time is one of the most expensive services you can use in the Azure stack. The goal of this post is to discuss some of the ways you can reduce the amount of compute you pay for. Please note that the compute profile of each workload is different, so your mileage with any of the suggestions below will vary.

1) Monitor, monitor, monitor

Azure has recently released a few new monitoring features. Most of these capabilities are merging into what Microsoft calls Azure Monitor. What I particularly like about Azure Monitor is the ability to see both host-level and guest-level metrics. As with all performance monitoring of this type, you need a good understanding of how the data is sampled on the system.


In my case, I made a few changes to the sample rate for my VMs. This was particularly important during load testing I was conducting at my client.

The key point here is that you can use these monitoring tools to help determine more appropriate VM sizes for your workloads. Integrating tools such as Operations Management Suite (OMS) helps considerably, because you can trend performance over time.
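To make the right-sizing idea concrete, here is a minimal sketch in Python. It assumes you have already exported CPU percentage samples for a VM by whatever means you prefer. The size catalog, the 70% target utilization, and the function names are all hypothetical and chosen for illustration; they are not Azure VM sizes or an Azure API.

```python
import math

# Hypothetical size catalog: (name, vCPUs), ordered smallest to largest.
# These names are illustrative only, not real Azure VM sizes.
SIZES = [("small", 1), ("medium", 2), ("large", 4), ("xlarge", 8)]


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_size(cpu_percent_samples, current_vcpus, target_utilization=70):
    """Suggest the smallest catalog size that keeps p95 CPU at or below the target.

    Assumes CPU demand scales linearly with vCPU count, which is a
    simplification that real workloads may not follow.
    """
    p95 = percentile(cpu_percent_samples, 95)
    used_vcpus = p95 / 100 * current_vcpus           # vCPUs busy at the 95th percentile
    needed = used_vcpus * 100 / target_utilization   # add headroom up to the target
    for name, vcpus in SIZES:
        if vcpus >= needed:
            return name
    return SIZES[-1][0]  # nothing in the catalog is big enough; return the largest


# An 8-vCPU VM that rarely exceeds 10% CPU is a candidate for a smaller size.
print(suggest_size([5, 5, 5, 10], current_vcpus=8))
```

Using a high percentile rather than the average is a deliberate choice here: averages hide short bursts, and the sample rate you configure determines whether those bursts show up in the data at all.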

One tool that I have run into but have not had the chance to use is the Azure Virtual Machine Optimization Assessment.

2) Switch to batched/scheduled workloads

Because you pay for compute only while your virtual machine is running, you can start to experiment with batched and/or scheduled workloads. The essence here is to find an orchestration tool that manages when your virtual machines are running. Taking this one step further, the orchestration engine ideally runs each virtual machine only for the required amount of time.
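A quick back-of-the-envelope calculation shows why this matters. The sketch below compares an always-on VM with one that only runs during a scheduled window. The hourly rate of 20 cents and the 10-hours-a-day, 22-days-a-month schedule are made-up numbers for illustration, and the calculation covers compute only, assuming the VM is fully deallocated while it is off.

```python
def monthly_compute_cost_cents(rate_cents_per_hour, hours_per_day, days_per_month):
    """Compute-only cost for a month, in whole cents (integers avoid float rounding)."""
    return rate_cents_per_hour * hours_per_day * days_per_month


# Hypothetical rate: 20 cents per hour. This is not a real Azure price.
always_on = monthly_compute_cost_cents(20, 24, 30)   # runs around the clock
scheduled = monthly_compute_cost_cents(20, 10, 22)   # 10 hours on 22 working days

saved_percent = round(100 * (always_on - scheduled) / always_on)

print(f"always on: ${always_on / 100:.2f}")
print(f"scheduled: ${scheduled / 100:.2f}")
print(f"saved:     {saved_percent}%")
```

Plug in your own rate and schedule; the shorter the window in which the workload actually needs to run, the larger the gap becomes.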

Two services to look at here are Azure Batch and Azure Virtual Machine Scale Sets (VMSS). The latter is more about autoscaling, that is, trying to achieve performance curves that track the demand curves more closely.
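To illustrate what matching the demand curve buys you, here is a small simulation sketch. It is not how VMSS autoscale rules are configured; it simply compares the instance-hours consumed when capacity follows demand against provisioning for the peak all the time. The demand figures, the per-instance capacity of 100 requests per second, and the instance limits are assumptions for the example.

```python
import math


def instances_needed(demand, per_instance_capacity, min_instances=1, max_instances=10):
    """Smallest instance count that covers the demand, clamped to the allowed range."""
    wanted = math.ceil(demand / per_instance_capacity)
    return max(min_instances, min(max_instances, wanted))


# Hypothetical demand per hour (requests per second) over an eight-hour window.
hourly_demand = [50, 40, 40, 120, 300, 450, 300, 100]

# Capacity follows demand: one decision per hour.
autoscaled = [instances_needed(d, per_instance_capacity=100) for d in hourly_demand]

# Static provisioning: size for the peak and leave it running.
fixed_instance_hours = max(autoscaled) * len(hourly_demand)
autoscaled_instance_hours = sum(autoscaled)

print(autoscaled)
print(fixed_instance_hours, autoscaled_instance_hours)
```

The gap between the two totals is the compute you pay for but never use when you provision statically for the peak.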

3) Shut down/start up VMs

One technique that I am particularly fond of is automatically starting up and shutting down VMs so that they run only when they are required. There are several ways to accomplish this in Azure, including Azure DevTest Labs and Azure Automation. The former provides a VM extension that can be used to shut the machines down automatically. There are even some built-in runbooks that use tags to schedule startup and shutdown.

There are a few pros and cons to the above approaches, which I can cover in another post. Suffice it to say, one thing to consider is the order in which startup and shutdown occur. Many environments, even dev/test, can be complex and have dependencies. For this reason, I generally lean towards an Azure Automation approach to starting up and shutting down VMs.
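The ordering problem is essentially a dependency sort. As a minimal sketch of the idea (in Python, using the standard library's `graphlib`, available from Python 3.9), the example below derives a startup order from a dependency map and reverses it for shutdown. The VM names and the dependency map are hypothetical, and the code only computes the order; actually starting and stopping the machines would be up to your runbook or orchestration tool.

```python
from graphlib import TopologicalSorter

# Hypothetical environment: each VM maps to the VMs that must be running
# before it starts. These names are made up for the example.
DEPENDENCIES = {
    "dc01": set(),         # domain controller, no prerequisites
    "sql01": {"dc01"},     # database needs the domain controller
    "app01": {"sql01"},    # application tier needs the database
    "web01": {"app01"},    # web tier needs the application tier
}


def startup_order(dependencies):
    """Return VM names so that every VM appears after the VMs it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


def shutdown_order(dependencies):
    """Shut down in the reverse of the startup order, dependents first."""
    return list(reversed(startup_order(dependencies)))


print(startup_order(DEPENDENCIES))
print(shutdown_order(DEPENDENCIES))
```

`TopologicalSorter` raises `graphlib.CycleError` if the map contains a circular dependency, which is a useful early warning that the environment definition itself needs fixing.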

There are a host of methods and processes for tuning VMs (not just in Azure). You can use most of those techniques with Azure; you just have to change how you get access to the underlying metrics you rely on. As always, shutting down VMs when they are not needed can save quite a bit of money in the long run.