Rightsizing IaaS and PaaS on Microsoft Azure is one of the most important activities you can undertake when deploying your environment into Azure, and it’s something I’ve seen not get the attention that I think it warrants. It’s incredibly easy to oversize within Azure. It’s so scalable that you can deploy servers with terabytes of RAM and hundreds of cores… but of course just because you can doesn’t mean you should! That is an extreme example, but even going from a single vCPU to two vCPUs can double your running costs in some instances. Do you really need that additional vCPU? Do you really want to pay more than you need to?
I’ve spoken to folks at Microsoft about this and they are really keen to get the message across. While their revenues are driven by increasing consumption, they realise that for long-term growth a platform like Azure needs to incorporate value for money as a key element.
I think rightsizing is often overlooked for a variety of reasons and the points below are examples that I’ve seen myself, though I’m sure there are others…
- It’s a consequence of moving from on-premises single-tenant hardware, be that a physical server running a single workload, or a hypervisor running multiple VMs.
- Baseline performance of existing or new applications is not available.
- A safety-first approach: go with something bigger than you think you’ll need.
It has worked fine until now, why change it…?
On my first point above, I think moving to the cloud is at times seen as a lift and shift operation: you have a physical server of size X on-prem, so let’s deploy an Azure VM of the same size. The same approach gets applied to VMs as well: we’ve deployed a VM with 4 vCPUs and 16GB of RAM, so let’s do the same for the replacement VM we’ll deploy in Azure. I think this approach is wrong. The sizing approach taken previously when deploying internally doesn’t automatically map to deploying in Azure. Keep in mind that Azure runs on some of the latest hardware available, so going with 4 vCPUs because your VM previously had that on hardware which is now 5 years old doesn’t make sense.
Not all workloads are the same…?
My second point is, I think, an important one. By baselining I don’t mean selecting a VM size that matches what the vendor recommends, I mean actually baselining the application in question and understanding it. Is it CPU heavy? Is it IO heavy? Does the application have a busy period and then sit idle for the rest of the day or week?
There are many tools out there that you can use to baseline. On Windows, perfmon is an excellent tool for baselining, and I’ve used Dell’s DPACK software too. Remember, if you’re properly architecting a cloud infrastructure you should understand the needs of your applications and size accordingly. Not all VMs created in Azure are equal, and different series of VMs offer different performance profiles. A series VMs are entry level VMs with good general performance, but you’ll see better CPU performance from a D or F series VM with half the number of vCPUs as an A series. I’ve posted before about the various Azure VM sizes available, get a quick overview here.
One of the key takeaways here though is that you can start with one VM size and easily move to another, scaling up or down. Again, pick the VM that offers you the best value for money. Don’t deploy a VM with 2 vCPUs and 8GB of memory if 4GB is known to be sufficient, you’re literally paying for something you’ll never use.
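To make the baselining questions a bit more concrete, here is a minimal Python sketch of how you might summarise CPU samples exported from a tool such as perfmon. The function names, the 60% "busy" threshold and the sample values are all my own assumptions for illustration, not anything perfmon or Azure provides.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarise_baseline(cpu_percent, busy_threshold=60.0):
    """Summarise CPU samples (percent busy, one value per interval).

    busy_threshold is an assumed cut-off for calling an interval "busy".
    """
    busy = [s for s in cpu_percent if s >= busy_threshold]
    return {
        "mean": mean(cpu_percent),
        "p95": percentile(cpu_percent, 95),
        "peak": max(cpu_percent),
        "busy_fraction": len(busy) / len(cpu_percent),
    }


# Hypothetical samples: mostly idle, with one short busy period.
samples = [5, 6, 4, 7, 5, 80, 85, 90, 6, 5]
baseline = summarise_baseline(samples)
print(baseline)
```

A low mean next to a high peak and a small busy fraction is the "busy period, then idle" pattern mentioned above. You would want the same kind of summary for disk IO and memory before settling on a size.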
We’re stuck aren’t we…?
My third point is closely related to my second. I think VMs are oversized because, before public cloud and elastic scalability, the usual approach was to size servers with headroom for growth. This in itself wasn’t a bad approach, but it doesn’t sit well in a public cloud environment. Remember, in the case of Azure it’s Microsoft that has to provide the scalability and capacity, not you. You don’t need to oversize to guarantee your application will still run smoothly in a year’s time, because as demand grows so can the scale of your VMs. Resizing a VM takes minutes, so why pay now for a VM that is twice as large as you need when you only need that capacity in a year’s time? Scale up in a year, when you need it.
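As a rough illustration of what that headroom costs, here is a small Python sketch. The hourly prices are made up for the example (check the Azure pricing pages for real figures), and 730 hours per month is simply an assumed average.

```python
def cost_of_early_headroom(small_hourly, large_hourly,
                           months_until_needed, hours_per_month=730):
    """Extra spend from running the larger size before it is needed.

    Prices are per hour for a VM that runs continuously; all inputs
    are illustrative assumptions, not real Azure prices.
    """
    extra_per_hour = large_hourly - small_hourly
    return extra_per_hour * hours_per_month * months_until_needed


# Hypothetical: the larger size costs twice as much and the extra
# capacity is not needed for another 12 months.
extra = cost_of_early_headroom(small_hourly=0.10, large_hourly=0.20,
                               months_until_needed=12)
print(f"Extra spend over the year: {extra:.2f}")
```

With those assumed prices, the unused headroom works out to roughly 876 (in whatever currency you are billed in) for a single VM over the year.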
Reviewing what you’ve already deployed…
While I’ve mostly spoken so far about rightsizing before you initially deploy, you can just as easily resize VMs you’ve already deployed. Azure Advisor provides a quick and easy way to identify deployed VMs that are potentially oversized. It’s not perfect but it’s a start. Keep in mind that it only looks at CPU utilisation as a guide to identifying potentially oversized VMs, so if you’ve an IO heavy application that isn’t particularly CPU heavy it could be flagged as oversized when in fact the size is required to get the IO you need. I think Azure Advisor is a good start, but understanding your applications and workloads is key. If you’re serious about rightsizing you should prioritise this.
Before you think about scaling down, understand what the various sizes and series of VMs offer you. Remember that smaller VMs don’t support as many data disks, IO performance can be reduced, and some series of VMs don’t support premium storage.
If you’ve never reviewed the sizes of the VMs you’ve deployed on Azure, you should. Look at utilisation, understand what you’re paying for, and make sure you get the best value for money you can. Azure Advisor provides the ability to export findings to a CSV or PDF, so even if you don’t have access to the Portal yourself you could ask for a report to be produced (it takes seconds to generate).
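The gap described above can be sketched in Python as a check that looks at IO as well as CPU before calling a VM oversized. This is not how Azure Advisor works internally; the `VmUsage` class, the thresholds and the figures are hypothetical, and you would need to collect the metrics and look up each size’s IOPS limit yourself.

```python
from dataclasses import dataclass


@dataclass
class VmUsage:
    name: str
    p95_cpu_percent: float  # 95th percentile CPU utilisation
    p95_iops: float         # 95th percentile disk IOPS observed
    size_max_iops: float    # IOPS limit of the current VM size


def downsize_candidates(vms, cpu_threshold=20.0, io_threshold=0.5):
    """Return names of VMs where BOTH CPU and IO usage are low.

    Thresholds are assumptions: CPU below 20% and IO below half of
    what the current size allows.
    """
    return [
        vm.name
        for vm in vms
        if vm.p95_cpu_percent < cpu_threshold
        and vm.p95_iops / vm.size_max_iops < io_threshold
    ]


# Hypothetical fleet: both VMs have low CPU, but db-01 is IO heavy.
fleet = [
    VmUsage("web-01", p95_cpu_percent=8.0, p95_iops=200.0, size_max_iops=3200.0),
    VmUsage("db-01", p95_cpu_percent=10.0, p95_iops=3000.0, size_max_iops=3200.0),
]
candidates = downsize_candidates(fleet)
print(candidates)
```

A CPU-only check would flag both VMs here. Adding the IO test keeps the IO heavy one off the list.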
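Those pre-checks can be written down as a small Python sketch. The `SizeLimits` values below are invented for the example; the real limits for each VM size are in the Azure documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeLimits:
    max_data_disks: int
    premium_storage: bool


def blockers_for_resize(attached_data_disks, uses_premium_storage, target):
    """List the reasons a VM cannot move to the target size as-is."""
    problems = []
    if attached_data_disks > target.max_data_disks:
        problems.append(
            f"{attached_data_disks} data disks attached, "
            f"target supports {target.max_data_disks}"
        )
    if uses_premium_storage and not target.premium_storage:
        problems.append("target size does not support premium storage")
    return problems


# Hypothetical smaller size: limits are made up for illustration.
smaller = SizeLimits(max_data_disks=4, premium_storage=False)
issues = blockers_for_resize(attached_data_disks=6,
                             uses_premium_storage=True,
                             target=smaller)
print(issues)
```

An empty list means none of these particular limits stands in the way. It says nothing about whether IO performance on the smaller size will be good enough, which still needs a baseline.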
What about PaaS?
While I’ve focused mainly on VMs in this post, keep in mind that rightsizing applies to lots of Azure services. Take Azure SQL, for example. Are you using an elastic pool? Is it oversized? Are you paying for capacity you don’t need? It’s so easy to scale pool capacity with no downtime needed that you should be ensuring you aren’t spending money on capacity you don’t need now, when you could easily make it available within seconds when it is needed.
Have you rightsized?