Monitoring the Costs of Underutilized EBS Volumes

With the expansion of cloud technology, everyone seems to want to move their workloads to the cloud. And why wouldn’t they? The cloud is flexible, quick to deploy, and, one might assume, cheaper. But mismanagement of your resources in the cloud may be costing you hundreds of thousands of dollars each year.

With the right cloud monitoring tools, however, it’s easier than ever to control cloud costs—and that’s what we’ll look at today. Cloud Insights can not only show you where there’s waste in your infrastructure, but it can also create widgets to show you the amount of money you could save by switching between—by way of example—Amazon EC2 instances.

One of the cloud’s central selling points is its flexibility: things can be done very quickly in the cloud. You can spin up an Amazon EC2 instance on demand, right when you need it. But during the setup process for that EC2 instance, you’re offered a number of default settings. At that point, if you aren’t certain of the demands of the workload, you’re likely to choose the default EBS disk type, which is General Purpose (gp2) SSD.

Once your instance is up and running, you go about your business, using EC2 instances as part of your work. But how do you know that gp2 SSD is the correct disk type, the one that’ll give you the most bang for your buck? Equally important: how do you track your instances efficiently, without constantly checking each EC2 instance to see whether it’s getting the most from the resources it’s been allocated?

Take, for example, an EBS gp2 volume. That volume is good for max IOPS of 16,000. An EBS throughput optimized HDD (st1) volume is good for max 500 IOPS. If a workload is running on gp2 and has not exceeded 500 IOPS in the last 30 days, you might want to move it across to st1, at less than half the cost.
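The rule of thumb above can be written as a small check. This is only a sketch: the function name and the sample values are hypothetical, and it looks at IOPS alone, using the two ceilings quoted in the paragraph.

```python
# IOPS ceilings quoted in the text above.
GP2_MAX_IOPS = 16_000
ST1_MAX_IOPS = 500


def is_st1_candidate(volume_type: str, peak_iops_30d: float) -> bool:
    """Return True when a gp2 volume has not exceeded st1's IOPS ceiling.

    peak_iops_30d is the highest IOPS value observed in the last 30 days.
    """
    return volume_type == "gp2" and peak_iops_30d <= ST1_MAX_IOPS


# Hypothetical volumes: one quiet, one busy.
print(is_st1_candidate("gp2", 320))
print(is_st1_candidate("gp2", 4200))
```

A real decision would also weigh throughput, latency, and any size or boot-volume restrictions of the target type, which this sketch ignores.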

The Bottom Line: How Do You Find Underperforming Resources?

Cloud Insights can help you quickly and easily discover information about your environment that will increase efficiency while cutting costs. It shows you the high costs of a poor fit.

Let’s take a look at a Cloud Insights dashboard that both tracks the capacity growth of any cloud volume and offers insight into the volumes where your resources would perform best (and worst). I’m going to walk you through several of Cloud Insights’ functions to show you how to rein in costs with better infrastructure monitoring practices.

On the dashboard, you’ll see an overview of the total capacity across all volume types. This dashboard also shows other categories, including snapshot capacity. You can see that gp2 accounts for the largest share of capacity. That’s unsurprising, since this volume type is the default choice when creating an EC2 instance.

Capacity of EBS Volume Types

If you set the dashboard in “Edit” mode, you can open the widget in a customizable view. This mode gives you a more complete view of how Cloud Insights widgets work and the type of data they display, and it’s also a great way to learn how to build your own custom widgets and dashboards.

If you take a closer look at the options in the toolbar of the “Edit” view screenshot above, you can see that we’ve selected an asset type of VMDK. This is the common virtual disk object we use in Cloud Insights for VMware, EC2, OpenStack, Azure, GCE, and all of the other virtual compute resources that we collect.

Once you’ve selected the VMDK asset, you can choose which type of metric you want to visualize; in this case, we’re showing total capacity. Since the VMDK asset type is common across all virtual compute resources we’re collecting from, we need to take another step and filter down to “Type: EBS” in the dropdown menu.

In the next row of options, you can roll up the “Sum” of capacity for each “Type” and limit the maximum number of types you want to see. In this case, we’ve selected the top 10 types, but we only have 6 different VMDK types to display. In the last row of options, you can choose how this information is displayed (we chose a red bar chart). A pie chart may be another good way to visualize this data, and you can choose that here, too.
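To make the roll-up concrete, here is a rough Python equivalent of "sum capacity per type, keep the top 10". It does not use any Cloud Insights API; the inventory rows, type labels, and function name are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical inventory rows: (volume name, type, capacity in GB).
volumes = [
    ("vol-a", "EBS_gp2", 500),
    ("vol-b", "EBS_gp2", 1200),
    ("vol-c", "EBS_st1", 2000),
    ("vol-d", "EBS_sc1", 800),
    ("vol-e", "EBS_io1", 300),
]


def capacity_by_type(rows, top_n=10):
    """Sum capacity per type and return the top_n types, largest first."""
    totals = defaultdict(int)
    for _name, vol_type, capacity_gb in rows:
        totals[vol_type] += capacity_gb
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


top_types = capacity_by_type(volumes)
for vol_type, total_gb in top_types:
    print(f"{vol_type}: {total_gb} GB")
```

As in the widget, asking for the top 10 when fewer types exist simply returns every type.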

Although we’ve only selected a few, you can add as many filters as you need to tightly define what you want to see. For example, you can also add a filter that limits the capacity displayed to volumes smaller than 100 GB.

Next, we might want to see a historical view of consumption across sc1, st1, and gp2 volume types. In this case, we’ve selected 30 days (we store up to 90 days of data). You can see that gp2 is growing faster than any other volume type in the graphic below.

Leaving Notes for Users: Offering Context

To provide context for the performance and limitations of each EBS type in your stack, you can also add text to your dashboard. Below is an example of a “note” widget that helps dashboard consumers better understand the capabilities and limitations of various EBS types.

The screenshot below shows a note with information drawn from AWS documentation about the gp2, io1, st1, and sc1 types; this helps dashboard consumers understand the variables used to create comparisons in dashboards. The note also links to the relevant AWS pages for greater detail, as needed.

Performance recommendations

Note widgets are a great way to enhance your dashboard information, and they provide direction and detail for those less familiar with your setup. This particular AWS dashboard also contains a widget explaining to the user how to adjust the threshold, cost, and time period of the displayed data.

Calling Out Crucial Metrics

You can create “single value” widgets to call out a particularly important metric. In this example, we’ve highlighted the total used capacity of EBS gp2 volumes with a peak IOPS of less than 500, well below the maximum IOPS for this volume type. The obvious conclusion here is that a more cost-effective type could be used in place of gp2. There is 276TB of capacity that simply doesn’t require the IOPS of a gp2 volume, which means you’re overspending on IOPS that you don’t need.

GP2 Capacity Under 500 IOPs
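The single-value figure is just a filtered sum. A minimal sketch of that logic follows, with hypothetical volume records and field names rather than anything exported from Cloud Insights.

```python
# Hypothetical volume records; field names are illustrative only.
volumes = [
    {"name": "vol-a", "type": "EBS_gp2", "capacity_gb": 500, "iops_max": 120},
    {"name": "vol-b", "type": "EBS_gp2", "capacity_gb": 4000, "iops_max": 90},
    {"name": "vol-c", "type": "EBS_gp2", "capacity_gb": 1000, "iops_max": 3200},
    {"name": "vol-d", "type": "EBS_st1", "capacity_gb": 2000, "iops_max": 60},
]


def underused_gp2_capacity_gb(rows, iops_threshold=500):
    """Total capacity of gp2 volumes whose peak IOPS is below the threshold."""
    return sum(
        v["capacity_gb"]
        for v in rows
        if v["type"] == "EBS_gp2" and v["iops_max"] < iops_threshold
    )


total_gb = underused_gp2_capacity_gb(volumes)
print(f"gp2 capacity under 500 IOPS: {total_gb} GB")
```

Only vol-a and vol-b qualify here: vol-c is too busy, and vol-d is not gp2.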

The obvious question here is: how much can I save by rectifying this issue? Using what we’ve learned about single value widgets, we can create a view of how much we’re spending on this 276TB, and another view of how much we can save by moving these volumes to st1.

Monthly cost of GP2 under 500 IOPs and approximate monthly savings using st1

With Cloud Insights Standard edition, you can add these costs manually by using the “calculation” field in the widget settings. For example, if you’re running workloads in US-East, the cost for gp2 is $0.10 per GB per month and the cost for st1 is $0.045 per GB per month. To show the savings figure, you can use the same widget configuration as the one showing capacity above, but enter (A*0.1)-(A*0.045) into the calculation field, where A is the capacity in GB. The result is the monthly cost of your capacity in gp2, minus the cost of that same capacity in st1.

Approximate monthly savings
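As a worked example of the calculation, the sketch below applies the two US-East prices quoted above to the 276TB figure. The function names are hypothetical, and treating 276 TB as 276,000 GB (decimal units) is an assumption made here purely for illustration.

```python
# Prices quoted in the text for US-East, in USD per GB per month.
GP2_PRICE_PER_GB = 0.10
ST1_PRICE_PER_GB = 0.045


def monthly_cost(capacity_gb: float, price_per_gb: float) -> float:
    """Monthly storage cost for a given capacity and per-GB price."""
    return capacity_gb * price_per_gb


def monthly_savings(capacity_gb: float) -> float:
    """Mirror the widget expression (A*0.1)-(A*0.045), with A in GB."""
    return (monthly_cost(capacity_gb, GP2_PRICE_PER_GB)
            - monthly_cost(capacity_gb, ST1_PRICE_PER_GB))


# Assumption: 276 TB is treated as 276,000 GB (decimal units).
capacity_gb = 276 * 1000
print(f"gp2 cost: ${monthly_cost(capacity_gb, GP2_PRICE_PER_GB):,.2f}")
print(f"savings:  ${monthly_savings(capacity_gb):,.2f}")
```

Under those assumptions, the gp2 spend is roughly $27,600 per month and the move to st1 would save roughly $15,180 of it. Actual figures depend on current pricing and on how capacity is measured.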

The Cloud Cost feature of the upcoming Cloud Insights Premium edition will allow this cost to be ingested automatically, with no need to handle these calculations yourself.

You can also benefit from a table-style widget that tracks the names of the gp2 volumes, as well as their capacity and their maximum IOPS. That way, you can inspect particular volumes that are large in capacity and low in IOPS using the same filter criteria that you used previously (“Type” is EBS_gp2; “IOPS – Max” is less than 500).
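The table view amounts to applying that filter and sorting by capacity, so the largest low-IOPS volumes surface first. Here is a sketch with hypothetical records and field names.

```python
# Hypothetical volume records; field names are illustrative only.
volumes = [
    {"name": "vol-a", "type": "EBS_gp2", "capacity_gb": 500, "iops_max": 120},
    {"name": "vol-b", "type": "EBS_gp2", "capacity_gb": 4000, "iops_max": 90},
    {"name": "vol-c", "type": "EBS_gp2", "capacity_gb": 1000, "iops_max": 3200},
    {"name": "vol-d", "type": "EBS_st1", "capacity_gb": 2000, "iops_max": 60},
]


def underused_gp2_table(rows, iops_threshold=500):
    """Rows for gp2 volumes under the IOPS threshold, largest capacity first."""
    matches = [
        v for v in rows
        if v["type"] == "EBS_gp2" and v["iops_max"] < iops_threshold
    ]
    return sorted(matches, key=lambda v: v["capacity_gb"], reverse=True)


table = underused_gp2_table(volumes)
for v in table:
    print(f'{v["name"]:<10}{v["capacity_gb"]:>8} GB{v["iops_max"]:>8} IOPS')
```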

How Long Have You Been Overpaying?

It’s also important to understand how long a gp2 volume has tracked below 500 IOPS. In the example below, we took a closer look at the 30-day performance of a particular gp2 volume. It showed some peaks in IOPS, but always remained under 500 IOPS, which makes it a good candidate to move to an EBS st1 volume.

You can also use a similar widget configuration to check if the volume size has been consistent over time.
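Both checks described in this section, IOPS staying under the threshold and volume size staying constant over the window, can be sketched in a few lines. The sample series and function names below are invented for illustration and are not Cloud Insights output.

```python
def stayed_below(samples, threshold=500):
    """True if there is data and every sample is under the threshold."""
    return bool(samples) and all(value < threshold for value in samples)


def is_constant(samples):
    """True if there is data and every sample has the same value."""
    return bool(samples) and min(samples) == max(samples)


# Hypothetical daily peak IOPS and size (GB) for one gp2 volume over 30 days.
daily_peak_iops = [85, 140, 90, 410, 120, 95] * 5
daily_size_gb = [1000] * 30

print("peak IOPS in window:", max(daily_peak_iops))
print("always under 500 IOPS:", stayed_below(daily_peak_iops))
print("size unchanged:", is_constant(daily_size_gb))
```

An empty series is treated as "not a candidate" rather than a pass, so a volume with no collected data is never flagged by accident.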

All of this information is displayed in a single dashboard that you can use to convey up-to-the-minute information on both the value and efficiency of your AWS environment.

Gain Immediate Insight into Your Cloud Infrastructure

If you’re not yet a Cloud Insights user and you would like to give it a try, it’s easy to set up the service, and it only takes a few minutes. Navigate to Cloud Central to request a 14-day free trial to see where you can reduce your AWS costs today.