Autoscaling in Kubernetes, or in any cloud environment, is essential for enterprises as it enables users to automatically add and remove instances to meet the demands of the workload. It’s an effective cloud cost optimization approach, especially for enterprises that rely on Kubernetes for their processes.

However, autoscaling in Kubernetes can result in the opposite of an enterprise’s intentions. Cloud cost optimization solutions designed to minimize waste can instead raise your costs and cause your cloud environment to overprovision resources.

How does this happen? Most application developers request large amounts of resources for their apps in an effort to cover the worst-case scenarios. What if the input data is 10x larger this time? What if the new code needs twice the memory and twice the CPU this time? Developers often do not have the information to know exactly how much they need to allocate to handle worst-case scenarios. Sometimes, they don’t even know how much to allocate to handle typical scenarios.

Developers want their applications to succeed and finish in a reasonable amount of time, even in those worst-case scenarios. This desire forms the basis of their cloud cost optimization approach. That’s why they ask for large allocations. The reality is that such worst-case scenarios happen very rarely. On average, applications use only a fraction of the allocated resources. In fact, some studies show that 32% of a cloud budget goes to waste, in part because of overallocation.

The inefficiencies of cloud autoscalers

Unfortunately, autoscaling is implemented based on resource allocation rather than resource utilization. Cloud autoscalers add more instances when the scheduler cannot place more applications on the cluster because all the existing resources have already been allocated, whether or not those resources are actually in use.

Imagine a cluster with a single 16-core instance. Let’s say an application requests all 16 cores. The application may end up using only eight cores, but the autoscaler does not know that.

As more applications are submitted requesting more cores, the autoscaler will add more instances even though the existing instance is only 50% utilized. If new applications also use just a fraction of their allocations, the new instances will also be underutilized.

The result is many more wasteful instances and, ultimately, inflated cloud bills. 
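
To make the arithmetic concrete, here is a minimal Python sketch of allocation-based scaling. It is an illustration only, not how any particular autoscaler is implemented: the 16-core instance size, the `App` class, and both function names are assumptions made for this example.

```python
from dataclasses import dataclass

CORES_PER_INSTANCE = 16  # assumed instance size, for illustration only


@dataclass
class App:
    requested_cores: int  # what the developer asked for
    used_cores: int       # what the app actually consumes


def instances_needed_by_allocation(apps, cores_per_instance=CORES_PER_INSTANCE):
    """Count instances when apps are packed by requested cores (first fit)."""
    free = []  # unallocated cores remaining on each instance
    for app in apps:
        if app.requested_cores > cores_per_instance:
            raise ValueError("request is larger than a whole instance")
        for i, cores in enumerate(free):
            if cores >= app.requested_cores:
                free[i] -= app.requested_cores
                break
        else:
            # No instance has enough unallocated cores, so one more is added.
            free.append(cores_per_instance - app.requested_cores)
    return len(free)


def average_utilization(apps, instance_count, cores_per_instance=CORES_PER_INSTANCE):
    """Fraction of provisioned cores that the apps actually use."""
    used = sum(app.used_cores for app in apps)
    return used / (instance_count * cores_per_instance)


# Four apps that each request 16 cores but use only eight.
apps = [App(requested_cores=16, used_cores=8) for _ in range(4)]
count = instances_needed_by_allocation(apps)
print(count, average_utilization(apps, count))  # prints: 4 0.5
```

Under these assumed numbers, every new application triggers a new instance, and half of the provisioned cores sit unused.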

Pepperdata Capacity Optimizer: Superior cloud cost optimization

Pepperdata Capacity Optimizer takes cloud cost optimization to a higher level. It solves the problem of inefficient cloud autoscalers by enabling the scheduler or cluster manager to schedule workloads based on resource utilization instead of resource allocation, both in YARN and in Kubernetes.

Once the configured resource utilization is achieved, the autoscaler adds more instances. Cloud cost optimization through Pepperdata Capacity Optimizer not only maximizes the utilization of each existing instance but also ensures that, in an autoscaling environment, new instances are added only when the existing instances are fully utilized. Pepperdata manages the autoscaling behavior of the cloud platforms so that you don’t have to.

To optimize cloud costs in both Kubernetes and YARN, Capacity Optimizer works with the autoscaler as follows:

  1. The autoscaler adds new instances.
  2. The autoscaler waits for Capacity Optimizer to maximize resource utilization.
  3. Capacity Optimizer reclaims wasted resources, so the scheduler launches more apps.
  4. If apps are still pending, the autoscaler logic kicks in and the process returns to step 1.
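
The four steps above can be sketched as a small simulation. This is a simplified model written for this article, not Pepperdata's implementation: the function name, the `target_utilization` parameter, and the assumption that each app's real usage is known up front are all hypothetical.

```python
def run_until_scheduled(pending, cores_per_instance=16, target_utilization=1.0):
    """Simulate utilization-based scheduling.

    pending: list of (requested_cores, used_cores) tuples.
    Returns the measured core usage of each instance once every app is placed.
    """
    instances = []  # measured cores in use on each instance
    queue = list(pending)
    budget = cores_per_instance * target_utilization
    while queue:
        # Step 1: the autoscaler adds a new instance.
        instances.append(0)
        # Steps 2-3: before scaling again, place apps on capacity that is
        # allocated but unused, judged by real usage rather than by request.
        still_pending = []
        for requested, used in queue:
            for i, load in enumerate(instances):
                if load + used <= budget:
                    instances[i] += used
                    break
            else:
                still_pending.append((requested, used))
        if len(still_pending) == len(queue):
            # Nothing fit even on an empty instance; stop instead of looping.
            raise ValueError("an app uses more than one instance can provide")
        # Step 4: apps still pending send the loop back to step 1.
        queue = still_pending
    return instances


# Four apps that each request 16 cores but use only eight.
print(run_until_scheduled([(16, 8)] * 4))  # prints: [16, 16]
```

With the same hypothetical workload of four apps that each request 16 cores and use eight, this model ends with two fully used instances, where packing by requested cores would need four.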

Similarly, cloud platforms do not downscale even when many instances are idle or nearly idle. In such scenarios, if the utilization of a certain number of instances falls below a certain threshold, Capacity Optimizer instructs the autoscaler to downscale.
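
A downscale rule of this shape might look like the following sketch. The threshold and the instance count are placeholder values chosen for illustration, and `should_downscale` is a hypothetical helper, not part of any Pepperdata or Kubernetes API.

```python
def should_downscale(utilizations, idle_threshold=0.10, min_idle_instances=2):
    """Return True when enough instances sit below the idle threshold.

    utilizations: per-instance utilization as fractions between 0 and 1.
    idle_threshold and min_idle_instances are illustrative defaults only.
    """
    idle = [u for u in utilizations if u < idle_threshold]
    return len(idle) >= min_idle_instances


print(should_downscale([0.90, 0.05, 0.02]))  # prints: True
print(should_downscale([0.90, 0.80, 0.02]))  # prints: False
```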

Cloud cost optimization through Pepperdata Capacity Optimizer means fewer instances are needed to get the same work done, which translates into direct cost savings for you.

In fact, Pepperdata sees a 30% reduction in query duration and a 35% increase in workload capacity when Capacity Optimizer is enabled on a Kubernetes cluster running a standard big data analytics benchmark. For more details, download our Pepperdata for Amazon EKS datasheet.

Optimizing autoscaling in your cloud cluster

Would you like to see similar results in your cloud cluster? We’d welcome the opportunity to help you achieve them. Please register for your free trial of Pepperdata or email us at any time with your questions at
