Pepperdata’s ability to halve cloud costs at top enterprises may seem radical and new, but it’s absolutely not. Pepperdata has been hardened and battle tested since 2012, and our software is currently deployed on about 100,000 instances and nodes across some of the largest and most complex cloud deployments in the world. We’re an AWS ISV Accelerate partner focused on helping customers save money running Spark on Amazon EMR and Amazon EKS. And we’ve helped our customers save over $200 million along the way. 

So how exactly does Pepperdata achieve these seemingly amazing results? We leverage the power of optimization at the application framework layer.

App framework optimization screenshot

Stepping back for a moment, let’s think about your FinOps activities to drive cloud cost optimization. You’re probably already doing a lot of these things to corral your cloud costs at the platform or infrastructure level: 

  • Implementing Savings Plans
  • Purchasing Spot Instances and Reserved Instances
  • Manually tuning your platform
  • Enabling Graviton instances
  • Rightsizing instances
  • Making configuration tweaks

… and the list goes on.

In fact, Pepperdata’s customers are on average already doing three to five of these things before they even come to us. And they continue to do them after they start working with us. These are platform-level optimizations, and we encourage you to keep doing what you can to ensure your platform is as streamlined as possible.

The Pepperdata Difference at the Application Framework Level

Pepperdata solves a completely different problem: optimization at the application framework level. You might be operating the most efficient platform in the world, but if your applications are overprovisioned, they’re going to use that platform inefficiently. And these inefficiencies can be significant across your batch applications like Spark. On average, typical applications can be overprovisioned by 30 to 50 percent or sometimes more. Pepperdata’s internal analysis of randomly selected customer clusters revealed that only 50 percent of allocated resources were used for 50 percent of the time.

That’s a lot of waste!

What Would You Do With an Extra 30% (K8s version)

Statistic taken from Flexera 2023 State of the Cloud Report

That’s not your fault, nor is it the fault of your developers. It’s an inherent issue in application resource usage. Your developers have literally no choice but to request a certain allocation of memory and CPU for their applications; otherwise their applications get killed. The only resource management tool that your developers have in their toolkit is to reduce their requests and try to get as close to peak as possible.

Waste Due to Peak Provisioning

The reality is that during the application runtime, memory and CPU usage aren’t static; they go up and down. When the application is not running at peak, which might be as much as 80 or 90 percent of the time, that’s waste that you’re paying for. 

That means your developers allocate the memory and CPU they think their applications will need at peak, even if that peak time only represents a tiny fraction of the whole time the application is running. The difference between the developer allocation and the actual usage is waste, which translates into unnecessary cost. 
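To make that gap concrete, here is a minimal Python sketch of the arithmetic. It is only an illustration, not Pepperdata's implementation: the `Sample` structure, the function names, and the numbers are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One usage reading for an application (hypothetical structure)."""
    hours: float           # length of the interval this reading covers
    memory_used_gb: float  # memory actually in use during that interval


def wasted_memory_gb_hours(allocated_gb: float, samples: list[Sample]) -> float:
    """Sum the gap between the allocation and actual usage over time."""
    return sum(max(allocated_gb - s.memory_used_gb, 0.0) * s.hours for s in samples)


def waste_fraction(allocated_gb: float, samples: list[Sample]) -> float:
    """Share of the allocated GB-hours that was never used."""
    allocated = allocated_gb * sum(s.hours for s in samples)
    if allocated == 0:
        return 0.0
    return wasted_memory_gb_hours(allocated_gb, samples) / allocated


# Made-up job: sized for a 64 GB peak that lasts 1 hour out of 10.
samples = [Sample(hours=9.0, memory_used_gb=16.0), Sample(hours=1.0, memory_used_gb=64.0)]
wasted = wasted_memory_gb_hours(64.0, samples)
fraction = waste_fraction(64.0, samples)
print(f"{wasted:.0f} GB-hours wasted ({fraction:.1%} of the allocation)")
```

In this made-up case the job is sized correctly for its peak, yet roughly two thirds of the memory it reserves goes unused.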

Optimization level graph

Another issue is that allocating memory and CPU is not a “one and done” operation. The work profile or data set might change at any time, requiring allocations to be adjusted accordingly. Running your applications then becomes a never-ending game of whack-a-mole, constantly tuning and re-tuning to get close to actual usage.

What’s worse, the scheduler doesn’t even know the allocated resources are not being used. Your developers have requested a certain level of resources, and the scheduler gives them what they asked for. As far as the scheduler is concerned, you asked for this level of resources, so you must be using that level.

As a result, the system looks fully saturated and unable to take any more workload. The scheduler is not aware that applications may very well not be using all those requested resources at all times.

So when the system looks fully saturated and more applications come along, the scheduler has only two options:

  1. The scheduler can put the new workloads or applications into a queue or pending state until resources free up.
  2. The scheduler can trigger the autoscaler to spin up new instances at additional cost, even though your existing resources are not fully utilized.

Either way, you lose: you end up decreasing your job throughput unnecessarily or paying for resources you don’t need.
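The two options above can be sketched in a few lines of Python. This is a deliberately simplified model of a scheduler that only sees allocations, not the behavior of any specific scheduler; the `Node` fields, the function names, and the numbers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    capacity_gb: float
    allocated_gb: float  # what applications requested
    used_gb: float       # what they are actually using right now


def fits_by_allocation(node: Node, request_gb: float) -> bool:
    """An allocation-only scheduler ignores used_gb entirely."""
    return node.allocated_gb + request_gb <= node.capacity_gb


def place(nodes: list[Node], request_gb: float, autoscaling_enabled: bool) -> str:
    """Return 'scheduled', 'pending', or 'scale_up' for a new request."""
    if any(fits_by_allocation(n, request_gb) for n in nodes):
        return "scheduled"
    return "scale_up" if autoscaling_enabled else "pending"


def idle_gb(nodes: list[Node]) -> float:
    """Memory that is allocated but not in use across the cluster."""
    return sum(n.allocated_gb - n.used_gb for n in nodes)


# Made-up cluster that looks saturated by allocation but is about half idle.
cluster = [Node(128.0, 128.0, 60.0), Node(128.0, 120.0, 50.0)]
print(place(cluster, 16.0, autoscaling_enabled=False))  # pending
print(place(cluster, 16.0, autoscaling_enabled=True))   # scale_up
print(idle_gb(cluster))                                 # 138.0
```

The 16 GB request is queued or triggers a scale-up even though 138 GB of allocated memory is sitting idle, because that idle memory is invisible to an allocation-only view.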

Pepperdata solves this problem by changing the equation. Pepperdata provides the scheduler with real-time visibility into what’s actually available in your cluster, second by second, node by node, instance by instance. Pepperdata removes the blinders from the scheduler and provides it with a real working picture of what’s going on.

Pepperdata uses machine learning to intelligently inform the scheduler which instances can take more workload, dynamically and in real time. It’s real-time FinOps. If there’s a terabyte of waste in the cluster, for example, Pepperdata will scale up the effective size of the cluster by a terabyte in real time in response to this waste, knowing that more resources are available. As a result, your cluster is no longer bound by allocations.

Savings Power at Your Fingertips

And all of this is configurable: you choose your level of optimization. If your goal is 85 percent maximum utilization, for example, Pepperdata will automatically and continuously work with the scheduler to hit 85 percent utilization on a node-by-node basis, assuming the workload is there to achieve that rate. Pepperdata also has the intelligence to back off as utilization gets close to the target you set.
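One simple way to picture a utilization target with back-off is a headroom calculation per node. The Python sketch below is an assumption made for illustration, not Pepperdata's algorithm; the function name and its parameters are hypothetical.

```python
def extra_schedulable_gb(capacity_gb: float, used_gb: float,
                         target_utilization: float = 0.85) -> float:
    """Memory a node could still accept before actual usage reaches the target.

    Hypothetical model: headroom shrinks as usage approaches the target and
    is zero at or above it, which is the back-off behavior.
    """
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    return max(target_utilization * capacity_gb - used_gb, 0.0)


# Made-up 100 GB node at three different usage levels.
for used in (40.0, 80.0, 90.0):
    print(used, round(extra_schedulable_gb(100.0, used), 2))
```

With a 100 GB node and an 85 percent target, the headroom drops from about 45 GB at 40 GB used, to about 5 GB at 80 GB used, to nothing once usage passes the target.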

Bottom line, by optimizing at the application framework layer, Pepperdata delivers higher utilization, better parallelism, greater throughput, and reduced cost in your clusters. On average, our customers are enjoying 30 percent savings, and up to 47 percent, for Spark workloads on Amazon EMR and Amazon EKS, automatically. And that’s across some of the largest, most complex, and most highly scaled clusters in the world, including customers in the Fortune 10.

Customer Successes in a Block

It’s Easy to Get Started

You don’t need an engineering sprint or a quarter to plan for Pepperdata. It’s super simple to try out. In a 60-minute call we’ll create a Pepperdata dashboard account with you.

Pepperdata is installed via a simple bootstrap script into your Amazon EMR environment and via Helm chart into Amazon EKS. You don’t need to touch or change your applications. Pepperdata deploys onto your cluster, and all the savings are automatic and immediate, with average savings of 30 percent. It’s totally free to test in your environment.

We want to make it easy and risk-free for you to try out Pepperdata in your own environment. Pepperdata offers a free 2-day Savings Assessment that helps you visualize exactly how much application waste you still have in your clusters, even after all your manual platform optimizations. 

Your Customized Savings Assessment will contain:

  • Total estimated waste in terms of memory hours, core hours, and instance hours
  • Top 10 most wasteful queues by memory hours, core hours, and instance hours wasted
  • Estimated savings from running Pepperdata Capacity Optimizer in your environment
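For a rough sense of how a "most wasteful queues" ranking could be computed, here is a small Python sketch. It is not how the Savings Assessment is actually produced; the record layout, the function name, and the sample data are assumptions for illustration, and it covers memory hours only.

```python
from collections import defaultdict

# Each record: (queue name, allocated GB, used GB, hours). Hypothetical layout.
Record = tuple[str, float, float, float]


def top_wasteful_queues(records: list[Record], n: int = 10) -> list[tuple[str, float]]:
    """Rank queues by wasted memory hours (allocated minus used, times duration)."""
    waste: dict[str, float] = defaultdict(float)
    for queue, allocated_gb, used_gb, hours in records:
        waste[queue] += max(allocated_gb - used_gb, 0.0) * hours
    return sorted(waste.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up sample data.
records = [
    ("etl", 100.0, 40.0, 2.0),
    ("ml", 50.0, 45.0, 10.0),
    ("etl", 20.0, 20.0, 5.0),
    ("adhoc", 10.0, 2.0, 1.0),
]
for queue, gb_hours in top_wasteful_queues(records, n=2):
    print(queue, gb_hours)
```

The same aggregation would apply to core hours or instance hours by swapping in the corresponding allocated and used values.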

To get started, visit us or drop us a note. We look forward to helping you extract the most value out of your cloud environment!

Explore More

Looking for a safe, proven method to reduce waste and cost by up to 47% and maximize value for your cloud environment? Sign up now for a free savings assessment to see how Pepperdata Capacity Optimizer can help you start saving immediately.