Pepperdata Capacity Optimizer automatically optimizes your cluster resources, recapturing wasted capacity so you can run more applications and get the most out of your infrastructure investment.
On a typical cluster, Capacity Optimizer uses machine learning (ML) to make thousands of decisions per second, analyzing the resource usage of each node in real time. The result: CPU, memory, and I/O resources are automatically optimized to increase utilization, and waste is eliminated in both Kubernetes and traditional big data environments. Capacity Optimizer rapidly identifies where more work can be done and adds tasks to nodes with available resources. Even the most experienced operator dedicated to resource management can’t make manual configuration changes with that precision and speed.
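To make the idea concrete, the core of resource-aware task placement can be sketched as a filter over real-time node metrics: only nodes with CPU and memory headroom receive additional work. This is a minimal illustrative sketch, not Pepperdata's actual algorithm; the `Node` class, field names, and thresholds are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """Hypothetical snapshot of a node's real-time utilization (fractions 0..1)."""
    name: str
    cpu_used: float
    mem_used: float

def nodes_with_headroom(nodes, cpu_limit=0.85, mem_limit=0.85):
    """Return nodes whose current usage leaves room for more tasks.

    The limits are illustrative cutoffs; a production optimizer would
    also weigh I/O, trends over time, and per-task resource requests.
    """
    return [n for n in nodes if n.cpu_used < cpu_limit and n.mem_used < mem_limit]

nodes = [
    Node("node-1", cpu_used=0.40, mem_used=0.55),
    Node("node-2", cpu_used=0.92, mem_used=0.70),  # CPU-saturated, skipped
    Node("node-3", cpu_used=0.30, mem_used=0.25),
]
candidates = nodes_with_headroom(nodes)
print([n.name for n in candidates])  # → ['node-1', 'node-3']
```

In practice this decision runs continuously across every node, which is why the text contrasts it with manual tuning: no operator can re-evaluate placement thousands of times per second.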
In cloud environments, autoscaling provides the elasticity you need for your big data workloads, but it often leads to uncontrolled costs. Cloud providers provision infrastructure based on the peak needs of workloads. This guarantees that maximums are met but can create a lot of provisioning waste—the very waste that Capacity Optimizer identifies and returns to you in the form of optimized, available resources to run more jobs.
Whatever your cloud platform, Capacity Optimizer uses autonomous optimization to intelligently augment autoscaling and ensure that all nodes are fully utilized before additional nodes are created. The net effect is that horizontal scaling is optimized and waste is eliminated.
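The scaling logic described above can be sketched as a gate on the scale-out decision: a new node is requested only when every existing node is already busy, so available capacity is consumed before the cluster grows. This is an assumed, simplified model of "augmenting autoscaling," not Pepperdata's implementation; the function name and threshold are hypothetical.

```python
def should_scale_out(utilizations, threshold=0.80):
    """Gate a scale-out request on existing capacity being used up.

    utilizations: per-node CPU usage fractions (0..1).
    Returns True only when every node is at or above the threshold,
    i.e. there is no remaining headroom to absorb more work.
    """
    return all(u >= threshold for u in utilizations)

print(should_scale_out([0.95, 0.90, 0.88]))  # → True: all nodes saturated, add a node
print(should_scale_out([0.95, 0.45, 0.88]))  # → False: one node still has headroom
```

The design intent is that horizontal scaling still happens when genuinely needed, but only after existing nodes are fully utilized, which is what eliminates the provisioning waste described earlier.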
Capacity Optimizer complements traditional EMR autoscaling by reducing resource waste on your cluster before autoscaling adds nodes. On top of Amazon EMR, Capacity Optimizer can reduce the number of cores by up to 63%, active nodes by up to 67%, and CPU idle time by up to 30%. Capacity Optimizer is part of the Pepperdata product suite, available for free on AWS Marketplace as Pepperdata for EMR.
Pepperdata products provide complete visibility and automation for your big data environment. Get the observability, automated tuning, recommendations, and alerting you need to efficiently and autonomously optimize big data environments at scale.