Continuous Big Data Cluster Tuning

Capacity Optimizer automatically tunes and optimizes cluster resources

Automatically Improves Big Data Cluster Performance without Manual Tuning

Continuously tune and optimize your big data cluster resources. Recapture wasted capacity so you can run more applications and get the most out of your infrastructure investment.

Improve Big Data Cluster Throughput By Up to 50%

By monitoring the entire infrastructure in real time and leveraging active resource management, Capacity Optimizer identifies where more work can be done, and adds tasks to servers with available resources.

On a typical cluster, Capacity Optimizer makes thousands of decisions per second on how to best optimize your big data clusters, increasing typical enterprise throughput by up to 50 percent. Even the most experienced operator can’t make manual configuration changes with the precision and speed of Capacity Optimizer.
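As a rough illustration of adding tasks to servers with available resources, the sketch below places each waiting task on whichever server currently has the most measured free memory. It is a simplified, assumed model rather than Pepperdata's actual algorithm; the function name `place_tasks` and its inputs are hypothetical.

```python
def place_tasks(free_mb, task_mb_list):
    """Greedy placement sketch (hypothetical helper, not a Pepperdata API).

    free_mb: dict mapping server name -> measured free memory in MB.
    task_mb_list: memory each waiting task needs, in MB.
    Returns a dict mapping task index -> chosen server. Tasks that fit
    on no server are left out and would stay queued.
    """
    free = dict(free_mb)  # copy so the caller's measurements are untouched
    placement = {}
    if not free:
        return placement
    for index, need in enumerate(task_mb_list):
        # Pick the server with the most free memory right now.
        server = max(free, key=free.get)
        if free[server] >= need:
            placement[index] = server
            free[server] -= need
    return placement


if __name__ == "__main__":
    print(place_tasks({"s1": 4000, "s2": 1000}, [2000, 1500, 1000]))
```

A production system would weigh CPU and I/O alongside memory and would refresh its measurements continuously; this sketch uses a single resource and a single snapshot to keep the idea visible.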


Managed Autoscaling Reduces Cloud Costs

Autoscaling provides the elasticity customers need for their big data workloads, but it can lead to runaway costs. Capacity Optimizer intelligently augments autoscaling to ensure all nodes are fully utilized before additional nodes are created, eliminating waste and reducing costs.

Cloud providers provision infrastructure based on the peak needs of workloads. This guarantees the maximums are met, but there’s a lot of waste inherent in the current method of provisioning. Capacity Optimizer makes thousands of decisions per second, analyzing the resource usage of each node in real time to optimize the utilization of CPU, memory and I/O resources on big data clusters. The net effect is that horizontal scaling is optimized and waste is eliminated.
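The idea of filling existing nodes before creating new ones can be sketched as a simple gate in front of a scale-out request. This is a minimal illustration under stated assumptions, not the product's implementation: the `Node` type, the `should_add_node` function, and the 0.85 utilization threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    cpu_used: float  # measured CPU utilization, 0.0 to 1.0
    mem_used: float  # measured memory utilization, 0.0 to 1.0


def has_headroom(node: Node, threshold: float = 0.85) -> bool:
    """A node has headroom only if both CPU and memory are below the threshold."""
    return node.cpu_used < threshold and node.mem_used < threshold


def should_add_node(nodes: List[Node], pending_tasks: int,
                    threshold: float = 0.85) -> bool:
    """Allow scale-out only when work is waiting and no node has headroom."""
    if pending_tasks == 0:
        return False
    return not any(has_headroom(n, threshold) for n in nodes)


if __name__ == "__main__":
    cluster = [Node("a", 0.90, 0.90), Node("b", 0.50, 0.40)]
    print(should_add_node(cluster, pending_tasks=3))
```

The gate uses measured utilization rather than reserved capacity, which is the distinction the paragraph above draws: a node can be fully reserved yet still have usable headroom.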

Try Pepperdata Capacity Optimizer With Managed Autoscaling

Pepperdata provides automated deployment options that can be added to your Amazon EMR, Google Dataproc, and Qubole environments. Try Capacity Optimizer with managed cloud autoscaling. In addition to automatically tuning your cloud deployment for optimal performance, Pepperdata allows you to:

  • Reduce troubleshooting time by 90% by leveraging targeted performance insights

  • Tune application resources for peak efficiency with prescriptive recommendations

  • Automatically detect and alert on bottlenecks that impact SLAs


Doesn’t YARN Scheduler Manage Resources?

The YARN (“Yet Another Resource Negotiator”) scheduler took over the resource management responsibilities that were originally built into MapReduce, coordinating resource requests and reservations to make allocations. Because it allocates on conservative assumptions about memory usage rather than on measured usage, YARN schedules less work than the hardware can actually support. In addition, YARN does not monitor containers once they start running or adjust in real time based on actual usage.

Pepperdata Capacity Optimizer overcomes the YARN scheduler’s limitations by monitoring actual per-task hardware usage in real time. It dynamically makes adjustments at the process level to eliminate inefficiencies and bottlenecks and to maximize resource utilization.
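The gap between what a scheduler reserves and what tasks actually use can be made concrete with a little arithmetic. The sketch below is an assumed, simplified model: the field names `reserved_mb` and `used_mb`, the function names, and the 10% safety margin are hypothetical, and it does not call any YARN or Pepperdata API.

```python
def reclaimable_mb(containers):
    """Total memory that is reserved but not currently used, in MB."""
    return sum(max(c["reserved_mb"] - c["used_mb"], 0) for c in containers)


def extra_containers(containers, container_size_mb, safety_margin=0.1):
    """How many more containers of a given size the idle memory could hold.

    safety_margin holds back a fraction of the idle memory so that a
    running task whose usage grows still has room.
    """
    usable = reclaimable_mb(containers) * (1 - safety_margin)
    return int(usable // container_size_mb)


if __name__ == "__main__":
    running = [
        {"reserved_mb": 4096, "used_mb": 1024},
        {"reserved_mb": 4096, "used_mb": 3072},
    ]
    print(reclaimable_mb(running), extra_containers(running, 2048))
```

Because usage changes from moment to moment, a calculation like this only makes sense when it is repeated against fresh measurements, which is why the text stresses real-time monitoring.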

Big Data Observability and Continuous Tuning at Scale


IT Operators


  • Ensure big data cluster stability and efficiency.
  • Avoid overspending on hardware.
  • Reduce time spent on capacity planning.
  • Run more jobs on existing infrastructure.
Developers


  • Run more jobs faster.
  • Access additional cluster capacity.
  • Spend less time in backlog queues.

Enterprise Organizations

  • Report on capacity trends
  • Get accurate chargeback reporting
  • Increase productivity

Achieve Big Data Success

Pepperdata products provide a 360-degree view of your platform and applications with continuous tuning, recommendations, and alerting.