As a capacity manager, you’re responsible for managing costs and growth as well as driving business by helping developers succeed with their applications. Both the applications and the platform must be high-performing, and developers must understand the aspects of the platform that affect their applications. It’s one thing to get a platform up and running and allow users to run applications; it’s another thing to make truly effective decisions and changes.


Capacity managers are in a unique position: they are responsible to the business for keeping users satisfied and for making sure the business benefits from optimal performance and capacity. With your APM solution, you should be able to answer the following questions:

  • Can your platform go faster?
  • Are users constrained in any way?
  • Do your business priorities align with application performance?
  • Are you getting all the correct data to answer performance questions?
  • Is your solution helping you to maximize your big data resource investment?
  • Are you able to receive alerts about potential problems before they happen?
  • Can you provide self-service access so that users can check their own application status?
  • Are there any drives that are about to die?
  • Who’s blowing up the cluster?
  • How can I run more applications?
  • Why does YARN say it’s full when I know I have capacity?


To answer these questions and leverage your existing investment, you need an APM solution that:

  • Improves throughput, uptime, efficiency and performance in a multi-tenant environment.
  • Provides comprehensive reporting for accurate capacity planning.
  • Recaptures wasted resources.
  • Identifies the users and applications that are putting the biggest demands on your cluster.

A comprehensive APM solution collects all relevant data and metrics on both application workloads (user behaviors, errors, response times, API calls, etc.) and the environment (resource utilization, data sources, etc.) to obtain accurate and useful insights. By measuring and tracking user transactions, you can understand how applications behave and whether SLAs are being met. Environment measurements help you identify patterns in resource usage and capacity demands. The goal of measuring and analyzing both is to deliver an excellent user experience and get the most out of your infrastructure.

Pepperdata: Massive Scale APM

Pepperdata provides a complete view of your entire cluster so that you can uncover performance problems, identify patterns that impact the entire application environment, and make intelligent resource decisions. Pepperdata continuously collects extensive data on hundreds of real-time metrics from all of your applications and infrastructure resources, including CPU, RAM, disk I/O, and network usage for every job, task, user, host, workflow, and queue. This data is not available with any other tool or solution. Pepperdata helps you understand and improve platform performance, reduce mean time to problem resolution, and increase capacity utilization by 30-50% without adding new hardware. In addition to surfacing performance bottlenecks, Pepperdata provides automatic tuning for recurring applications, delivers app-specific recommendations, and allows you to set up alerts on specific behaviors and outcomes to avoid the risk of failure.

360° Platform View

You need a holistic source of operational and performance truth across your clusters. Pepperdata allows you to access real-time and historical data about the cluster, including system demand, abusive users, and wasteful applications, queues, and container sizes. Drill down or zoom out to analyze any application and understand its performance in the context of the entire multi-tenant cluster. With complete instrumentation of your Big Data ecosystem, Pepperdata allows you to monitor any process (Kafka, Impala, HBase, Hive, MapReduce, Tez, Spark, IBM BigSQL, LLAP) to help you quantify the impact that applications have on the cluster as well as the impact that the cluster has on applications.

Platform Tuning

Pepperdata tracks the actual hardware resource usage of every node on the cluster in real time. It helps improve the performance and efficiency of your cluster by maximizing existing resources, auto-scaling cluster resources for peak efficiency, and rightsizing resource allocation based on real-time capacity. This allows more tasks to run on nodes that currently have free resources, increasing capacity without adding hardware.
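To make the idea concrete, here is a minimal Python sketch of the gap between what a scheduler has allocated and what a node is actually using. It is an illustration only, not Pepperdata's implementation or API: the `NodeSample` fields, the `extra_containers` helper, the 10% safety margin, and the sample numbers are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One hypothetical per-node measurement (all field names are assumptions)."""
    host: str
    total_mem_gb: float      # physical memory on the node
    allocated_mem_gb: float  # memory the scheduler has reserved for containers
    used_mem_gb: float       # memory actually in use when the sample was taken


def extra_containers(node: NodeSample, container_gb: float,
                     safety_margin: float = 0.1) -> int:
    """Estimate how many more containers of a given size could fit on a node,
    judged by measured usage instead of scheduler allocation."""
    reserve = node.total_mem_gb * safety_margin  # headroom kept free for spikes
    free_by_usage = node.total_mem_gb - node.used_mem_gb - reserve
    return max(0, int(free_by_usage // container_gb))


# Made-up samples: node-1 is fully allocated but only about half used.
nodes = [
    NodeSample("node-1", total_mem_gb=128.0, allocated_mem_gb=128.0, used_mem_gb=70.0),
    NodeSample("node-2", total_mem_gb=128.0, allocated_mem_gb=120.0, used_mem_gb=115.0),
]

for node in nodes:
    free_by_allocation = node.total_mem_gb - node.allocated_mem_gb
    print(node.host, free_by_allocation, extra_containers(node, container_gb=8.0))
```

In this sketch, node-1 looks full to the scheduler (0 GB unallocated), yet measured usage suggests room for several more 8 GB containers. That is one way a cluster can report that it is full while capacity is still available.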

Platform Recommendations

Pepperdata generates targeted recommendations to improve application performance, highlights applications that require attention, automatically identifies bottlenecks, and raises issues around duration, failure conditions, and resource usage. In addition, Pepperdata helps you accurately forecast resource needs by identifying growth trends across groups and users.

Platform Alerting

Reduce time to problem resolution using comprehensive and detailed performance data. Drawing on its vast collection of real-time hardware metrics, Pepperdata allows you to create custom alerts and queries, then filter and compare results by different dimensions to quickly identify which resource or app is causing a problem and which user submitted it.
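As a rough sketch of what a threshold alert filtered by dimension might look like, consider the Python below. It does not reflect Pepperdata's alerting syntax; the `Metric` and `AlertRule` types, the `evaluate` function, the metric name `disk_io_mbps`, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Metric:
    """A hypothetical metric sample tagged with the dimensions to filter on."""
    user: str
    app: str
    queue: str
    name: str
    value: float


@dataclass(frozen=True)
class AlertRule:
    """Fire when a named metric exceeds a threshold, optionally within one queue."""
    metric: str
    threshold: float
    queue: Optional[str] = None  # None means the rule applies to every queue

    def matches(self, sample: Metric) -> bool:
        if sample.name != self.metric:
            return False
        if self.queue is not None and sample.queue != self.queue:
            return False
        return sample.value > self.threshold


def evaluate(rule: AlertRule, samples: Iterable[Metric]) -> List[str]:
    """Return one message per offending sample, naming the app and its submitter."""
    return [
        f"{s.name}={s.value} exceeds {rule.threshold}: app {s.app} submitted by {s.user}"
        for s in samples
        if rule.matches(s)
    ]


# Made-up samples for illustration.
samples = [
    Metric("alice", "etl-nightly", "prod", "disk_io_mbps", 950.0),
    Metric("bob", "adhoc-query", "dev", "disk_io_mbps", 990.0),
    Metric("carol", "report-gen", "prod", "disk_io_mbps", 120.0),
]
rule = AlertRule(metric="disk_io_mbps", threshold=800.0, queue="prod")
alerts = evaluate(rule, samples)
for message in alerts:
    print(message)
```

Because the rule is scoped to the `prod` queue, only the sample from `etl-nightly` triggers it, and the message identifies both the application and the user who submitted it.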

Insights through Reports

Pepperdata generates comprehensive planning reports containing trends and patterns based on real-time and historical data about the cluster so that you can plan capacity for predictable performance. Use these reports to accurately plan for growth, identify which users and applications are wasting resources, and attribute accurate costs to specific users and business units.
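The two calculations behind this kind of report can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Pepperdata produces its reports: the function names `linear_forecast` and `chargeback`, the straight-line growth model, the proportional cost split, and all figures are made up for the example.

```python
from typing import Dict, Sequence


def linear_forecast(history: Sequence[float], periods_ahead: int) -> float:
    """Fit a least-squares line to equally spaced observations and
    extrapolate it periods_ahead periods past the last observation."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + periods_ahead)


def chargeback(core_hours: Dict[str, float], total_cost: float) -> Dict[str, float]:
    """Split a total cost across business units in proportion to core-hours used."""
    total_hours = sum(core_hours.values())
    return {unit: total_cost * hours / total_hours for unit, hours in core_hours.items()}


# Hypothetical monthly storage usage in TB, oldest first.
usage_tb = [400.0, 420.0, 440.0, 460.0]
projected_tb = linear_forecast(usage_tb, periods_ahead=3)
print(projected_tb)

# Hypothetical core-hours per business unit and a monthly platform cost.
costs = chargeback({"analytics": 6000.0, "marketing": 3000.0, "research": 1000.0}, 50000.0)
print(costs)
```

A real planning report would likely account for seasonality and more than one resource type, but even a simple trend line and a proportional split show the kind of questions historical usage data can answer.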

The Bottom Line

Pepperdata Platform Spotlight provides you with a single source of truth so that you can quickly diagnose performance issues anywhere in the environment and make resource decisions based on user priorities and needs. In addition to a comprehensive view of your big data platform, Pepperdata offers proven experience on hundreds of clusters at Fortune 1000 companies. As a trusted advisor helping enterprises establish and follow best practices, Pepperdata can provide you with guidance on the best architecture using real-world experience derived from some of the world’s biggest clusters. Tap into our experience and expertise to achieve more uptime, better performance, improved capacity planning, faster case resolution, and proactive issue avoidance and prevention.

Contact us today to learn how we can help you.

More info:

  • Watch the webinar on this topic.
  • Register for our next webinar, Operations Manager Q and A – Do More with Your Big Data Platform (October 24, 11 am Pacific)

Request a trial to learn more about how Pepperdata can help you optimize application performance.
