At Pepperdata we are all about performance. Our team has over 190 years of combined experience with distributed systems engineering and performance, and we are passionate about making things run faster (and frankly, better).

To help customers achieve the highest-performing distributed systems possible, we had to build our own high-performance infrastructure to support this vision. For example, the Pepperdata Adaptive Performance Core measures nearly 300 metrics per task, and one of our large customers has a 4,000-node cluster, with each node running 40 tasks. If we sample each metric at a five-second interval, Pepperdata generates over 400 million data points each minute, almost 600 billion per day. That’s a lot of data, and that’s only from a single cluster!
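As a rough sanity check on the scale described above, the arithmetic can be sketched in a few lines of Python. This is a back-of-envelope ceiling and not a description of how Pepperdata counts data points: it assumes exactly 300 metrics per task and every node running all 40 tasks around the clock, and the function names are hypothetical. Under those assumptions the result comes out above the quoted figures, which would be consistent with fewer metrics or tasks being active on average (an assumption, not something the text states).

```python
# Back-of-envelope ceiling on data-point volume for one cluster.
# Inputs are the round figures quoted in the text; the assumption that
# every task slot is busy all day is ours, so treat results as an upper bound.

METRICS_PER_TASK = 300    # "nearly 300" in the text, rounded up
NODES = 4_000
TASKS_PER_NODE = 40
SAMPLE_INTERVAL_S = 5     # assumed to divide 60 evenly


def points_per_sample(metrics=METRICS_PER_TASK, nodes=NODES, tasks=TASKS_PER_NODE):
    """Data points produced by one sampling pass over the whole cluster."""
    return metrics * nodes * tasks


def points_per_minute(interval_s=SAMPLE_INTERVAL_S, **cluster):
    """Data points per minute at the given sampling interval."""
    return points_per_sample(**cluster) * (60 // interval_s)


def points_per_day(**kwargs):
    """Data points per 24 hours, assuming the load never drops."""
    return points_per_minute(**kwargs) * 60 * 24


if __name__ == "__main__":
    print(f"{points_per_sample():,} points per sample")
    print(f"{points_per_minute():,} points per minute")
    print(f"{points_per_day():,} points per day")
```

Passing smaller values, for example `points_per_minute(tasks=30)`, shows how quickly the totals move with average task occupancy.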

With this high volume of data, our team had to overcome significant performance limitations in today’s technology. To tackle this, we chose OpenTSDB as the foundation for the data ingestion and processing pipeline behind the Pepperdata hosted dashboard. With some ingenuity and extensions to the open source technology, Pepperdata has achieved unprecedented performance levels, as seen in the chart below. Based on publicly available performance numbers, the Pepperdata dashboard can handle nearly 6X more data points per second, per server, than any comparable OpenTSDB benchmark. We can also render plots of a full day’s worth of data in a single second. Our work with Global 100 customers on large, mission-critical Hadoop deployments is what drove us to provide this level of visibility into distributed computing environments.

Pepperdata software is the only solution that can anticipate and avert cluster performance issues such as swapping, busy disks (I/O), and network problems. Our engineering achievements in data volume and speed, as well as our product innovation, are driving big changes for our customers, but sometimes they need direct support from one of our performance experts as well.

That’s why customers get so excited about our unique support model, which leverages our hosted dashboard and allows us to work directly with customers to solve their most challenging performance problems. The Pepperdata engineering and support teams live and breathe performance every day, so it is extremely rare for us to run across a performance problem that we haven’t seen before. While we do get the occasional feature question or bug report, we get more excited when our customers ask us to help them solve tough (seemingly undiagnosable) problems in their production environments. The Pepperdata hosted dashboard is a great foundation for collaboration to help customers solve issues such as:

  • My users are complaining that their job ran slower than expected last night, but we can’t pinpoint why.
  • Our production job is running slower, but the data set didn’t change, so what could the cause be?
  • We noticed that during a certain timeframe our cluster performed very poorly.

These types of problems are commonplace across environments and clusters, and our team typically knows exactly what the underlying cause is. New customers in particular may not know where to look to solve the problem, and typically benefit from a little help from our performance experts. Often this collaboration is as simple as a customer sending a link to their dashboard and asking us to take a quick look. We can do a quick but powerful analysis of the data and send back a slightly modified query highlighting the source of the problem, with explanations and often recommended fixes.

In our effort to help customers achieve the best performance possible, we also proactively monitor for performance issues across all of our customer instances so that we can quickly spot and notify customers about common problems.


At Pepperdata we are constantly improving and upgrading our dashboard to help customers more efficiently manage their Hadoop environments. And we frequently treat our customers to new reports, metrics and charts when they log in. For us, support means so much more than troubleshooting technical issues or bugs; we are 100% committed to helping our customers achieve the best performance possible and pride ourselves on being the distributed computing performance experts.