Do your production jobs sometimes finish on time, sometimes not?
Do ad hoc jobs run rampant on your cluster, slowing down critical work?
Do you have to run separate clusters just to keep production jobs safe?
Pepperdata changes all that.
Pepperdata brings an unprecedented level of control to Hadoop: it ensures that your critical production jobs complete safely and on time, while making any unused capacity available to your users for their other jobs.
Get up and running in just an hour, on any size cluster with any standard Hadoop distribution.
Hadoop today (whether Hadoop 1 or Hadoop 2 with YARN) does not continuously monitor and control each job's usage of key resources like CPU, memory, network, local disk, and HDFS. The result is that a single poorly written job can dominate the cluster, slowing down critical production jobs and causing missed SLAs.
Pepperdata dynamically controls the resources used by every job at every moment, automatically ensuring that your most important jobs run safely and reliably, no matter what else is running on your cluster.
With Pepperdata, you can increase throughput and reduce total cost of ownership while keeping production jobs safe. Improved cluster performance means savings on hardware, simpler operations, happier users, and less time spent debugging problems.
Pepperdata ensures that critical production jobs get the resources they need, so you can fully use your cluster's hardware without worrying about whether poorly written jobs affect your production SLAs.
Pepperdata tracks the real-time use of CPU, memory, network, local disk, and HDFS by every user, job, task, and process. You can see and analyze cluster resource usage at both aggregate and granular levels, so you know which resources are constrained at any moment, and who's using them.
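As a loose illustration of that kind of per-process accounting (a sketch, not Pepperdata's actual implementation), periodic resource samples can be rolled up by user and job; the `ResourceSample` schema, field names, and sample data below are hypothetical:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ResourceSample:
    """One point-in-time measurement for a single task process (hypothetical schema)."""
    user: str
    job: str
    cpu_pct: float   # share of one core, 0-100
    mem_mb: float    # resident memory in MB

def aggregate_by_job(samples):
    """Roll per-process samples up to per-job totals, keyed by (user, job)."""
    totals = defaultdict(lambda: {"cpu_pct": 0.0, "mem_mb": 0.0})
    for s in samples:
        key = (s.user, s.job)
        totals[key]["cpu_pct"] += s.cpu_pct
        totals[key]["mem_mb"] += s.mem_mb
    return dict(totals)

samples = [
    ResourceSample("alice", "etl-nightly", 85.0, 2048.0),
    ResourceSample("alice", "etl-nightly", 40.0, 1024.0),
    ResourceSample("bob", "adhoc-query", 95.0, 4096.0),
]
usage = aggregate_by_job(samples)
# usage[("alice", "etl-nightly")] -> {"cpu_pct": 125.0, "mem_mb": 3072.0}
```

The same roll-up works at any granularity: group by user alone for a per-user view, or by (user, job, task) for a finer one.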
Then close the loop by setting policies that give your most critical jobs what they need to complete on time, every time.
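A toy sketch of what such a closed loop might look like: when the cluster is under contention and a high-priority job is running, lower-priority jobs are selected for throttling. The job names, priority labels, and threshold here are hypothetical, not Pepperdata's actual policy model:

```python
def throttle_plan(jobs, cluster_cpu_pct, threshold=90.0):
    """Return the set of job names to throttle.

    Policy (illustrative only): if total cluster CPU use exceeds the
    threshold and at least one high-priority job is running, slow down
    the low-priority jobs so the critical ones keep their resources.
    """
    has_critical = any(j["priority"] == "high" for j in jobs)
    if cluster_cpu_pct <= threshold or not has_critical:
        return set()  # no contention, or nothing critical to protect
    return {j["name"] for j in jobs if j["priority"] == "low"}

jobs = [
    {"name": "etl-nightly", "priority": "high"},
    {"name": "adhoc-query", "priority": "low"},
]
print(throttle_plan(jobs, cluster_cpu_pct=97.0))  # {'adhoc-query'}
print(throttle_plan(jobs, cluster_cpu_pct=50.0))  # set()
```

The key design point is that the decision is re-evaluated continuously against live measurements, rather than fixed once at scheduling time.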