This week, we were thrilled to announce our latest feature: comprehensive chargeback reporting. Chargeback reporting allows organizations to view the costs of using shared cluster resources — like memory or CPU — over a specified period of time.
Our new feature is unique in the Hadoop marketplace: we aren’t limited to measuring data storage at rest (i.e., how many terabytes individual departments are storing on disk); instead, we report on the total memory and CPU used during any desired window of time, giving an unmatched, granular view into consumption and costs.
How to get it
Chargeback reporting is now generally available to all Pepperdata customers at no additional charge, with no extra installation or configuration required. Simply go to https://dashboard.pepperdata.com from any internet-connected device, log in with your credentials, and open the Reports > Chargeback dashboard page.
New customers will be able to view chargeback reporting immediately once their dashboard is configured and sufficient time — typically a few minutes — has passed after jobs and applications have been launched on the cluster.
Pepperdata’s chargeback reporting measures usage in CPU core-hours and memory byte-hours: the amount of CPU or memory in use multiplied by the length of time it was in use. This is analogous to how electricity bills are calculated, where the unit of measure is the watt-hour (100 watt-hours could mean 20 watts consumed for 5 hours, or 1000 watts consumed for 1/10th of an hour, that is, 6 minutes).
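To make the core-hour and byte-hour arithmetic concrete, here is a minimal Python sketch. It is an illustration only, not Pepperdata's implementation: the `UsageSample` structure, the function names, and the per-unit prices are all hypothetical, and the source does not say how usage is converted into a dollar cost.

```python
from dataclasses import dataclass


@dataclass
class UsageSample:
    """Average usage over one interval (hypothetical structure, for illustration)."""
    cpu_cores: float     # average number of CPU cores in use
    memory_bytes: float  # average number of bytes of memory in use
    hours: float         # length of the interval, in hours


def core_hours(samples):
    """Sum of (cores in use) x (hours in use) across all intervals."""
    return sum(s.cpu_cores * s.hours for s in samples)


def byte_hours(samples):
    """Sum of (bytes in use) x (hours in use) across all intervals."""
    return sum(s.memory_bytes * s.hours for s in samples)


def estimated_cost(samples, price_per_core_hour, price_per_gib_hour):
    """Apply assumed unit prices; real chargeback rates are set by each organization."""
    gib = 1024 ** 3
    cpu_cost = core_hours(samples) * price_per_core_hour
    memory_cost = (byte_hours(samples) / gib) * price_per_gib_hour
    return cpu_cost + memory_cost


GIB = 1024 ** 3
samples = [
    UsageSample(cpu_cores=4, memory_bytes=8 * GIB, hours=0.5),  # 2 core-hours, 4 GiB-hours
    UsageSample(cpu_cores=2, memory_bytes=4 * GIB, hours=2.0),  # 4 core-hours, 8 GiB-hours
]

total_core_hours = core_hours(samples)
total_gib_hours = byte_hours(samples) / GIB
```

As with the watt-hour example, the same total can come from very different usage patterns: a short burst on many cores and a long run on a few cores can both add up to the same number of core-hours.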
Note that Pepperdata software for Hadoop supports all major Hadoop distributions, including Cloudera, Hortonworks, and MapR.
What you see
Our chargeback reporting uniquely provides detailed visibility into CPU and memory usage over a time interval ranging from minutes to weeks. This allows you to better assess trends; group cluster costs by user, job, or queue to pinpoint top resource consumers; and further filter the data by variables like job number for more detail. Individual columns in the cost table can be sorted in ascending or descending order, and individual cells are color-coded to highlight the heaviest usage and costs.
You can quickly identify which users or jobs are consuming the most memory and CPU, and what this is costing on a per-user, per-job basis. And all of this information can easily be exported to CSV with one click, allowing further analysis in Excel or the program of your choice.
Chargeback reports in the Pepperdata dashboard show costs and consumption of shared resources.
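Once the data is exported to CSV, the same kind of grouping can be reproduced outside the dashboard. The Python sketch below assumes a hypothetical export layout with `user`, `queue`, `job`, and `cost` columns; the actual column names in a Pepperdata export may differ, and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def total_cost_by(rows, key, cost_column="cost"):
    """Sum the cost column per distinct value of `key`, highest total first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[cost_column])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows standing in for an exported report (column names are assumptions).
SAMPLE_CSV = """user,queue,job,cost
alice,etl,job_001,12.50
bob,adhoc,job_002,3.25
alice,etl,job_003,7.00
carol,adhoc,job_004,9.75
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

by_user = total_cost_by(rows, "user")    # top consumers by user
by_queue = total_cost_by(rows, "queue")  # top consumers by queue
```

To run this against a real export, replace `io.StringIO(SAMPLE_CSV)` with an open file handle and adjust the column names to match the file's header row.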
In a nutshell, Pepperdata’s chargeback reporting lets organizations with multi-tenant or multi-workload Hadoop deployments easily view consumption by user, job, or department to quickly learn which are the most costly and resource-intensive, calculate ROI, and make more strategic decisions around IT investment.
Pepperdata is the leader in real-time cluster optimization. Our software brings predictability to multi-tenant, production Hadoop environments for the first time. While Hadoop and YARN focus on getting jobs running on the cluster, Pepperdata monitors and controls hardware usage in real time so that customers can meet SLAs, increase throughput, and improve visibility.
Many of the world’s largest companies trust Pepperdata software to make thousands of decisions a second to ensure that their applications get the performance they need. With Pepperdata, there’s no more manual tweaking, tuning or over-provisioning in a futile attempt to guarantee performance. The software installs in minutes, runs on your existing clusters, and is comp