There was an interesting question on Slashdot this week regarding capacity planning and performance management. The reader makes a good point: in the old days of mainframes and static data, capacity planning was pretty straightforward because you didn’t have multiple technologies, platforms, and data types to contend with. Back then, capacity planning was focused on infrastructure cost optimization. Today, planning is based more on performance (although cost optimization is still a nice benefit).

Organizations are now faced with a double-edged sword: there is immense competitive and monetary benefit to be had in unlocking the insights of Big Data.  However, doing so requires delving into a complex ecosystem of technologies that are notoriously difficult to navigate and employ. So how do organizations cope?

When it comes to Hadoop (the de facto platform for most Big Data projects), most organizations attempt to address their capacity challenges through manual tuning or by simply adding more hardware, setting up separate clusters for the mission-critical workloads that must complete by a deadline. This can be a short-term fix, because it does allow organizations to run multiple jobs across clusters in an attempt to guarantee that critical jobs complete on time. However, it’s also a blind fix in many ways, since Hadoop operators aren’t getting any more visibility into what is happening on their clusters, nor are they improving their control over which jobs and users get highest priority. And lastly, organizations are wasting valuable capacity because each cluster is under-utilized (although, without granular, second-by-second visibility, most users wouldn’t even know this).

One of the most interesting benefits of Pepperdata is not the real-time visibility that our dashboard provides but the automatic optimization that our software performs. With Pepperdata, operators and admins don’t need to monitor every aspect of the cluster and tune it manually. That’s a good thing, since this is something no human could possibly do in real time.

With Pepperdata, you set policies to prioritize specific jobs and applications (or user workloads), and the Pepperdata supervisor and the agents running on each node constantly monitor and control the actual use of (and demand for) each kind of hardware resource by each task in real time. This allows Pepperdata to identify “holes” in the cluster where a node could temporarily do more work, fill those holes with additional tasks, and all the while ensure the cluster continues to operate safely and reliably. The software makes those decisions in real time, often thousands of them every minute, and this can increase cluster performance by up to 50%.
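
To make the idea of filling “holes” a little more concrete, here is a toy Python sketch of priority-aware placement. It is not Pepperdata’s implementation or API. The `Node` and `Task` classes, the `fill_holes` function, and the 10% safety reserve are all assumptions invented for illustration, and a real system would work from live per-task measurements rather than static numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical view of one cluster node (illustrative only)."""
    name: str
    capacity: dict  # total per resource, e.g. {"cpu": 8, "mem_gb": 32}
    used: dict      # currently measured usage per resource
    tasks: list = field(default_factory=list)

    def headroom(self, reserve=0.1):
        # Spare capacity per resource after holding back a safety reserve.
        return {r: self.capacity[r] * (1 - reserve) - self.used[r]
                for r in self.capacity}


@dataclass
class Task:
    """Hypothetical pending task; priority comes from operator policy."""
    name: str
    priority: int   # higher number = more important
    demand: dict    # expected resource need, same keys as Node.capacity


def fill_holes(nodes, pending, reserve=0.1):
    """Place pending tasks into spare capacity, highest priority first.

    Returns a list of (task_name, node_name) placements. Tasks that do
    not fit anywhere are simply left unplaced.
    """
    placements = []
    for task in sorted(pending, key=lambda t: -t.priority):
        for node in nodes:
            room = node.headroom(reserve)
            if all(task.demand[r] <= room[r] for r in task.demand):
                # Account for the new task so later checks see the update.
                for r in task.demand:
                    node.used[r] += task.demand[r]
                node.tasks.append(task.name)
                placements.append((task.name, node.name))
                break
    return placements
```

In this sketch, a node that is busy on CPU but idle on memory still counts as full for a CPU-heavy task, which is why headroom is checked per resource rather than as a single utilization number. The reserve stands in for the "operate safely and reliably" requirement: holes are only filled up to a margin below the node's full capacity.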

Relying on human intervention is never going to be enough, but with Pepperdata, you are able to improve throughput on your existing infrastructure, gain deep, granular visibility into your cluster, and enforce SLAs on critical applications.