Many companies today rely on Hadoop for their big data processing needs. Even so, a platform this sophisticated still presents pitfalls and challenges that DevOps teams grapple with almost every day.
These challenges usually boil down to three main issues. Read on to find out what they are and how Pepperdata takes care of them.
Challenge #1: Difficulty Finding the Root Cause of Problems
Picture this: a production job is running late, and users are complaining. So you check the monitoring system, asking yourself, “Is a node down? Is the cluster busy?”
But you can’t tell what’s happening. So you ask around for anyone who might have done anything differently. No one has anything to offer because, most probably, everyone else is as clueless as you are. You can’t even replicate the problem, much less find what started it. And so you decide to let Hadoop operations run while you watch and wait, hoping and praying that the next time the problem reappears, you’re there to witness it.
Unsurprisingly, the problem resurfaces. Hurrah! Now you can jump in to do some node debugging. Only… you still don’t see where the problem is coming from. You’re not even sure which resources to tweak. Still, you make adjustments here and there, then you sit back and play the waiting game again. Eventually, you decide you’ll need to buy extra hardware just to buy yourself some peace. But really, in the back of your mind, there’s a nagging worry that this will happen again.
Did that feel like a day in your life? It probably did.
But with Pepperdata solutions, you get visibility from all corners. Every five seconds, PepAgents collect more than 350 real-time operational metrics from applications and infrastructure resources, including CPU, RAM, disk I/O, and network usage metrics on every job, task, user, host, workflow, and queue.
You don’t need to sift through this data yourself; Pepperdata derives insights from the metrics to help solve the problem at hand. This gives you 360-degree visibility into application performance within the context of your big data infrastructure, whether on-premises or in cloud or hybrid environments.
Challenge #2: Inefficient Cluster Utilization
Let’s face it: production clusters struggle to use their full capacity given the constraints of the YARN scheduler. No matter how well tuned your Hadoop deployment is, these clusters are sized for peak SLA load, and because Hadoop’s allocations are predefined and static, a lot of computing capacity gets wasted.
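To put rough numbers on that waste, here is a minimal Python sketch that compares what each container reserved up front with what it used at its peak. The container names and memory figures are made up for illustration, and the calculation is a simplification, not how YARN or Pepperdata measures utilization.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """One statically sized container (hypothetical example values)."""
    name: str
    allocated_gb: float  # memory reserved for the container up front
    peak_used_gb: float  # most memory the task used while running


def wasted_fraction(containers):
    """Return the share of allocated memory that was never used."""
    allocated = sum(c.allocated_gb for c in containers)
    unused = sum(c.allocated_gb - c.peak_used_gb for c in containers)
    return unused / allocated if allocated else 0.0


# Illustrative numbers only, not measurements from a real cluster.
containers = [
    Container("etl-map", 8.0, 3.0),
    Container("etl-reduce", 8.0, 5.0),
    Container("report", 16.0, 8.0),
]

print(f"Allocated but unused: {wasted_fraction(containers):.0%}")
```

With these sample values, half of the reserved memory sits idle even though the scheduler treats every container as fully occupied.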
And what about ad-hoc jobs? They sit in their own queue on the production cluster until someone runs a rogue job that uses more resources than the queue can handle and causes other jobs to run late.
The more jobs and workloads you have, the more clusters you set up to isolate them. More clusters = more wasted capacity.
Not with Pepperdata Capacity Optimizer.