Full or partial cloud migration has been all the rage for a few years. The worldwide public cloud services market is predicted to grow 17% in 2020, to a total of $266.4 billion. According to Gartner, 35% of CIOs are decreasing their investment in infrastructure and data centers, while 33% are increasing their investment in cloud services or solutions. But there’s a catch. Gartner also predicts that:

“through 2020, 80% of organizations will overshoot their cloud IaaS budgets due to a lack of cost optimization approaches.”

When a big data cloud migration is still in its early stages, it all seems easy-going. The common assumption is that operating a data center in the cloud will always be cheaper than running dedicated on-premises servers. But when enterprise IT organizations receive their first few cloud bills, they are often shocked and puzzled. Compared with their legacy environment, they suddenly cannot tell what they are spending money on, or why.

Big data cloud invoices can add up to hundreds of thousands of dollars more than expected. When Bain & Company asked more than 350 IT decision-makers which aspects of their cloud deployment had been the most disappointing, the top complaint was that the cost of ownership had either remained the same or increased.

Why the lack of cost optimization approaches? Because IT operations are in a visibility crisis:

  • Complex multi-cloud environments are becoming commonplace, and billing details vary dramatically by provider.
  • Many I&O (infrastructure and operations) teams are organized around traditional data center principles rather than cloud IaaS, and they lack the organizational processes to manage costs in the cloud.
  • There are many options to address cloud expense management. As a result, I&O leaders may struggle to align these options with the organizational cloud strategy.

To bring costs under control, IT operations need full-stack visibility: enough to tune application performance, support SLAs, uncover infrastructure inefficiencies, and minimize MTTR (mean time to repair).

What makes this a daunting task, however, is the sheer size of modern big data clusters, which can run to thousands of nodes. Add to that the juggling act of making Spark applications run faster, stopping Hadoop clusters from blowing up, and dealing with any malfunctioning workload as quickly as possible.

Enter Pepperdata Capacity Optimizer.

With this Pepperdata solution, we take a unique approach to right-sizing by identifying wasted, excess capacity in big data cluster resources. Pepperdata is built for IT operations teams who need visibility to recapture wasted resources and deploy capacity to maximize current infrastructure. 

By monitoring cloud and on-premises infrastructure in real time, including hardware and applications, and combining machine learning with active resource management, Capacity Optimizer automatically recaptures wasted capacity from existing resources and adds tasks to those servers. To prevent the overprovisioning of your cloud deployments (whether they’re on Amazon EMR, Dataproc, or Qubole), Capacity Optimizer uses managed autoscaling to ensure all nodes are fully utilized before additional nodes are created, eliminating waste and reducing costs.
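
To make the "fill existing nodes before adding new ones" idea concrete, here is a minimal sketch of that decision in Python. It is an illustration only and not Pepperdata's implementation: the `Node` class, the `plan` function, the 10% headroom, and the first-fit placement are all assumptions chosen for the example. The key point is that placement is based on measured usage rather than on what was originally reserved.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of one cluster node (names are illustrative)."""
    name: str
    capacity_gb: float  # memory the node can offer to tasks
    used_gb: float      # memory tasks are actually using right now


def plan(nodes, pending_gb, new_node_gb, headroom=0.10):
    """Place pending tasks on existing nodes first; count new nodes only for the rest.

    Returns (placed, nodes_to_add), where `placed` maps each task index to a
    node name and `nodes_to_add` is how many extra nodes the leftover tasks need.
    """
    # Free capacity per node, from measured usage, keeping a safety headroom.
    free = {
        n.name: max(n.capacity_gb * (1 - headroom) - n.used_gb, 0.0)
        for n in nodes
    }

    placed, unplaced = {}, []
    for i, need in enumerate(pending_gb):
        # First-fit: take the first existing node with enough measured free space.
        target = next((name for name, f in free.items() if f >= need), None)
        if target is None:
            unplaced.append(need)
        else:
            free[target] -= need
            placed[i] = target

    # Only tasks that fit nowhere justify scaling out; pack them first-fit too.
    usable = new_node_gb * (1 - headroom)
    new_free = []
    for need in unplaced:
        if need > usable:
            raise ValueError(f"task needing {need} GB cannot fit on a new node")
        for j, f in enumerate(new_free):
            if f >= need:
                new_free[j] -= need
                break
        else:
            new_free.append(usable - need)

    return placed, len(new_free)


cluster = [Node("a", 64, 60), Node("b", 64, 20)]
placed, nodes_to_add = plan(cluster, pending_gb=[8, 16, 30], new_node_gb=64)
print(placed, nodes_to_add)
```

In this toy cluster, the two smaller tasks land on the underused node "b", and only the 30 GB task triggers a request for one additional node. A naive autoscaler that adds a node whenever reserved capacity runs out would typically scale out sooner.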

Interested? Read our whitepaper here to learn more about how you can meet this challenge and put your cloud budgets back in line.