TechEx North America 2022’s AI & Big Data Expo, America’s leading enterprise technology exhibition, concluded with insights across the ML, cloud infrastructure, and big data analytics industries. With expert panel discussions, live demos, and high-level presentations from more than 250 speakers, the convention pointed to five key takeaways: some innovations, and some gaps in how big data is progressing.

1. Machine learning increasingly used in big data infrastructure optimization

For your modern big data stack to be operational with maximum efficiency, several facets need to be measured: applications, workloads, users, data, resources, and containers. But how optimized can your infrastructure be with constant manual tuning from your enterprise’s IT team?

If your infrastructure is in the cloud (which is likely, with 94% of enterprises using some form of cloud service), then you need to work out how to optimize costs, allocate and utilize your resources, and continuously tune your applications. With all of this happening in parallel, your ITOps team’s chance to prioritize innovation and scaling keeps getting put off. This is why ML has become critical to analyzing big data: for your enterprise to understand its big data insights and build on them, it needs automation to find those insights.

2. Observability is necessary but not enough

Big data is complex. Insights are there within your applications, containers, and infrastructure, but they may be missed if your DevOps team relies solely on monitoring. Traditional monitoring collects metrics and can tell you whether a system is working or not. However, it doesn’t explain why an issue occurred: it is purely reactive and never gets to the root of the problem. Autonomous observability is therefore needed to continuously optimize resources and troubleshoot as soon as a problem arises. Without constant observability, costs can skyrocket, insights can be lost, and your throughput will never reach its full potential.

3. Recommendations and insights are leading to analysis paralysis

Say you have solid observability tools and your team compiles insights from your big data infrastructure. How are you acting on them? Resource waste is common, and even once you have found it, manually optimizing those resources is not productive. Your ITOps team might also have observed peak times when more compute is needed, forcing you to allocate more. To scale your infrastructure properly, it is more beneficial to fully utilize the resources you already have than to allocate new ones. Rather than scheduling certain tasks to run at certain times, you can put the compute you have already provisioned to work, which is what enables optimization. To maximize efficiency, automation is needed to minimize overprovisioning by your team.
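To make the difference between allocated and utilized resources concrete, here is a minimal Python sketch. It is not taken from any Pepperdata product: the `Workload` fields, the sample numbers, and the 50% utilization threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    allocated_cores: float   # cores reserved for the workload (hypothetical figure)
    peak_used_cores: float   # highest usage observed for the workload (hypothetical figure)


def wasted_cores(w: Workload) -> float:
    """Cores that were reserved but never used, floored at zero."""
    return max(w.allocated_cores - w.peak_used_cores, 0.0)


def utilization(w: Workload) -> float:
    """Fraction of the allocation actually used at peak."""
    if w.allocated_cores <= 0:
        return 0.0
    return w.peak_used_cores / w.allocated_cores


def overprovisioned(workloads, threshold=0.5):
    """Workloads whose peak utilization is below the threshold, most wasteful first."""
    flagged = [w for w in workloads if utilization(w) < threshold]
    return sorted(flagged, key=wasted_cores, reverse=True)


# Illustrative data only; a real report would read these values from cluster metrics.
workloads = [
    Workload("etl-nightly", allocated_cores=16, peak_used_cores=4),
    Workload("reporting", allocated_cores=8, peak_used_cores=7),
    Workload("ad-hoc-queries", allocated_cores=32, peak_used_cores=8),
]

for w in overprovisioned(workloads):
    print(f"{w.name}: {utilization(w):.0%} used, {wasted_cores(w):.0f} cores idle")
```

A report like this surfaces the waste, but someone still has to act on it for every workload, which is where the paralysis sets in and why the adjustment itself needs to be automated.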

4. Wrong instance sizing

With the growth of big data and the increasing amount of insights enterprises now have, processes need to be broken down into modules for proper scaling. Enterprise-level applications are going to rely on a big data infrastructure, so having loosely coupled modules allows for cleaner organization with each module having a direct purpose. Deployments can be quicker, troubleshooting easier, and throughput increased with the flexibility amongst microservices.

The advantages of microservices may appear clear and proven, but horizontal observability is key to monitoring your applications and developing actionable insights into how they are running. Joydipta Chakraborty, VP Data Engineering, JPMC, talked about how microservices are a good design pattern for decoupling monolithic big data workloads for performance and efficiency. With a resource like Pepperdata Application Spotlight, you get a comprehensive view of your entire application cluster and data in real time. Your ITOps team can then deliver the best big data application performance by using the automated recommendations provided by the platform.


5. Observation, optimization, remediation: Autonomous FinOps

Automation is needed for enterprise applications and infrastructures to be properly observed, optimized, and ultimately remediated. With Pepperdata’s autonomous FinOps platform, an organization’s developers and IT teams are free to innovate and scale rather than staying burdened with troubleshooting and fine-tuning. With its Apache Spark on Kubernetes solutions, your containers can stay continuously optimized, your cloud costs can be significantly reduced, and your instances can be rightsized.

Join us at KubeCon + CloudNativeCon North America 2022

Your Kubernetes containers provide ease of deployment and resilience to your infrastructure, but how are you dealing with capacity complications, layer monitoring, and unexpected cloud costs? If you’re looking to see how to autonomously optimize your Kubernetes containers, come talk to the Pepperdata team at KubeCon 2022! Be sure to stop by Pepperdata’s KubeCon Booth #S77 to discuss Autonomous FinOps with our cloud native experts. 

