June 29, 2021, 10:00 AM PT
Drive Cloud Performance on Amazon EMR with Managed Autoscaling
While autoscaling provides the elasticity that customers require for their big data workloads, it can also lead to runaway waste, excessive cost, and management complexity. Estimating the right number of cluster nodes for a workload is difficult; user-initiated cluster scaling requires manual intervention, and mistakes are often costly and disruptive. Learn about the operational challenges of maintaining optimal big data performance in the cloud, what milestones to set, and how to create a successful cloud migration framework.
July 3, 2021, 10:00 AM PT
Kafka Performance: Best Practices for Monitoring and Improving
This webinar discusses best practices for overcoming critical Kafka data streaming performance challenges that can negatively impact the usability, operation, and maintenance of the platform, as well as the data and devices connected to it. Topics include Kafka data streaming architecture, key monitoring metrics, offline partitions, brokers, topics, consumer groups, and topic lag.
July 20, 2021, 10:00 AM PT
Presto Performance Best Practices—Get Visibility Into Your Presto Queries
Managing cluster performance to meet business requirements and performance SLAs is complex. You want best-in-class performance to meet the short deadlines of interactive workloads while reducing or controlling costs. If a single query fails to complete because of query-level inefficiencies, data skew, missing or outdated statistics, or resource configurations, that one resource-consuming query can negatively impact the entire application stack on that cluster. Join Pepperdata Field Engineer Alex Pierce for this webinar on Presto performance management best practices.
June 15, 2021, 10:00 AM PT
Big Data Self-Service Performance Analytics: Best Practices
Big data self-service analytics is the solution to two critical issues: the proliferation of data and the subsequent shortage of data scientists to capture, manage, and analyze it all. Watch this webinar to understand why more organizations are moving to the self-service analytics model and how to more easily create elastic Hadoop, Spark, and other big data clusters for dynamic, large-scale workloads, and learn the best practices for cost optimization of big data workloads.
June 8, 2021, 10:00 AM PT
Fix Spark Performance Issues Without Thinking Too Hard
This discussion explores the results of analyzing thousands of Spark jobs on many multi-tenant production clusters. We will discuss common issues we have seen, the symptoms of those issues, and how you can address and overcome them without thinking too hard. Based on analyzing the behavior and performance of thousands of Spark applications and use case data from the Pepperdata Big Data Performance report, Heidi and Alex will discuss key performance insights. Topics include best and worst practices, gotchas, machine learning, and tuning recommendations.
June 1, 2021, 10:00 AM PT
Impala Performance Best Practices—Get Visibility into Hive and Impala Queries
Get insights into Impala performance best practices. Learn how to get visibility into all of your Hive and Impala queries in one place with continuous, automated application and infrastructure tuning. Understand how to immediately improve and scale application performance through automated tuning, and how to improve query performance through job-specific recommendations, query run comparisons, and IT chargeback reports.