The Strata Data Conference is always well worth the trip, whether it’s within driving distance or clear across the country. This year’s Strata Data NYC was no exception, and we enjoyed the lively conversations and excitement of the attendees.
This show’s topics ran the gamut, from Spark to Kubernetes, from AI and machine learning to the unconscious habits of the human mind. But it’s obvious some things have drastically changed. For instance, Hadoop had far less visibility at this particular show, moving even further from its status as “the” buzzword of the show a few years back. This year, popular topics (not surprisingly) included machine learning, containers, self-service, streaming and real-time processing, data governance, GDPR, and language processing. All in all, it was a great array of topics, enough to satisfy any attendee, including engineers, big data architects, data scientists, marketing leads, and more.
However, while we return from these shows with our creativity sparked, we also notice a fair number of misconceptions cropping up in our discussions. One of the biggest misconceptions about Application Performance Monitoring (APM) relates to application tuning: it is often identified as “The Problem” when it is only one aspect of APM, and often not the real issue at hand. More than once, we asked a booth visitor, “What’s your biggest big data infrastructure challenge? What are you having a problem with right now?” Too often, the response was, “We need to tune our applications.”
That may sound like a fair and accurate answer, but as we got further into a few of these discussions, we found that the problem was more about understanding what’s going on in the environment than about the application itself, and that tuning alone is insufficient. Let’s take a look at the subtle but very important differences.
What is Application Tuning?
Application tuning is the work of improving an application’s performance. It is usually done either in response to an actual performance issue or as a preventive practice to head off problems.
Why do you need it?
You need fast and efficient applications in order to meet your SLAs and to maximize productivity. You fine-tune performance in order to improve runtimes and resource utilization.
Is that all you need?
No. Don’t let other vendors tell you it is. More than once, attendees mentioned that some vendors are indeed saying that application tuning is all you need. It’s not. To tune your applications’ performance, you first need to be able to find out where the problem is.
- First, you need a view into all of your applications in the context of the entire cluster. You need to go beyond your own application and uncover data about what else is happening on the node and in the queue, who’s vying for the same resources you need, how much memory is allocated for the jobs before yours, and so on. Essentially, you need a 360° view of all your applications. This is very important.
- Once you have access to all this data across the cluster, you’ll be able to identify the problematic application and diagnose performance issues.
- Finally, with a comprehensive view into both application and cluster data, you can take actions to tune performance.
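The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AppSample` schema, the sample values, the 90% pressure threshold, and the 50% "own share" cutoff are all hypothetical, and none of it reflects how Pepperdata or any particular scheduler exposes its data. It also assumes one sample per application per node.

```python
from dataclasses import dataclass


@dataclass
class AppSample:
    """One application's memory footprint on one node (hypothetical schema)."""
    app: str
    node: str
    memory_gb: float


def memory_used_per_node(samples):
    """Step 1: total memory in use on each node, across ALL applications."""
    used = {}
    for s in samples:
        used[s.node] = used.get(s.node, 0.0) + s.memory_gb
    return used


def diagnose(app, samples, capacity_gb, pressure_threshold=0.9):
    """Steps 2 and 3: for each node the app runs on, say where pressure comes from."""
    used = memory_used_per_node(samples)
    findings = {}
    for s in samples:
        if s.app != app:
            continue
        pressure = used[s.node] / capacity_gb[s.node]
        if pressure < pressure_threshold:
            findings[s.node] = "no memory pressure"
            continue
        # The node is under pressure, so used[s.node] is greater than zero here.
        own_share = s.memory_gb / used[s.node]
        if own_share >= 0.5:
            findings[s.node] = "tune this application"
        else:
            findings[s.node] = "contention from other applications"
    return findings


# Illustrative data only: three 64 GB nodes and two applications.
samples = [
    AppSample("etl", "n1", 10.0),
    AppSample("adhoc", "n1", 50.0),
    AppSample("etl", "n2", 20.0),
    AppSample("etl", "n3", 58.0),
]
capacity_gb = {"n1": 64.0, "n2": 64.0, "n3": 64.0}

findings = diagnose("etl", samples, capacity_gb)
```

Looking at `etl` in isolation would suggest the same fix on every node. With the cluster context included, only `n3` points to tuning the application itself; on `n1` the pressure comes from a neighbor.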
What is a 360° Application View?
With Pepperdata Application Spotlight, you get a 360° view of all your applications in one place, in the context of the entire cluster. Pepperdata continuously collects data about all of your applications and infrastructure resources: unique data that enables you to quickly pinpoint and diagnose application performance issues up to 90% faster. Nobody else does this.
Why do you need it?
With Application Spotlight, you have self-service access to application performance data — data you can’t get anywhere else — for a 360° view of your apps and the cluster. With insight into exactly what CPU and memory resources your applications and others on the cluster have requested, what they need, what they actually use, and what they waste, you can quickly identify where problems are and tune applications to improve performance and efficiency.
Figure One – Pepperdata provides you with self-service access to all of the data on your applications in one place – a 360° view of everything going on in the cluster across all applications and resources.
Figure Two – Pepperdata provides recommendations, alarms and alerts about bottlenecks.
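To make the requested-versus-used-versus-wasted accounting concrete, here is a minimal sketch in Python. The application names and numbers are invented for illustration, and the function and field names (`resource_report`, `requested_gb`, `peak_used_gb`) are assumptions, not part of any Pepperdata API. Waste is taken here simply as requested minus peak used.

```python
def resource_report(requested, peak_used):
    """Return wasted capacity and utilization for one resource request."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    wasted = max(requested - peak_used, 0.0)
    return {"wasted": wasted, "utilization": peak_used / requested}


def rank_by_waste(apps):
    """Sort applications so the largest memory waste comes first."""
    rows = [
        (name, resource_report(a["requested_gb"], a["peak_used_gb"]))
        for name, a in apps.items()
    ]
    return sorted(rows, key=lambda row: row[1]["wasted"], reverse=True)


# Hypothetical memory figures for two applications on the same cluster.
apps = {
    "ad_hoc_query": {"requested_gb": 16.0, "peak_used_gb": 15.0},
    "nightly_etl": {"requested_gb": 32.0, "peak_used_gb": 8.0},
}

ranking = rank_by_waste(apps)
for name, report in ranking:
    print(f"{name}: wasted {report['wasted']} GB, "
          f"utilization {report['utilization']:.0%}")
```

In this made-up example, `nightly_etl` requests four times what it uses, so it is the first candidate for a smaller request, while `ad_hoc_query` is already sized close to its real need.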
Is that all you need?
No, but it’s certainly a prime necessity. With the 360° Application View, you as a developer have self-service access to all of the data on your applications in one place. This allows you to tell whether a performance issue was caused by your own application or by other applications on the cluster. In addition, Application Spotlight provides recommendations on possible issues and auto-tunes the configurations of recurring applications.
For capacity and infrastructure managers, Pepperdata Platform Spotlight continuously collects extensive unique data (data not available anywhere else or with any other tool) about hosts, queues, users, applications, and all relevant resources. Pepperdata collects two billion performance data points every five minutes. No other vendor does this. Key capabilities include:
- A single source of real-time truth for your big data platform and applications, on-premises, in the cloud, or both, collected from all applications and tasks and sampled every five seconds (data not available from any other source).
- Diagnose performance issues 10x faster and make informed resource decisions based on user priorities and needs.
- Automatically tune platform and applications simultaneously for peak performance and efficiency, achieving 50% improved throughput cluster-wide.
- Reclaim wasted resources dynamically at runtime for improved efficiency and savings.
- Identify rogue users and applications for rapid automatic or manual remediation.
- Automatically tune repetitive applications, resulting in 70% faster runtimes or reduced resource utilization.
- Accurately forecast resource needs based on cluster-wide trends in the collected data, enabling informed capacity planning.
- Determine optimal software and hardware stacks based on Pepperdata’s extensive knowledge of similar environments and workloads.
- Extensive reporting on both platform and application usage patterns, costs, wasted resources, and performance bottlenecks.