What is Scalability in Cloud Computing

The Strata Data Conference is always well worth the trip, whether it’s within driving distance or clear across the country. This year’s Strata Data NYC was no exception, and we enjoyed the lively conversations and excitement of the attendees.

This show’s topics ran the gamut, from Spark to Kubernetes, from AI and machine learning to the unconscious habits of the human mind. But some things have clearly changed. For instance, Hadoop had far less visibility at this show, moving even further from its status as “the” buzzword of a few years back. This year, popular topics (not surprisingly) included machine learning, containers, self-service, streaming or real-time, data governance, GDPR, and language processing. All in all, it was a great array of topics, enough to satisfy any attendee, whether engineer, big data architect, data scientist, or marketing lead.

However, while we return from these shows with our creativity sparked, we also notice a fair number of misconceptions cropping up in our discussions. One of the biggest misconceptions about Application Performance Monitoring (APM) relates to application tuning: it is often identified as “The Problem” when it is only one aspect of APM and frequently not the real issue at hand. On more than one occasion, we asked a booth visitor, “What’s your biggest big data infrastructure challenge? What are you having a problem with right now?” Too often, the response was, “We need to tune our applications.”

That may sound like a fair and accurate answer, but as we got further into a few discussions, we found that the real problem was understanding what’s going on in the environment as a whole, not just in the application, and that tuning alone is insufficient. Let’s take a look at the subtle but very important differences.

What is Application Tuning?

Application tuning is the practice of improving application performance, usually either in response to an actual performance issue or as a preventive measure to avoid one.

Why do you need it?

You need fast and efficient applications to meet your SLAs and maximize productivity. You fine-tune performance to improve runtimes and resource utilization.

Is that all you need?

No. Don’t let other vendors tell you it is. More than once, attendees mentioned that some vendors are indeed saying that application tuning is all you need. It’s not. Before you can tune the performance of your applications, you first need to find out where the problem is.

  1. First, you need a view into all of your applications in the context of the entire cluster. You need to go beyond your own application and uncover data about what else is happening on the node and in the queue, who’s vying for the same resources you need, how much memory is allocated for the jobs before yours, and so on. Essentially, you need a 360° view of all your applications.
  2. Once you have access to all this data across the cluster, you’ll be able to identify the problematic application and diagnose performance issues.
  3. Finally, with a comprehensive view into both application and cluster data, you can take actions to tune performance.
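
To make the first two steps a little more concrete, here is a minimal sketch in Python. It assumes you have already exported one record per running application from your cluster manager. The `AppRecord` fields, the `cluster_context` and `heaviest_app` helpers, and the sample data are all hypothetical names invented for this illustration, not part of any Pepperdata product or cluster manager API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppRecord:
    """One running application (hypothetical fields, for illustration only)."""
    app_id: str
    user: str
    queue: str
    node: str
    memory_mb: int
    submit_order: int  # lower value = submitted earlier


def cluster_context(apps, target_id):
    """Summarize what surrounds one application on its node and in its queue."""
    target = next(a for a in apps if a.app_id == target_id)
    # Other applications running on the same node as the target.
    same_node = [a for a in apps
                 if a.node == target.node and a.app_id != target_id]
    # Applications in the same queue that were submitted before the target.
    ahead = [a for a in apps
             if a.queue == target.queue and a.submit_order < target.submit_order]
    return {
        "node_neighbors": sorted(a.app_id for a in same_node),
        "competing_users": sorted(
            {a.user for a in same_node + ahead if a.user != target.user}
        ),
        "memory_ahead_mb": sum(a.memory_mb for a in ahead),
    }


def heaviest_app(apps):
    """Return the id of the application with the largest memory allocation."""
    return max(apps, key=lambda a: a.memory_mb).app_id


# Made-up sample data standing in for a real cluster export.
sample_apps = [
    AppRecord("etl-1", "ana", "prod", "node-a", 8192, 1),
    AppRecord("report-2", "ben", "prod", "node-a", 4096, 2),
    AppRecord("ml-3", "ana", "prod", "node-b", 16384, 3),
    AppRecord("adhoc-4", "cy", "dev", "node-b", 2048, 4),
]

if __name__ == "__main__":
    print(cluster_context(sample_apps, "ml-3"))
    print(heaviest_app(sample_apps))
```

The point of the sketch is that none of these answers can be derived from a single application's own metrics. Who shares the node, who is ahead in the queue, and how much memory is already spoken for only become visible once every application's data sits side by side.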

What is a 360° Application View?

With Pepperdata Application Spotlight