In this blog series we’ve been examining the Five Myths of Kubernetes Resource Optimization. The fourth myth we’re considering relates to a common misunderstanding held by many Kubernetes practitioners: manual application tuning can increase resource utilization in my applications. Let’s dive into it.
Manually Tuning Applications
Manual tuning refers to a developer’s ability to turn the knobs that control the CPU, memory, and other resources allocated to an application. The resource requirements for an application typically vary over time—sometimes by a great amount. There is a peak period, when resource requirements are at their greatest, and an off-peak period.
In practice, developers almost always size their applications to this peak, or even above. This ensures that the application has the right amount of resources and will not fail. However, the peak period often represents a small fraction of the overall time that an application runs. Most applications run well below this peak allocation.
Figure 1: Developers are required to allocate memory and CPU for each of their applications. To prevent their applications from being killed due to insufficient resources, developers typically set the resource request level to accommodate peak usage requirements.
Adjusting these resource request knobs for CPU and memory can help increase utilization and reduce the cost of running applications by moving the provisioning line as close to peak as possible. This allows developers who tune manually to reclaim some of the waste that results from applications running well below peak.
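To make the effect of moving the provisioning line concrete, here is a minimal sketch in Python. The usage numbers and the `utilization` helper are hypothetical, chosen only for illustration: a workload that needs 2 cores for 22 hours of the day and 8 cores for 2 hours, with a request sized first above peak and then exactly at peak.

```python
from statistics import mean


def utilization(usage_samples, request):
    """Average usage as a fraction of a static resource request."""
    if request <= 0:
        raise ValueError("request must be positive")
    return mean(usage_samples) / request


# Hypothetical hourly CPU usage (cores) for one day:
# 22 off-peak hours at 2 cores, 2 peak hours at 8 cores.
hourly_usage = [2.0] * 22 + [8.0] * 2
peak = max(hourly_usage)

padded = utilization(hourly_usage, request=10.0)  # request sized above peak
tuned = utilization(hourly_usage, request=peak)   # request tuned down to peak

print(f"request above peak: {padded:.1%} utilized")
print(f"request at peak:    {tuned:.1%} utilized")
```

Under these assumed numbers, tuning the request down to peak lifts average utilization from about 25 percent to roughly 31 percent. That is a real gain, but most of the provisioned capacity still sits idle during the off-peak hours.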
When a Static Provisioning Level Meets a Dynamic Application: Resource Utilization is Not Optimized
It quickly becomes obvious, however, that manual tuning cannot keep up with the dynamic nature of application resource needs, which leads to waste whenever the workload is not at peak. During the off-peak period, the application does not need all the resources that have been provisioned for it. However, nothing can be done to reclaim those resources during off-peak time, because the amount of provisioned resources is a static designation.
The static provisioning level simply does not account for the changes that inevitably occur as the application or its data characteristics change. The off-peak period is often very long, on the order of hours, and it commonly represents a large fraction of the application’s total run time. As a result, organizations typically waste 30 percent or more of their resources when running data-intensive workloads on Kubernetes.
As we saw in the previous blog, Myth 3: Instance Rightsizing, a modern application’s CPU and memory requirements may change dramatically while the application is running. The instance type that was chosen for an application prior to its run may not be the optimal instance type by the end of the run.
A Special Challenge: Tuning Infrequent Applications
Applications that run infrequently present a special challenge. Some applications may run only once a week, or maybe only once a month. A developer might not be inclined to invest the effort to develop a custom provisioning profile for such an application.
Instead, the developer might simply provision that application using the same configurations selected for other, more frequent applications. As a result, the CPU and memory provisioned for such a one-off application may be inappropriate for it, which could result in arbitrary amounts of overprovisioning and waste.
Another Challenge: Keeping Utilization High in Applications with Varying Resource Requirements
Another scenario involves applications whose resource requirements vary by day. Consider an application that is extremely efficient two days out of the week but relatively inefficient the other five days. Some cost-conscious developers might choose to write two different applications to accommodate this behavior: one application with configurations tuned for the efficient days and a second application with configurations tuned for the other days. In this way, the applications are optimized for each day’s requirements.
Although this practice would help optimize resources, it’s labor intensive, and few developers would be excited to take on this project. Writing and managing multiple applications in this way at least doubles a developer’s work. As a result, most developers will simply write one application and provision it with sufficient resources to cover the worst-case scenario.
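The trade-off between the two approaches can be sketched with a little arithmetic. The figures below are hypothetical: an application that needs 4 GiB of memory on its two efficient days and 16 GiB on the other five. The helper name `provisioned_per_week` is likewise an illustrative assumption.

```python
def provisioned_per_week(daily_requests):
    """Total memory provisioned over a week, in GiB-days."""
    return sum(daily_requests)


# Hypothetical daily memory needs (GiB): 2 efficient days, 5 less efficient days.
daily_need = [4, 4, 16, 16, 16, 16, 16]

# Option 1: one application with a single profile sized for the worst day.
single_profile = [max(daily_need)] * len(daily_need)

# Option 2: two applications, each with a profile matched to its own days.
two_profiles = list(daily_need)

worst_case_total = provisioned_per_week(single_profile)
split_total = provisioned_per_week(two_profiles)
savings = 1 - split_total / worst_case_total

print(f"single worst-case profile: {worst_case_total} GiB-days")
print(f"two tuned profiles:        {split_total} GiB-days")
print(f"saved by splitting:        {savings:.0%}")
```

With these assumed numbers, maintaining two profiles saves a little over a fifth of the provisioned memory. Whether that saving justifies maintaining a second application is exactly the judgment call most developers resolve in favor of the single worst-case profile.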
These simple examples illustrate how cumbersome manual solutions to application tuning can be—they simply do not scale.
The Opportunity Cost of Manual Application Tuning
Manually tuning any application also comes with an additional drawback, namely the significant opportunity cost it represents. As discussed in Myth 1, Observability & Monitoring, when a developer is handed a list of recommendations to improve application performance, they usually have little incentive to follow those recommendations.
While some applications might generate only a handful of tuning recommendations, others might result in a checklist of several dozen or more parameters to tweak. The developer might already be working through a to-do list of new, impactful projects, and they may resist spending the time to go back and revisit an application they wrote months ago. And most companies want their developers to be developing (hence their title!) rather than tuning.
Summing It Up: Manual Application Tuning Doesn’t Address Underutilized Resources and Overspending
Because common overprovisioning practices lead to underutilized workloads and wasted spend, manual tuning leaves money on the table when used as a resource optimization strategy. Specifically, applications with static allocation levels provisioned to meet dynamic resource requirements result in underutilized workloads that are all but impossible to tune manually, and even more difficult to tune at scale.
Without real-time, automated resource optimization, there is no practical way to dynamically match scheduled allocations to the application’s actual usage. Most organizations are reluctant to spend their development resources on this near-futile effort.
In our next blog entry in this series, we’ll examine the fifth and final myth about Kubernetes resource optimization, which involves Spark Dynamic Allocation. Stay tuned!