Nearly five years ago, in June 2014, Google open-sourced Kubernetes (K8s), a container orchestration platform based on the software that manages the hundreds of thousands of servers that run Google. Kubernetes draws on a project called Borg that was originally developed internally at Google. Kubernetes not only beat Apache Mesos and Docker Swarm in the container orchestration race, but it has also become the hottest open source technology to emerge since Linux, the operating system that commoditized enterprise UNIX and became a ubiquitous technology platform. The question now is how rapidly it will become the dominant way for enterprises to develop and deploy applications.
The four pillars of big data (volume, variety, velocity, and veracity) are showing no signs of breaking down; on the contrary, they are stronger than ever. However, the realities of the underlying technology have changed, and with them, the architectures and the economics. Hadoop was built in a compute era with different fundamental assumptions than those that exist today. Network latency was a major bottleneck, and cloud storage was not a competitive option because of the cost of memory. Most data was located on-premises and co-located with the computing function. Today, network latency is not a critical issue for cloud providers, the cost of memory has plummeted, there are many cloud service providers to choose from, and hybrid cloud architectures are becoming the norm for many large enterprises.
Applications that generate and use data need to be deployed seamlessly in multi-cloud and hybrid cloud environments. This is where containers, and Kubernetes, enter the picture. Application delivery on Kubernetes starts by building applications as a set of microservices in a container-based, cloud-native architecture. Kubernetes is the product of an ongoing realignment of the software resources that make up a network application. That realignment is centered on a concept called a workload: a job performed by one or more applications, or one or more services, across a multitude of processors.
There is nothing structurally unique that distinguishes Kubernetes from any other type of application. Its orchestrator runs on an operating system. When running, it maintains a cluster of nodes, which are servers that may be physical or virtual. On each of these nodes are pods of containers. Each node also runs an agent called the kubelet, which manages the pods on that node on behalf of the orchestrator.
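The hierarchy described above (a cluster of nodes, pods on each node, containers in each pod) can be pictured with a small data model. This is only an illustrative sketch, not the Kubernetes API: the class names, the `kubelet_report` method, and the image names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    image: str  # hypothetical image reference


@dataclass
class Pod:
    name: str
    containers: list[Container] = field(default_factory=list)


@dataclass
class Node:
    name: str
    virtual: bool  # a node may be a physical or a virtual server
    pods: list[Pod] = field(default_factory=list)

    def kubelet_report(self) -> dict[str, list[str]]:
        """Stand-in for the per-node agent: list the containers in each pod on this node."""
        return {pod.name: [c.name for c in pod.containers] for pod in self.pods}


@dataclass
class Cluster:
    nodes: list[Node] = field(default_factory=list)

    def pod_count(self) -> int:
        """Total number of pods across every node in the cluster."""
        return sum(len(node.pods) for node in self.nodes)


# One pod holding two related containers, placed on a virtual node.
web = Pod("web-1", [Container("app", "example/app:1.0"),
                    Container("log-shipper", "example/logs:1.0")])
cluster = Cluster([Node("node-a", virtual=True, pods=[web]),
                   Node("node-b", virtual=False)])
```

Here `cluster.pod_count()` counts pods across both nodes, and `cluster.nodes[0].kubelet_report()` plays the role of the node-level agent reporting what is running locally.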
Here are three reasons why Kubernetes and container orchestration are gaining increasingly wide appeal among enterprises.
Continuity
When an application is composed of granular components, it becomes much easier to evolve that application granularly by updating and improving those components individually. The orchestrator can make appropriate adjustments in response to how those individual changes impact the workload as a whole. Feature improvements to applications don’t have to be implemented in massive overhauls, which can sometimes negatively impact their usability. Continuous integration and continuous delivery (CI/CD) can be automated much more easily on a platform that’s designed from the outset to support deployment in smaller, more manageable steps.
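One way to picture deployment in smaller steps is a rolling update, in which replicas of a component are replaced a batch at a time instead of all at once. The sketch below is a simplified simulation under that assumption; the function name and `batch_size` parameter are hypothetical, and it does not model the surge and availability limits that real Kubernetes Deployments apply.

```python
def rolling_update(versions, new_version, batch_size=1):
    """Yield the fleet's versions after each batch of replicas is replaced."""
    current = list(versions)
    for start in range(0, len(current), batch_size):
        # Replace only one batch; the remaining replicas keep serving the old version.
        for i in range(start, min(start + batch_size, len(current))):
            current[i] = new_version
        yield list(current)


# Three replicas move from v1 to v2 one at a time.
steps = list(rolling_update(["v1", "v1", "v1"], "v2"))
```

Each intermediate state in `steps` still contains running replicas, which is what lets a change reach users gradually.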
Resilience
Kubernetes maintains active replicas of pods (groups of containers), managed as replica sets, to preserve uptime and responsiveness if any container or pod fails. For example, a data center does not have to replicate the entire application and trigger a load balancer to switch over to the secondary application if the primary one fails. Multiple pods in a replica set are typically running at any one time, and the orchestrator’s job is to maintain that redundancy throughout the lifespan of the application.
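The orchestrator's job of maintaining redundancy can be thought of as a reconciliation loop: compare the desired number of replicas with the healthy ones, then add or remove pods to close the gap. The following is a minimal sketch of a single pass of such a loop, not Kubernetes code; the `reconcile` function, the dictionary layout, and the pod naming are assumptions made for illustration.

```python
def reconcile(desired, pods, next_id):
    """Return (pods, next_id) with failed pods dropped and the count restored to `desired`."""
    # Keep only pods that are still healthy.
    healthy = [p for p in pods if p["healthy"]]
    # If there are more healthy pods than desired, trim the surplus.
    healthy = healthy[:desired]
    # If there are fewer, start replacements until the desired count is met.
    while len(healthy) < desired:
        healthy.append({"name": f"pod-{next_id}", "healthy": True})
        next_id += 1
    return healthy, next_id


# pod-2 has failed; one reconciliation pass replaces it with pod-4.
pods = [
    {"name": "pod-1", "healthy": True},
    {"name": "pod-2", "healthy": False},
    {"name": "pod-3", "healthy": True},
]
pods, next_id = reconcile(3, pods, next_id=4)
```

Running the pass repeatedly is what keeps the replica count steady over the lifespan of the application.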
Scalability
The key value for organizations that orchestrate distributed workloads using Kubernetes is the built-in ability for workloads to multiply through the system as necessary, and to scale up and back down again according to established policies. To minimize the potential for problems, Kubernetes groups related containers together as pods. A service called the autoscaler can be set to automatically replicate pods to different nodes when it determines that the load on those pods exceeds a target level of resource utilization, and to remove replicas again when demand falls.
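The scaling decision can be sketched as a simple ratio. The Kubernetes documentation for the Horizontal Pod Autoscaler describes the desired replica count as the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The function below is a rough illustration of that arithmetic only; the function name, parameter names, and the default minimum and maximum are assumptions, and real autoscalers add tolerances and stabilization windows that are not modeled here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count by observed/target utilization, within policy bounds."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Clamp to the policy limits (hypothetical defaults).
    return max(min_replicas, min(max_replicas, raw))


scale_up = desired_replicas(4, current_utilization=90, target_utilization=60)
scale_down = desired_replicas(4, current_utilization=30, target_utilization=60)
capped = desired_replicas(4, current_utilization=300, target_utilization=60)
```

With four replicas running at 90 percent against a 60 percent target, the sketch asks for six replicas; at 30 percent it asks for two; and a large spike is capped at the configured maximum.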
By winning the container battle, Kubernetes has made life easier for everyone. Now the industry can innovate on Kubernetes as the de facto standard for container orchestration. Kubernetes is now the foundation for the new generation of artificial intelligence, machine learning, data management, and distributed storage in cloud-native environments. But K8s still has a way to go to support stateful big data applications that require persistent volume support, automated multi-tenancy support, and enterprise-grade security. Stay tuned!