Back in 2015 when Google finally lifted the lid on exactly how it managed to coordinate its massive global server resources with such efficiency and coherence, it unleashed a computing revolution on the world.
It turned out that Google was one of the first digital operations to run containerised workloads at enterprise scale – and when you’re talking Google workloads, you’re talking the largest scales imaginable. Its secret? A pioneering container orchestration project called Borg. Google went on to channel the lessons of Borg into a new open source platform that would bring container orchestration to the world. Kubernetes was born.
When we think of containers, we naturally think of application deployment and the emergence of decoupled architectures like microservices and headless. As a tool for managing large numbers of containers, grouped into ‘clusters’, in a predictable, consistent yet agile way, Kubernetes has also largely been associated with app development and release, particularly as a tool that links the development and operations sides in container-oriented DevOps production.
But Kubernetes has also been described as an Infrastructure as Code (IaC) tool. IaC involves the use of a programming language to provision and manage infrastructure – basically switching the emphasis in running infrastructure from hardware to software.
Through automation, IaC allows infrastructure engineers to define, deploy and manage resources across clouds, networks, virtual machines, load balancers and more without continuous monitoring and intervention. It’s a programmatic infrastructure solution that achieves consistency, control and resilience in dynamic environments.
Ultimately, these were exactly the same goals Google set out to achieve with Borg/Kubernetes – orchestrate resources for containerized workloads at massive scale in highly complex dynamic environments. It’s no great leap to say that Google developed Borg/Kubernetes as a way of managing one of the most sophisticated IT infrastructures on the planet.
The question then is, even with Kubernetes available as an open resource, why haven’t we seen it become a dominant force in infrastructure?
The storage dilemma
Kubernetes can handle most of the key things you need from infrastructure management. It can allocate memory and compute power on demand, and do so in a very dynamic, agile way, because ultimately it abstracts all the resource provisioning away from the underlying hardware.
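To make that concrete, resource allocation in Kubernetes is declared per container rather than tied to any particular machine. A minimal illustrative pod spec might look like this (names and values are hypothetical):

```yaml
# Illustrative pod spec: compute and memory are requested
# declaratively, and the scheduler finds hardware to match.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app            # hypothetical name
spec:
  containers:
  - name: web
    image: nginx:1.25
    resources:
      requests:             # minimum the scheduler reserves
        memory: "128Mi"
        cpu: "250m"         # a quarter of a CPU core
      limits:               # hard ceiling enforced at runtime
        memory: "256Mi"
        cpu: "500m"
```

Nothing in the spec names a physical server – that abstraction is what makes the workload portable.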
That makes it great for things like scalability, portability and consistency across environments. However, this abstraction from the physical infrastructure comes with one big drawback – it raises the question of where you store your data.
This article provides a good overview of the challenges Kubernetes poses for storage. In short, the highly dynamic nature of a Kubernetes environment – containers continually being created and destroyed as part of the process of regulating resource allocation to pre-coded specifications – clashes with the need to store data in a consistent state over time. Containers are stateless by design – databases are all about preserving state.
Kubernetes can connect to external databases using volume plugins, but these limit the inherent flexibility of Kubernetes because the Kubernetes codebase itself has to be changed every time a new storage option is added. And there are some ‘native’ storage options available within Kubernetes itself, through the creation of units called Persistent Volumes. But these are often environment-dependent, which obstructs portability – a key reason for using Kubernetes and containers in the first place.
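As a sketch, a Persistent Volume and the claim a pod uses to bind to it might look like this (illustrative values; the hostPath backing shown here is exactly the kind of environment-specific detail that obstructs portability):

```yaml
# Illustrative Persistent Volume backed by a node-local path --
# the hostPath field ties it to one specific environment.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt/data         # environment-specific location
---
# The claim pods use to request that storage without
# naming the underlying volume directly.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

The claim decouples the pod from the volume, but the volume itself still has to be defined for a particular environment.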
Kubernetes beyond containers
Recent developments, however, have opened the door to Kubernetes overcoming its storage hurdle and therefore potentially plugging the last gap to be used as a mainstream infrastructure tool. In particular, the Container Storage Interface (CSI) extension is a big leap forward, allowing third-party block and file storage systems to be exposed to Kubernetes through drivers that themselves run as containerised workloads.
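In practice, a CSI driver is typically surfaced to users through a StorageClass, which claims can then reference to have volumes provisioned on demand. A sketch, with a hypothetical provisioner name:

```yaml
# Hypothetical StorageClass backed by a CSI driver; the
# provisioner value would name a real installed driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-csi
provisioner: csi.example.com      # hypothetical CSI driver
parameters:
  type: ssd                       # driver-specific parameter
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```

Because the driver lives outside the Kubernetes codebase, new storage backends can be added without touching Kubernetes itself – removing the constraint that hobbled the old in-tree volume plugins.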
More generally, a lot of work is going into extending Kubernetes functionality so it is no longer restricted to just working with containers, but can also orchestrate un-containerised workloads across data, storage, networks and more.
This is critical to Kubernetes becoming a fully-fledged infrastructure management tool that can be readily deployed by organisations that don’t have all the resources Google has to mitigate the complexities it throws up in storage and other areas.
Perhaps even more importantly, we are starting to see examples emerge where Kubernetes can add value to existing infrastructure management options, giving vendors and end users alike a business reason to adopt. One of these examples actually comes in the field of data storage.
This article details how storage specialist Pure Storage last year acquired Kubernetes start-up Portworx specifically to help it transition from plain storage into value-added data management. Utilising the new-found flexibility of CSI, Portworx’s innovative solution uses Kubernetes to create a new data management middleware layer that mediates between static storage and the application layer, making data available at every node regardless of where it is actually stored.
This is a clear example of Kubernetes being applied to add something new to infrastructure – flexible data management and orchestration on top of traditional static storage. It’s these kinds of developments that could well see Kubernetes emerge as a key tool in infrastructure in years to come.