In this article, I’ll break down all of the relevant concerns when deciding where to run Kubernetes. Obviously, this assumes that you’ve already decided that Kubernetes is the way to go.
I’ll describe the main features and capabilities of the main cloud providers and present what I think are some crystal clear criteria for choosing your target platform (your own on-prem data centers, virtual service providers, or one of the popular cloud platforms).
If you need more information on Kubernetes in general, feel free to check out my book, Mastering Kubernetes, 2nd edition.
To cloud or not to cloud?
Kubernetes is so modular, flexible, and extensible that it can be deployed on-prem, in a third-party data center, in any of the popular cloud providers and even across multiple cloud providers. What to do? What to do?
The answer, of course, is “it depends.” Here are a few scenarios:
1. You run your system on-prem or in third-party data centers. You invested a lot of time, money, and training in your bespoke infrastructure. The challenges of roll-your-own infrastructure become more and more burdensome.
2. You run your non-Kubernetes system on one of the cloud providers. You want to benefit from the goodness of Kubernetes and brag to your friends about how cool you are. They’ll be very impressed.
You’ll note that I didn’t mention containers in either scenario. If you’re already containerized — good for you. If not, consider it an entry fee.
1. On-prem or third-party data center
You may have very good reasons to run in an environment that you control closely (security, regulatory, compliance, performance, cost). If this is the case, the cloud providers might be a non-starter. But, you can still reap all the benefits of Kubernetes by running it yourself. You have the expertise, skills, and experience to manage the underlying infrastructure since you’re already doing it.
However, it’s a different story if you made the choice to invest in your on-prem infrastructure or you’re deployed across multiple virtual service providers simply because you started before the cloud was a reliable solution for the enterprise. You have the opportunity to upgrade everything in one complicated shot… switch to managed infrastructure in the cloud, package your system in containers, and orchestrate them using Kubernetes!
2. You’re already in the cloud
In scenario #2, choosing to run Kubernetes managed by your cloud provider is probably a no-brainer. You already run in the cloud. Kubernetes gives you the opportunity to replace a lot of layers of management, monitoring, and security you had to build, integrate, and maintain yourself with a slick experience (that keeps getting better and better every three months when Kubernetes releases another version).
There are actually quite a few cloud providers that support Kubernetes, but I’ll focus here on the Big Three: Google’s GKE, Microsoft AKS, and Amazon’s EKS.
Google GKE (Google Kubernetes Engine)
Kubernetes, of course, came from Google. GKE is Google’s managed Kubernetes offering. Google SREs manage the Kubernetes control plane for you, and you get auto-upgrades. Since Google has so much influence on Kubernetes and has used it as the container orchestration solution of the Google Cloud Platform from day one, it would be really weird if GKE didn’t have the best integration.
Similarly, you can trust GKE to be the most up to date. Much more testing of new Kubernetes features and capabilities happens on GKE than on other cloud providers. On GKE, you don’t have to pay for the Kubernetes control plane. Google will cover you and you just pay for the worker nodes. You also get GCR (Google Container Registry) and integrated central logging and monitoring via Stackdriver Logging and Stackdriver Monitoring. If you’re interested in even tighter integration with your CI/CD pipeline, you can use Google Cloud Build.
Google networking is considered top of the line compared to other cloud providers and you get to benefit from it as well. If you need to run high-performance workloads that can utilize GPUs then GKE has beta support right now.
GKE has some other neat tricks. For example, it takes advantage of general-purpose Kubernetes concepts like Service and Ingress for fine-grained control over load balancing. If your Kubernetes service is of type LoadBalancer, GKE will expose it to the world via a plain L4 (TCP) load balancer. However, if you create an Ingress object in front of your service, then GKE will create an L7 load balancer capable of doing SSL termination for you and even allow gRPC traffic if you annotate it correctly.
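As a rough illustration of the two options, here is a minimal sketch that builds both kinds of objects as plain Python dictionaries and prints them as JSON, which kubectl accepts alongside YAML. The names web, web-ingress, and web-tls and the port numbers are hypothetical placeholders, and the sketch assumes the networking.k8s.io/v1 Ingress API. Which load balancer actually gets provisioned depends on your cluster and provider configuration.

```python
import json


def load_balancer_service(name, port, target_port):
    """Service of type LoadBalancer: typically exposed via an L4 (TCP) load balancer."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            "selector": {"app": name},  # hypothetical pod label
            "ports": [{"protocol": "TCP", "port": port, "targetPort": target_port}],
        },
    }


def tls_ingress(name, service_name, service_port, tls_secret):
    """Ingress in front of a Service: requests an L7 load balancer with TLS termination."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # Secret holding the certificate and key (hypothetical name)
            "tls": [{"secretName": tls_secret}],
            "defaultBackend": {
                "service": {"name": service_name, "port": {"number": service_port}}
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(load_balancer_service("web", 80, 8080), indent=2))
    print(json.dumps(tls_ingress("web-ingress", "web", 80, "web-tls"), indent=2))
```

Saving either printed document to a file and running kubectl apply -f on it would create the object. Any provider-specific annotations (for gRPC, for example) would go under metadata and are left out here.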
Kubernetes itself is platform agnostic. In theory, you can easily switch from any cloud platform to another as well as run on your own infrastructure. In practice, when you choose a platform provider you often want to utilize and benefit from their specific services that will require some work to migrate to a different provider or on-prem.
GKE On-Prem provides tools, integrations, and access to help unify the experience and treat on-prem clusters as if they run in the cloud. It’s not fully transparent (and it shouldn’t be), but it helps.
Microsoft Azure AKS (Azure Kubernetes Service)
Microsoft Azure originally had a solution called ACS that supported Apache Mesos, Kubernetes, and Docker Swarm. But, in October 2017 it introduced AKS as a dedicated Kubernetes hosting service and the other options fizzled out.
AKS is very similar to GKE. It also manages a Kubernetes cluster for you free of charge. It is also certified by CNCF as Kubernetes conformant (no custom hacks). Microsoft invested a lot in Kubernetes in general and AKS in particular. There is strong integration with Active Directory for authentication and authorization, integrated monitoring and logging, and Azure storage. You also get a built-in container registry, networking, and GPU-enabled nodes.
One of the most interesting features of AKS is its usage of the virtual-kubelet project to integrate with ACI (Azure Container Instances). ACI takes away the need to provision nodes for your cluster, which is a huge burden if you’re dealing with a highly variable load.
There are several criticisms of AKS, particularly when compared to the gold standard of GKE. Setting up a cluster on AKS takes a long time (20 minutes on average) and the startup time has high volatility (more than an hour on rare occasions).
The developer experience is relatively poor. You need some combination of a web UI (Azure Portal Manager), PowerShell, and plain CLI to provision and set everything up.
AKS is getting better, though. For example, monitoring was infamously difficult to set up and recently AKS introduced the Azure Monitor for Containers that:
- Lets you know which containers are running on each node and their average CPU and memory utilization
- Lets you know which containers reside in a controller or a pod (this can help you monitor overall performance)
- Lets you review the resource utilization of workloads running on the host that are unrelated to the standard processes that support the pod
- Helps you understand the behavior of the cluster under average and heaviest loads. This knowledge can help you identify capacity needs and determine the maximum load that the cluster can sustain
Amazon AWS EKS (Elastic Kubernetes Service)
Amazon was a bit of a Johnny-come-lately to the Kubernetes scene. It always had its own ECS (Elastic Container Service) container orchestration platform. But customer demand for Kubernetes was overwhelming. Many organizations ran their Kubernetes clusters on EC2 using Kops or similar. AWS decided to provide proper support with official integrations. EKS today integrates with IAM for identity management, AWS load balancers, networking, and various storage options.
An interesting twist is the promised integration with Fargate (similar to AKS + ACI). This will eliminate the need to provision worker nodes and potentially let Kubernetes automatically scale up and down via its HPA and VPA capabilities for a truly elastic experience.
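To make the scaling behavior concrete, here is a minimal sketch of the basic rule the Horizontal Pod Autoscaler (HPA) documents for choosing a replica count: the current replica count is scaled by the ratio of the observed metric to the target metric and rounded up. The function name and the min_replicas and max_replicas defaults are illustrative assumptions. The real controller also applies a tolerance band and stabilization windows, which are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Basic HPA rule: ceil(current_replicas * current_metric / target_metric),
    clamped to the [min_replicas, max_replicas] range."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target: scale up
    print(desired_replicas(4, 90, 60))
    # 4 pods averaging 30% CPU against a 60% target: scale down
    print(desired_replicas(4, 30, 60))
```

With worker nodes that must be provisioned ahead of time, a scale-up can stall until new nodes join the cluster, which is why removing node provisioning matters for elasticity.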
Note that on EKS you have to pay for the managed control plane. If you just want to play around and experiment with Kubernetes or have lots of small clusters that might be a limiting factor.
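A back-of-the-envelope sketch shows why the number of clusters matters. The hourly fee below is a made-up placeholder, not a published price, so plug in your provider’s current rate. The function name and the 730-hour month are likewise assumptions for illustration.

```python
HOURS_PER_MONTH = 730  # average month: 8760 hours per year / 12


def monthly_control_plane_cost(num_clusters, hourly_fee_per_cluster):
    """Fixed control-plane cost per month, before any worker nodes are added."""
    return num_clusters * hourly_fee_per_cluster * HOURS_PER_MONTH


if __name__ == "__main__":
    placeholder_fee = 0.5  # hypothetical fee per cluster-hour, NOT a real price
    for clusters in (1, 10, 50):
        print(clusters, monthly_control_plane_cost(clusters, placeholder_fee))
```

The cost grows linearly with the number of clusters regardless of how small each one is, whereas a provider with a free control plane charges nothing for this term.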
As far as performance goes, EKS is somewhere in the middle. It takes 10–15 minutes to start a cluster. Of course, the performance of complex distributed systems is very nuanced and can’t be captured by a single metric. That said, EKS itself is still relatively new and it might take a while until it is able to take full advantage of the robust foundation of AWS.
If you can afford to wait I’m sure EKS will improve quickly. If you want to get up and running now, you may prefer to deploy Kubernetes directly on EC2 using Kops. If you choose this route you can even utilize Fargate as demonstrated here.
Kubernetes won the container orchestration wars. The big question for you is where you should run it. Usually, the answer is simple. If you’re already running on one of the cloud providers just migrate your system to Kubernetes on that cloud platform. If you have specialized needs and run on your own hardware, you can run your own Kubernetes cluster or take the opportunity of such a big infrastructure migration project and move to the cloud already.