I reviewed the basic setup for building applications in Kubernetes in part 1 of this blog series. In this post, I’ll explain how to use pods and controllers to create scalable processes for managing your applications.
Processes as Pods & Controllers in Kubernetes
The heart of any application is its running processes, and in Kubernetes we fundamentally create processes as pods. Pods are a bit fancier than individual containers, in that they can hold whole groups of containers that are scheduled together on a single host, which brings us to our first decision point:
Decision #1: How should our processes be arranged into pods?
The original idea behind a pod was to emulate a logical host – not unlike a VM. The containers in a pod will always be scheduled on the same Kubernetes node, and they’ll be able to communicate with each other via localhost, making pods a good representation of clusters of processes that need to work together closely.
A pod can contain one or more containers, but containers in the pod must scale together.
But there’s an important consideration: it’s not possible to scale individual containers in a pod separately from each other. If you need to scale your application up, you have to add more pods, which come with copies of every container they include. Which application components scale at similar rates, which do not, and which should reside on the same host all factor into how you arrange processes in pods.
Thinking about our web app, we might start by making a pod containing only the frontend container; we want to be able to scale this frontend independently from the rest of our application, so it should live in its own pod.
On the other hand, we might design another pod that has one container each for our database and API; this way, our API is guaranteed to be able to talk to our database on the same node, avoiding cross-node network latency between the API and database. As noted, this comes at the expense of independent scaling; if we schedule our API and database containers in the same pod, every time we want a new instance of our database container, it’s going to come with a new instance of our API.
Case-specific arguments can be made for or against this choice: Is API-to-database latency really expected to be a major bottleneck? Could it be more important to scale your API and database separately? Final decisions may vary, but the same decision points can be applied generically to many applications.
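As a rough illustration of the trade-off above, here is a minimal Python sketch that builds the two pod definitions as plain dictionaries mirroring the Kubernetes Pod schema. The image names, ports, and the helper functions `container` and `containers_after_scaling` are hypothetical placeholders, not part of the original application.

```python
import json


def container(name, image, port):
    """Return a minimal container spec (name, image, and port are placeholders)."""
    return {"name": name, "image": image, "ports": [{"containerPort": port}]}


# The frontend lives alone in its pod so it can be scaled independently.
frontend_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "frontend", "labels": {"app": "frontend"}},
    "spec": {"containers": [container("frontend", "example/frontend:1.0", 8080)]},
}

# The API and database share one pod: same node, reachable over localhost,
# but always scaled together.
backend_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "backend", "labels": {"app": "backend"}},
    "spec": {
        "containers": [
            container("api", "example/api:1.0", 8000),
            container("db", "example/db:1.0", 5432),
        ]
    },
}


def containers_after_scaling(pod, copies):
    """Count how many instances of each container exist when `copies` pods run."""
    return {c["name"]: copies for c in pod["spec"]["containers"]}


print(json.dumps(backend_pod, indent=2))
print(containers_after_scaling(backend_pod, 3))
```

Running three copies of the backend pod yields three API containers and three database containers, which is exactly the coupling the decision point asks you to weigh.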
Now that we have our pods planned out (one for the frontend and one for the API-plus-database combo), we need to decide how to manage these pods. We virtually never want to schedule pods directly (called ‘bare pods’); we want to take advantage of Kubernetes controllers, which will automatically reschedule failed pods, give us some simple influence on how and where our pods are scheduled, and give us control over how those pods are updated and maintained. There are at least two main types of controllers we need to decide between:
Decision #2: What kind of controller should we use for each pod: a deployment or a daemonset?
- Deployments are the most common kind of controller, typically the best choice for stateless pods which can be scheduled anywhere resources are available.
- DaemonSets are appropriate for pods meant to run one per host; these are typically used for daemon-like processes, like log aggregators, filesystem managers, system monitors or other utilities that make sense to have exactly one of on every host in your Kubernetes cluster.
Most, but not all, pods are best scheduled by one of these two controllers, and of them deployments make up the large majority. Since neither of our web app components makes sense as a cluster-wide daemon, we would schedule both of them as deployments. If later we wanted to deploy a logging or monitoring appliance, a DaemonSet would be a common pattern to ensure it runs on every node in the cluster.
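To make the difference between the two controllers concrete, the following Python sketch wraps a pod template in either a Deployment or a DaemonSet, using the `apps/v1` API group. The helper names, image names, and the `log-agent` example are assumptions chosen for illustration only.

```python
import json


def pod_template(app, containers):
    """Pod template shared by both controller kinds; labels must match the selector."""
    return {"metadata": {"labels": {"app": app}}, "spec": {"containers": containers}}


def deployment(app, replicas, containers):
    """A Deployment asks for a fixed number of pod replicas, placed wherever they fit."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": app}},
            "template": pod_template(app, containers),
        },
    }


def daemonset(app, containers):
    """A DaemonSet has no replica count: it runs one pod per eligible node."""
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": app},
        "spec": {
            "selector": {"matchLabels": {"app": app}},
            "template": pod_template(app, containers),
        },
    }


# Hypothetical images; both web app components become Deployments.
frontend = deployment("frontend", 3, [{"name": "frontend", "image": "example/frontend:1.0"}])
backend = deployment(
    "backend",
    2,
    [
        {"name": "api", "image": "example/api:1.0"},
        {"name": "db", "image": "example/db:1.0"},
    ],
)
# A log collector is the kind of workload that would fit a DaemonSet instead.
log_agent = daemonset("log-agent", [{"name": "agent", "image": "example/log-agent:1.0"}])

for manifest in (frontend, backend, log_agent):
    print(json.dumps(manifest, indent=2))
```

The only structural difference in this sketch is the `replicas` field: the Deployment states how many pods to run, while the DaemonSet leaves the count to the number of nodes.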
Now that we’ve decided on how to arrange our containers into pods and how to manage our pods using controllers, it’s time to write some Kubernetes yaml to capture these objects; many examples of how to do this are available in the Kubernetes documentation and Docker’s Training material.
I strongly encourage you to define your applications using Kubernetes yaml definitions, and not imperative kubectl commands. As I mentioned in the first post, one of the most important aspects of orchestrating a containerized application is shareability, and it is much easier to share a yaml file you can check into version control and distribute, rather than a series of CLI commands that can quickly become hard to read and hard to keep track of.
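One possible way to keep definitions declarative and shareable is to generate a single manifest file and check it into version control. The sketch below writes manifests to a JSON file (which `kubectl` accepts alongside yaml) wrapped in a `List` object; the file name `webapp.json`, the helper `write_manifests`, and the image name are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_manifests(path, manifests):
    """Bundle manifests into a v1 List and write them as stable, diff-friendly JSON."""
    bundle = {"apiVersion": "v1", "kind": "List", "items": list(manifests)}
    # sort_keys keeps the output stable so version-control diffs stay small.
    Path(path).write_text(json.dumps(bundle, indent=2, sort_keys=True) + "\n")
    return bundle


# A minimal, hypothetical frontend Deployment.
frontend = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "frontend"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "frontend"}},
        "template": {
            "metadata": {"labels": {"app": "frontend"}},
            "spec": {"containers": [{"name": "frontend", "image": "example/frontend:1.0"}]},
        },
    },
}

# A temporary directory stands in for a project repository here.
manifest_path = Path(tempfile.mkdtemp()) / "webapp.json"
write_manifests(manifest_path, [frontend])
print(manifest_path)
```

A file like this could then be applied with `kubectl apply -f webapp.json`, and anyone with the repository gets the same definition without replaying a series of CLI commands.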
Checkpoint #2: write Kubernetes yaml to describe your controllers and pods.
Once you have that yaml in hand, now’s a good time to create your deployments and make sure they all work as expected: individual containers in pods should run without crashing, and containers inside the same pod should be able to reach each other on localhost.
Once you’ve mastered the pods, deployments, and daemonSets mentioned above, there are a few deeper topics you can approach to enhance your Kube applications even further:
- StatefulSets are another kind of controller appropriate for managing stateful pods; note these require an understanding of Kube services and storage (discussed below).
- Scheduling affinity rules allow you to influence and control where your pods are scheduled in a cluster, useful for sophisticated operations in larger clusters.
- Healthchecks in the form of livenessProbes are an important maintenance tool for your pods and containers; they tell Kube how to automatically monitor the health of your containers and take action when they become unhealthy.
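- PodSecurityPolicy definitions allow an added layer of security for cluster administrators to control exactly who can schedule pods and how, commonly used to prevent the creation of pods with elevated or root privileges. Note that PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25, with Pod Security Admission as its replacement.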
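Two of the topics in the list above, liveness probes and scheduling affinity, can be sketched as small additions to a pod spec. The Python below is a minimal illustration; the helper names, the `/healthz` path, the port, and the timing values are assumptions rather than recommendations.

```python
import copy
import json


def with_liveness_probe(container, path, port):
    """Return a copy of the container with an HTTP liveness probe attached."""
    probed = copy.deepcopy(container)
    probed["livenessProbe"] = {
        "httpGet": {"path": path, "port": port},
        "initialDelaySeconds": 5,  # assumed startup allowance
        "periodSeconds": 10,  # assumed probe interval
    }
    return probed


def prefer_spreading(pod_spec, app):
    """Return a copy of the pod spec that prefers nodes not already running `app`."""
    spread = copy.deepcopy(pod_spec)
    spread["affinity"] = {
        "podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": 100,
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": {"app": app}},
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }
    return spread


# Hypothetical frontend container and pod spec.
frontend = {"name": "frontend", "image": "example/frontend:1.0"}
pod_spec = {"containers": [with_liveness_probe(frontend, "/healthz", 8080)]}
pod_spec = prefer_spreading(pod_spec, "frontend")

print(json.dumps(pod_spec, indent=2))
```

With a probe like this, the kubelet restarts the container when the HTTP check keeps failing, and the anti-affinity preference nudges the scheduler to place frontend pods on different nodes when it can.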
To learn more about Kubernetes pods and controllers, read the Kubernetes documentation.
You can also check out Play with Kubernetes, powered by Docker.
We will also be offering training on Kubernetes starting in early 2020. In the training, we’ll provide more specific examples and hands-on exercises.
This syndicated content is provided by Docker and was originally posted at https://blog.docker.com/2019/09/designing-your-first-application-kubernetes-processes-part2/