☸️ Kubernetes Concepts

The container

Before even presenting Kubernetes, it's best to start from the very basics: the container.

A container is a lightweight, semi-isolated environment that has everything it needs to run a specific process. While it shares the host machine’s Linux kernel, it operates with its own virtual filesystem, network stack, and user space.

This isolation is made possible by two key Linux kernel features (both illustrated in the sketch after this list):

  • Namespaces – These isolate resources like process trees, networking, and mount points between containers.

  • Cgroups – Short for “control groups,” these limit and monitor the amount of CPU, memory, and other system resources each container can use.
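You can poke at both features directly from a shell. Here is a minimal sketch, assuming a Linux host with util-linux installed and cgroup v2 mounted (the demo cgroup name is made up for the example):

shell
# Namespaces: run `ps` in a fresh PID namespace.
# --fork forks before exec; --mount-proc mounts a private /proc.
# Inside, `ps` only sees its own tiny process tree.
sudo unshare --pid --fork --mount-proc ps aux

# Cgroups (v2): cap a shell at ~50% of one CPU.
sudo mkdir /sys/fs/cgroup/demo
echo "50000 100000" | sudo tee /sys/fs/cgroup/demo/cpu.max  # 50ms quota per 100ms period
echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs         # move the current shell in

A container runtime does exactly this, plus mount, network, and user namespaces, every time it starts a container.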

VM vs Docker

A well-known piece of software that combines all of these features is Docker, a container runtime that builds, manages, and runs containers.

Declarative deployment using Docker Compose

Initially, deploying a container looked like this:

shell
# run: Deploy a container
# -d: (detach) run in the background
# --name myapp: Give the container a name
# -p 80:80: Map port 80 on the host (left side) to port 80 in the container (right side).
# myapp: The name of the image.
docker run -d --name myapp -p 80:80 myapp

But there’s a catch: if you ever need to redeploy the container, you have to remember and rerun the exact same command. And unless it was written down somewhere, it could be lost or inconsistent between environments.

To address this, engineers started using Docker Compose. It wraps those docker commands into a single declarative YAML file, docker-compose.yaml, which you can commit to version control and reuse consistently.

yaml: docker-compose.yaml
services:
  myapp:
    image: myapp
    ports:
      - '80:80'

This made container deployment more reproducible. But it didn’t solve everything.

While Docker Compose helps organize and deploy containers, it starts to show its limits in more complex or production-like scenarios:

  • Scaling is difficult:

    • You have to manually replicate containers.

    • Volumes are tightly bound to the host machine.

    • Load balancing has to be configured separately.

  • Upgrades are risky:

    • Replacing a container usually means stopping the running one first.

    • If the new container fails to start correctly, users experience downtime: healthy traffic is interrupted before the new instance is checked.

    • No automatic rollback if something goes wrong.

  • Single Points of Failure:

    • No native failover: Docker’s data (volumes, images) isn’t replicated.

    • No container redundancy or high availability.

    • Network and storage are tightly coupled to a single host.

In short, every layer of a Compose-based stack is a potential failure point. This led to serious challenges for our SaaS offering, and even more so for power users of our Self-Hosted deployments.

Introducing Kubernetes to solve availability

If Docker is the brain behind container deployment on a single machine, then Kubernetes is the brain coordinating deployment across many machines.

Kubernetes Architecture

Kubernetes isn't very different from Docker and Docker Compose. In fact, you can think of Kubernetes as a superset of them: it builds on their ideas but extends them to support scaling, resiliency, and high availability.

Instead of using docker-compose.yaml, Kubernetes introduces the concept of a Pod, which is defined in a YAML file:

yaml: pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: myapp
      ports:
        - containerPort: 80
          hostPort: 80

Functionally speaking, a Pod offers the same capabilities as a Docker Compose deployment. And because of that shared design, it also shares the same core limitations: no built-in replication, no safe upgrades, and no failover mechanism.

To solve that, Kubernetes wraps Pods in another object: the Deployment.

yaml: deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 1
  selector: # selector is used to find the Pod
    matchLabels:
      app: myapp # Must match the pod's labels
  template: # template is a Pod, as written above.
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp
          ports:
            - containerPort: 80

Deployments immediately offer:

  • Safe rollout strategies: Old Pods are only terminated once the new ones are confirmed healthy.

  • Replication: Easily scale up by running multiple instances (replicas) of the Pod, even across different machines (see the commands below).
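Once the Deployment above is applied, scaling and upgrading are one-liners (a sketch; myapp:v2 is a hypothetical image tag):

shell
# Scale out to 3 replicas:
kubectl scale deployment myapp --replicas=3

# Trigger a rolling update by changing the image, then watch it progress:
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp

# Roll back to the previous revision if the new version misbehaves:
kubectl rollout undo deployment/myapp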

But to handle traffic across those replicas, Kubernetes needs a way to distribute incoming connections. Since you can’t bind the same port on multiple containers on the same host, Kubernetes introduces the concept of a Service, which acts as a built-in Layer 4 (TCP/UDP) load balancer.

yaml: service.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  type: ClusterIP
  selector: # selector is used to find the Pod
    app: myapp # Must match the pod's labels
  ports:
    - protocol: TCP
      port: 80 # Port exposed by the service
      targetPort: 80 # Must match the container's port

More precisely, a Service routes traffic to the healthy replicas of the Pod using well-known Linux technologies: iptables (the Linux firewall) and veth (virtual Ethernet devices).
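You can check which Pod IPs currently sit behind a Service, a quick sanity check assuming the Service above is applied:

shell
# List the healthy Pod IPs the Service load balances across:
kubectl get endpoints myapp

# Inside the cluster, the Service is also reachable by name through DNS,
# e.g. http://myapp:80 from any Pod.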

As you can see, Kubernetes doesn’t reinvent the wheel. It simply builds on top of proven, existing tools, wrapping them into a system that helps you scale out your containers reliably and safely.

And compared to full-blown virtualization platforms like OpenStack or Proxmox, Kubernetes is relatively lightweight and easier to adopt, while solving production-grade problems around upgrades, networking, and high availability.

So the question is: why does it feel harder to use Kubernetes than Docker Compose?

Well, there is real cognitive overhead. Kubernetes introduces a lot of concepts, but not all of them are essential to get started.

Example of a Postgres deployment. Many components, but we actually only need to configure the database.

That’s a lot of YAML just to get a database running... even though all you really want is to set a few configuration values.

In Docker, this reduces to:

shell
# run: Deploy a container
# -d: (detach) run in the background
# -e KEY=VALUE: set an environment variable
# -v /host:/container: bind a host directory to a container directory
docker run -d \
	--name some-postgres \
	-e POSTGRES_PASSWORD=mysecretpassword \
	-e PGDATA=/var/lib/postgresql/data/pgdata \
	-v /custom/mount:/var/lib/postgresql/data \
	postgres

Introducing Helm

To reduce the cognitive overhead of managing raw Kubernetes YAML, we use Helm, a templating engine and package manager for Kubernetes.

Helm works by combining:

  • Template files that generate Kubernetes manifests behind the scenes (see the sketch after this list).

  • A central values.yaml file that defines configurable defaults.

  • An optional override mechanism, using either patched YAML or inline flags.
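To make this concrete, here is a simplified, hypothetical template, not taken from any real chart. Helm fills the {{ ... }} placeholders from values.yaml (or your overrides) to produce a regular manifest:

yaml: templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }} # default defined in values.yaml
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"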

How Helm is used

This abstraction allows you to deploy complex apps without drowning in dozens of YAML files. It also lets maintainers expose only the necessary knobs. No need to manually wire every port, label, or secret.

For example, to deploy PostgreSQL, you can simply run:

shell
# postgresql: name of the deployment
# oci://registry-1.docker.io/bitnamicharts/postgresql: The Helm chart to deploy, hosted in a Container registry
helm install postgresql oci://registry-1.docker.io/bitnamicharts/postgresql

It's that simple! And if you want to customize the PostgreSQL password, just create a values.yaml file:

yaml: values.yaml
# Full list of configurable options:
# https://github.com/bitnami/charts/blob/main/bitnami/postgresql/values.yaml
global:
  postgresql:
    auth:
      postgresPassword: mysecretpassword

Then apply it like this:

shell
helm install postgresql oci://registry-1.docker.io/bitnamicharts/postgresql --values values.yaml

Or, if you're in a hurry:

shell
helm install postgresql oci://registry-1.docker.io/bitnamicharts/postgresql \
  --set global.postgresql.auth.postgresPassword=mysecretpassword

Helm drastically reduces boilerplate and repetition. It gives us, at Toucan, a way to offer clear, focused customization instructions while still allowing you the freedom to tailor the deployment to your environment and needs.

Helm makes deploying and configuring Kubernetes applications much easier, and that’s exactly how we’ll deploy Toucan!

But before we do that, it’s worth taking a moment to get familiar with a few essential Kubernetes concepts: the stuff you’ll actually touch or maintain when running Toucan in production.

What you actually need to know about Kubernetes

The official Kubernetes documentation offers in-depth explanations of its architecture and components. However, a full understanding of Kubernetes internals is not required to self-host Toucan.

As with Docker Compose, it is not necessary to understand the underlying networking stack to deploy a working setup. What matters is knowing how to configure the relevant components.
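In practice, a handful of kubectl commands covers most day-to-day needs (<pod-name> is a placeholder for your actual Pod):

shell
kubectl get pods                  # list running Pods and their status
kubectl logs <pod-name>           # read a container's logs
kubectl describe pod <pod-name>   # inspect events, restarts, and configuration
kubectl apply -f manifest.yaml    # create or update resources declaratively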

Common Concepts

If you're comfortable with Docker Compose, you're already 90% of the way to understanding how to use a Kubernetes Pod. And, since we are abstracting the configuration with Helm, you shouldn't need to go too far in depth.

However, if you ever need to learn more, here are the common concepts:

  • Pod: Basic execution unit in Kubernetes. Represents one or more containers with shared storage and network. Roughly equivalent to a service in Docker Compose.

  • Deployment: Manages stateless Pods, including rolling updates and replica scaling.

  • StatefulSet: Manages stateful Pods with persistent identity and stable storage.

  • Service: Provides a stable network endpoint and load balances traffic to a set of Pods.

  • Ingress: Manages HTTP(S) routing to Services based on hostname and path rules. It's actually one of the "high-level" abstractions of Kubernetes.

  • Persistent Volume (PV): Represents storage resources in the cluster. Usually provisioned dynamically, so you don't actually need to worry about it.

  • Persistent Volume Claim (PVC): A request for storage by a Pod; typically used to bind to a PV.

  • ConfigMap/Secret: Used for injecting configuration and sensitive data into Pods at runtime. Since volumes are provisioned dynamically, you can't pre-load a configuration file into a PV; instead, you rely on ConfigMaps and Secrets, as shown in the sketch after this list.
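For instance, here is a minimal sketch of a ConfigMap mounted into a Pod as a file (the names and the app.conf content are made up for the example):

yaml: configmap-example.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  app.conf: |
    log_level = info
---
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: myapp
      volumeMounts:
        - name: config
          mountPath: /etc/myapp # app.conf appears as /etc/myapp/app.conf
  volumes:
    - name: config
      configMap:
        name: myapp-config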

If you ever need help visualizing a deployment, feel free to check out Headlamp. The Postgres diagram above was generated with Headlamp.

Maintenance Overview

Unless Kubernetes is installed manually using tools like kubeadm, most distributions handle system-level maintenance tasks such as certificate rotation, updates, and basic network configuration.

Distributions such as k3s, k0s, Rancher, and Talos Linux provide automation around cluster management and have proven to be suitable for production deployments.

The only maintenance that has to be set up manually is backups. Thankfully, Kubernetes stores all cluster state in a single backend, making it straightforward to back up and restore.

If you opt for a single-node control plane, simply look for the SQLite database that holds the whole Kubernetes state. For example, with k3s, export the directory /var/lib/rancher/k3s/server/db/ and the file /var/lib/rancher/k3s/server/token.
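A minimal sketch of such a backup, assuming a systemd-managed k3s install (stopping the service first keeps the SQLite database consistent):

shell
# Stop k3s so the state database isn't written to mid-copy:
sudo systemctl stop k3s
sudo tar czf k3s-backup.tar.gz \
  /var/lib/rancher/k3s/server/db/ \
  /var/lib/rancher/k3s/server/token
sudo systemctl start k3s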

If you opt for a multi-node control plane, the "brain" is distributed across the nodes using etcd, a distributed, strongly consistent key-value database. Thanks to that strongly consistent design, a backup also takes the form of copying the data directory. etcd even offers a simple tool for this: etcdctl snapshot save snapshot.db.

Kubernetes has a single source of truth! Moving or restoring these backups restores the whole infrastructure.

Single-Node Kubernetes

Tools like k3s, a lightweight Kubernetes distribution, demonstrate that Kubernetes can be efficient and manageable on a single node. With k3s, you can deploy a production-grade Kubernetes cluster on a modest machine, like a Raspberry Pi.

In such cases, SQLite is used as the backing store, reducing operational complexity while retaining the benefits of Kubernetes features such as declarative configuration, self-healing workloads, and Helm-based deployment.

A single-node Kubernetes cluster. Even safer than Docker, since there is no SSH port open.
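If you want to try it, k3s installs with its documented one-line installer:

shell
# Installs k3s and sets it up as a systemd service (from https://k3s.io):
curl -sfL https://get.k3s.io | sh -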

For development, Minikube enables running a full-featured local Kubernetes cluster using Docker as both the container runtime and virtualization driver. This allows developers to replicate production-like environments with minimal effort.
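Spinning up such a local cluster takes one command, assuming Docker is already installed:

shell
# Start a local single-node cluster with Docker as the driver:
minikube start --driver=docker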

What's next

Now that you've learned about the benefits of Kubernetes and Helm, you're ready to deploy Toucan on your Kubernetes cluster!

But before that, let's check if you have the prerequisites.

