How the CSI (Container Storage Interface) Works

If you work with persistent storage in Kubernetes, maybe you've seen articles about how to migrate from in-tree to CSI volumes, but aren't sure what all the fuss is about? Or perhaps you're trying to debug a stuck VolumeAttachment that won't unmount from a node, holding up your important StatefulSet rollout? A clear understanding of what the Container Storage Interface (or CSI for short) is and how it works will give you confidence when dealing with persistent data in Kubernetes, allowing you to answer these questions and more!

The Container Storage Interface is an API specification that enables developers to build custom drivers which handle the provisioning, attaching, and mounting of volumes in containerized workloads. As long as a driver correctly implements the CSI API spec, it can be used in any supported Container Orchestration system, like Kubernetes. This decouples persistent storage development efforts from core cluster management tooling, allowing for the rapid development and iteration of storage drivers across the cloud native ecosystem.

In Kubernetes, the CSI has replaced legacy in-tree volumes with a more flexible means of managing storage media. Previously, taking advantage of a new storage type meant upgrading an entire cluster's Kubernetes version just to pick up the new PersistentVolume API fields for it. Now, with the plethora of independent CSI drivers available, you can add any type of underlying storage to your cluster without waiting on a Kubernetes release, as long as there's a driver for it.
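
For example, once a driver like the AWS EBS CSI driver is installed, adding a new class of storage is just a matter of creating a StorageClass that points at that driver's provisioner. Here's a rough sketch (the class name and the gp3 parameter are illustrative):

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storage-class
provisioner: ebs.csi.aws.com   # the CSI driver that will provision volumes of this class
parameters:
  type: gp3                    # driver-specific options are passed through as parameters
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Swap the provisioner (and its parameters) for another driver's, and the rest of your manifests stay the same.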

But what if existing drivers don't provide the features that you require and you want to build a new custom driver? Maybe you're concerned about the ramifications of migrating from in-tree to CSI volumes? Or, you simply want to learn more about how persistent storage works in Kubernetes? Well, you're in the right place! This article will describe what the CSI is and detail how it's implemented in Kubernetes.

It's APIs All the Way Down

Like many things in the Kubernetes ecosystem, the Container Storage Interface is actually just an API specification. In the container-storage-interface/spec GitHub repo, you can find this spec in 2 different forms:

  1. A protobuf file that defines the API schema in gRPC terms
  2. A markdown file that describes the overall system architecture and goes into detail about each API call

What I'm going to discuss in this section is an abridged version of that markdown file, while borrowing some nice ASCII diagrams from the repo itself!

Architecture

A CSI Driver has 2 components, a Node Plugin and a Controller Plugin. The Controller Plugin is responsible for high-level volume management: creating, deleting, attaching, detaching, snapshotting, and restoring physical (or virtualized) volumes. If you're using a driver built for a cloud provider, like EBS on AWS, the driver's Controller Plugin communicates with AWS HTTPS APIs to perform these operations. For other storage types like NFS, iSCSI, ZFS, and more, the driver sends these requests to the underlying storage's API endpoint, in whatever format that API accepts.

On the other hand, the Node Plugin is responsible for formatting and mounting a volume once it's been attached to a node. These low-level operations usually require privileged access, so the Node Plugin is installed on every node in your cluster's data plane, wherever a volume could be mounted.

The Node Plugin is also responsible for reporting metrics like disk usage back to the Container Orchestration system (referred to as the "CO" in the spec). As you might have guessed already, I'll be using Kubernetes as the CO in this post! But what makes the spec so powerful is that it can be used by any container orchestration system, like Nomad for example, as long as it abides by the contract set by the API guidelines.

The specification doc provides a few possible deployment patterns, so let's start with the most common one.

                             CO "Master" Host
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    | Controller |  |
|  |            +----------->   Plugin   |  |
|  +------------+           +------------+  |
|                                           |
+-------------------------------------------+

                            CO "Node" Host(s)
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    |    Node    |  |
|  |            +----------->   Plugin   |  |
|  +------------+           +------------+  |
|                                           |
+-------------------------------------------+

Figure 1: The Plugin runs on all nodes in the cluster: a centralized
Controller Plugin is available on the CO master host and the Node
Plugin is available on all of the CO Nodes.

Since the Controller Plugin is concerned with higher-level volume operations, it does not need to run on a host in your cluster's data plane. For example, in AWS, the Controller makes AWS API calls like ec2:CreateVolume, ec2:AttachVolume, or ec2:CreateSnapshot to manage EBS volumes. These functions can be run anywhere, as long as the caller is authenticated with AWS. All the CO needs is to be able to send messages to the plugin over gRPC. So in this architecture, the Controller Plugin is running on a "master" host in the cluster's control plane.

On the other hand, the Node Plugin must be running on a host in the cluster's data plane. Once the Controller Plugin has done its job by attaching a volume to a node for a workload to use, the Node Plugin (running on that node) will take over by mounting the volume to a well-known path and optionally formatting it. At this point, the CO is free to use that path as a volume mount when creating a new containerized process; so all data on that mount will be stored on the underlying volume that was attached by the Controller Plugin. It's important to note that the Container Orchestrator, not the Controller Plugin, is responsible for letting the Node Plugin know that it should perform the mount.

Volume Lifecycle

The spec provides a flowchart of basic volume operations, also in the form of a cool ASCII diagram:

   CreateVolume +------------+ DeleteVolume
 +------------->|  CREATED   +--------------+
 |              +---+----^---+              |
 |       Controller |    | Controller       v
+++         Publish |    | Unpublish       +++
|X|          Volume |    | Volume          | |
+-+             +---v----+---+             +-+
                | NODE_READY |
                +---+----^---+
               Node |    | Node
            Publish |    | Unpublish
             Volume |    | Volume
                +---v----+---+
                | PUBLISHED  |
                +------------+

Figure 5: The lifecycle of a dynamically provisioned volume, from
creation to destruction.

Mounting a volume is a sequential process: each step requires the previous one to have completed successfully. For example, if a volume does not exist, how could we possibly attach it to a node?

When publishing (mounting) a volume for use by a workload, the Node Plugin first requires that the Controller Plugin has successfully published a volume at a directory that it can access. In practice, this usually means that the Controller Plugin has created the volume and attached it to a node. Now that the volume is attached, it's time for the Node Plugin to do its job. At this point, the Node Plugin can access the volume at its device path to create a filesystem and mount it to a directory. Once it's mounted, the volume is considered to be published and it is ready for a containerized process to use. This ends the CSI mounting workflow.

Continuing the AWS example, when the Controller Plugin publishes a volume, it calls ec2:CreateVolume followed by ec2:AttachVolume. These two API calls allocate the underlying storage by creating an EBS volume and attaching it to a particular instance. Once the volume is attached to the EC2 instance, the Node Plugin is free to format it and create a mount point on its host's filesystem.

Here is an annotated version of the above volume lifecycle diagram, this time with the AWS calls included in the flow chart.

   CreateVolume +------------+ DeleteVolume
 +------------->|  CREATED   +--------------+
 |              +---+----^---+              |
 |       Controller |    | Controller       v
+++         Publish |    | Unpublish       +++
|X|          Volume |    | Volume          | |
+-+                 |    |                 +-+
                    |    |
 <ec2:CreateVolume> |    | <ec2:DeleteVolume>
                    |    |
 <ec2:AttachVolume> |    | <ec2:DetachVolume>
                    |    |
                +---v----+---+
                | NODE_READY |
                +---+----^---+
               Node |    | Node
            Publish |    | Unpublish
             Volume |    | Volume
                +---v----+---+
                | PUBLISHED  |
                +------------+

If a Controller wants to delete a volume, it must first wait for the Node Plugin to safely unmount the volume to preserve data and system integrity. Otherwise, if a volume is forcibly detached from a node before it's unmounted, we could experience bad things like data corruption. Once the volume is safely unpublished (unmounted) by the Node Plugin, the Controller Plugin would then call ec2:DetachVolume to detach it from the node and finally ec2:DeleteVolume to delete it, assuming that you don't want to reuse the volume elsewhere.

What makes the CSI so powerful is that it does not prescribe how to publish a volume. As long as your driver correctly implements the required API methods defined in the CSI spec, it will be compatible with the CSI and by extension, be usable in COs like Kubernetes and Nomad.

Running CSI Drivers in Kubernetes

What I haven't entirely made clear yet is why the Controller and Node Plugins are called plugins in the first place! How does the Container Orchestrator call them, and what do they plug into?

Well, the answer depends on which Container Orchestrator you are using. Since I'm most familiar with Kubernetes, I'll be using it to demonstrate how a CSI driver interacts with a CO.

Deployment Model

Since the Node Plugin, responsible for low-level volume operations, must be running on every node in your data plane, it is typically installed using a DaemonSet. If you have heterogeneous nodes and only want to deploy the plugin to a subset of them, you can use node selectors, affinities, or anti-affinities to control which nodes receive a Node Plugin Pod. Since the Node Plugin requires root access to modify host volumes and mounts, these Pods will be running in privileged mode. In this mode, the Node Plugin can escape its container's security context to access the underlying node's filesystem when performing mounting and provisioning operations. Without these elevated permissions, the Node Plugin could only operate inside of its own containerized namespace without the system-level access that it requires to provision volumes on the node.
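
A stripped-down sketch of such a DaemonSet might look like the following; the image name is hypothetical, and the paths are based on the kubelet's conventional /var/lib/kubelet directory:

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-csi-node-plugin
spec:
  selector:
    matchLabels:
      app: my-csi-node-plugin
  template:
    metadata:
      labels:
        app: my-csi-node-plugin
    spec:
      containers:
      - name: csi-driver
        image: example.com/my-csi-driver:latest   # hypothetical driver image
        securityContext:
          privileged: true                        # needed to format and mount volumes on the host
        volumeMounts:
        - name: kubelet-dir
          mountPath: /var/lib/kubelet
          mountPropagation: Bidirectional          # mounts made in the container propagate back to the host
      volumes:
      - name: kubelet-dir
        hostPath:
          path: /var/lib/kubelet
          type: Directory

The Bidirectional mount propagation is what lets mounts created inside the plugin container become visible to the kubelet and, ultimately, to your workload Pods.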

The Controller Plugin is usually run in a Deployment because it deals with higher-level primitives like volumes and snapshots, which don't require filesystem access to every single node in the cluster. Again, let's think about the AWS example I used earlier. If the Controller Plugin is just making AWS API calls to manage volumes and snapshots, why would it need access to a node's root filesystem? Most Controller Plugins are stateless and highly available, both of which lend themselves to the Deployment model. The Controller also does not need to be run in a privileged context.

Event-Driven Sidecar Pattern

Now that we know how CSI plugins are deployed in a typical cluster, it's time to focus on how Kubernetes calls each plugin to perform CSI-related operations. A set of sidecar containers, each registered with the Kubernetes API server to react to different events across the cluster, is deployed alongside each Controller and Node Plugin. In a way, this is similar to the typical Kubernetes controller pattern, where controllers react to changes in cluster state and attempt to reconcile the current cluster state with the desired one.

There are currently 6 different sidecars that work alongside each CSI driver to perform specific volume-related operations. Each sidecar registers itself with the Kubernetes API server and watches for changes in a specific resource type. Once the sidecar has detected a change that it must act upon, it calls the relevant plugin with one or more API calls from the CSI specification to perform the desired operations.

Controller Plugin Sidecars

Here is a table of the sidecars that run alongside a Controller Plugin:

Sidecar Name         | K8s Resources Watched   | CSI API Endpoints Called
---------------------|-------------------------|-------------------------------
external-provisioner | PersistentVolumeClaim   | CreateVolume, DeleteVolume
external-attacher    | VolumeAttachment        | Controller(Un)PublishVolume
external-snapshotter | VolumeSnapshot(Content) | CreateSnapshot, DeleteSnapshot
external-resizer     | PersistentVolumeClaim   | ControllerExpandVolume

How do these sidecars work together? Let's use an example of a StatefulSet to demonstrate. In this example, we're dynamically provisioning our PersistentVolumes (PVs) instead of mapping PersistentVolumeClaims (PVCs) to existing PVs. We start at the creation of a new StatefulSet with a VolumeClaimTemplate.

---
apiVersion: apps/v1
kind: StatefulSet
spec:
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storage-class"
      resources:
        requests:
          storage: 1Gi

Creating this StatefulSet will trigger the creation of a new PVC based on the above template. Once the PVC has been created, the Kubernetes API will notify the external-provisioner sidecar that this new resource was created. The external-provisioner will then send a CreateVolume message to its neighbor Controller Plugin over gRPC. From here, the CSI driver's Controller Plugin takes over by processing the incoming gRPC message and will create a new volume based on its custom logic. In the AWS EBS driver, this would be an ec2:CreateVolume call.

At this point, the control flow moves to the built-in PersistentVolume controller, which will create a matching PV and bind it to the PVC. This allows the StatefulSet's underlying Pod to be scheduled and assigned to a Node.
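
To make that concrete, a dynamically provisioned PV created for the AWS EBS driver would look roughly like this; the names, namespace, and volume handle are all illustrative:

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-1b2c3d4e              # generated name; illustrative
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  storageClassName: my-storage-class
  persistentVolumeReclaimPolicy: Delete
  claimRef:
    namespace: default            # namespace and claim name are illustrative
    name: www-web-0               # PVC generated from the volumeClaimTemplate
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0123456789abcdef0   # the EBS volume ID returned by CreateVolume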

Here, the external-attacher sidecar takes over. It will be notified of the new VolumeAttachment object that Kubernetes creates once the Pod is scheduled, and it will call the Controller Plugin's ControllerPublishVolume endpoint to attach the volume to the StatefulSet's assigned node. This is the equivalent of ec2:AttachVolume in AWS.
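
This is also where the VolumeAttachment object from this article's introduction shows up. A hedged sketch of one (the names are illustrative):

---
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  name: csi-1a2b3c4d              # generated name; illustrative
spec:
  attacher: ebs.csi.aws.com       # the CSI driver responsible for the attach
  nodeName: ip-10-0-1-23.ec2.internal
  source:
    persistentVolumeName: pvc-1b2c3d4e
status:
  attached: true                  # set by the external-attacher once ControllerPublishVolume succeeds

When a volume appears stuck attaching or detaching, this is usually the first object worth inspecting.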

At this point, we have an EBS volume that is attached to an EC2 instance, all based on the creation of a StatefulSet, a PersistentVolumeClaim, and the work of the AWS EBS CSI Controller Plugin.

Node Plugin Sidecars

There is only one sidecar that is unique to the Node Plugin: the node-driver-registrar. This sidecar, running as part of a DaemonSet, registers the Node Plugin with each Node's kubelet. During the registration process, the Node Plugin informs the kubelet that it is able to mount volumes using the CSI driver that it is part of. The kubelet itself will then wait until a Pod is scheduled to its corresponding Node, at which point it is responsible for making the relevant CSI calls (NodeStageVolume and NodePublishVolume) to the Node Plugin over gRPC.
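
Once registration succeeds, the driver is listed in that Node's CSINode object, which is a quick way to confirm that the node-driver-registrar did its job. A rough example (node name, node ID, and topology key are illustrative):

---
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  name: ip-10-0-1-23.ec2.internal   # matches the Node's name
spec:
  drivers:
  - name: ebs.csi.aws.com
    nodeID: i-0123456789abcdef0     # how the Controller Plugin identifies this node
    topologyKeys:
    - topology.ebs.csi.aws.com/zone

Running kubectl get csinode <node-name> -o yaml against a live cluster shows you exactly which drivers a node knows about.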

Common Sidecars

There is also a livenessprobe sidecar that runs in both the Controller and Node Plugin Pods. It monitors the health of the CSI driver and reports back to the Kubernetes Liveness Probe mechanism.

Communication Over Sockets

How do these sidecars communicate with the Controller and Node Plugins? Over gRPC, through a shared socket! Each sidecar and plugin container mounts the same volume containing a single Unix socket, and all CSI traffic flows through it.

CSI Controller Deployment

This diagram highlights the pluggable nature of CSI Drivers. To replace one driver with another, all you have to do is swap the CSI Driver container for another and ensure that it's listening on the Unix socket that the sidecars are sending gRPC messages to. Because every driver advertises its own capabilities and communicates over the shared CSI API contract, it's literally a plug-and-play solution.
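
To make that concrete, here's a rough sketch of what a Controller Plugin Deployment with a single sidecar could look like; the driver image and its --endpoint flag are hypothetical, and real drivers ship their own manifests with more sidecars and flags:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-csi-controller-plugin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-csi-controller-plugin
  template:
    metadata:
      labels:
        app: my-csi-controller-plugin
    spec:
      containers:
      - name: csi-driver
        image: example.com/my-csi-driver:latest    # hypothetical driver image
        args:
        - --endpoint=unix:///csi/csi.sock          # where the driver listens for gRPC
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      - name: external-provisioner
        image: registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
        args:
        - --csi-address=/csi/csi.sock              # where the sidecar sends gRPC
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      volumes:
      - name: socket-dir
        emptyDir: {}

Both containers mount the same emptyDir, so the sidecar's gRPC messages land on the socket the driver is listening on; swapping the csi-driver container is all it takes to change drivers.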

Conclusion

In this article, I only covered the high-level concepts of the Container Storage Interface spec and its implementation in Kubernetes. While I hope it has provided a clearer understanding of what happens once you install a CSI driver, writing one requires significant low-level knowledge of both your nodes' operating system(s) and the underlying storage mechanism that your driver is implementing. Luckily, CSI drivers exist for a variety of cloud providers and distributed storage solutions, so it's likely that you can find a CSI driver that already fulfills your requirements. But it always helps to know what's happening under the hood in case your particular driver is misbehaving.

If this article interests you and you want to learn more about the topic, please let me know! I'm always happy to answer questions about CSI Drivers, Kubernetes Operators, and a myriad of other DevOps-related topics.

What Are Kubernetes Operators? (Operators 101: Part 1)

If you've used Kubernetes in just about any capacity over the past few years, you've almost certainly leaned on at least one operator to automate a piece of your infrastructure.

The operator model has been espoused by influencers like Viktor Farcic (aka @DevOps Toolkit on YouTube) and Kelsey Hightower for its advanced automation abilities. OperatorHub.io lists over 300 different operators for you to install in your clusters. Kubernetes operators seem to be everywhere!

But what exactly is an operator? The official documentation page about operators is actually fairly short and leaves a lot to the reader's imagination.

This is the start of a new series, Operators 101, where I will teach readers how to design, build, and deploy Kubernetes operators that can automate the management of your own unique applications.

For a real-world case study of the benefits that the operator pattern has over traditional k8s resource management techniques, check out this article that I wrote which discusses the benefits of moving my company's SaaS infrastructure to a Kubernetes operator.

The Operator Pattern

What makes Kubernetes operators so difficult to describe is that they are not just one discrete unit. Instead, an operator is actually composed of several distinct components!

When trying to explain what operators are to people who haven't dealt with them before, I like to describe them as new native Kubernetes resources that enable you to automate system components based on the resources' state and your own custom logic.

At a minimum, an operator consists of a Custom Resource Definition (which defines your custom object's schema) and a Controller that performs actions based on the state of Custom Resources created from your definition. I like to think of a Custom Resource Definition (or abbreviated CRD) as analogous to a class definition in Java, and a Custom Resource can be thought of as an instance of that class.
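
To stretch the analogy a little further, here's a minimal and entirely hypothetical example: a Database CRD (the class) and a matching Custom Resource (the instance). The group, kind, and fields are made up for illustration:

---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com      # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: Database
    plural: databases
    singular: database
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              engine:
                type: string
              replicas:
                type: integer
---
# An instance of the new type: the "object" to the CRD's "class"
apiVersion: example.com/v1alpha1
kind: Database
metadata:
  name: my-database
spec:
  engine: postgres
  replicas: 1

On its own, the CRD only teaches the API server about the new type; it's the Controller acting on Database objects that turns them into real automation.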

To provide a more concrete example of how these components work together, let's investigate how a commonly-used Kubernetes resource works under the hood and then relate this back to the operator pattern.

Analogy: How a Deployment Works

A Deployment's schema can be found in the Kubernetes API specification under the apps/v1 group; you can check out the full spec in the Kubernetes reference documentation. This spec defines the Deployment kind, or schema, which includes all of its fields, their descriptions, and their types. I like to think of this schema as a Go struct: a container that can hold structured data of a particular shape. In operator terms, this API schema is equivalent to a Custom Resource Definition, which defines the fields and metadata of your new Custom Resource.

Continuing the analogy, when you want to create a Deployment, you write a yaml (or json) file that uses the Deployment API schema. Here's a simple example:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:            # required: tells the Deployment which Pods it owns
    matchLabels:
      app: nginx
  template:            # required: the Pod template used to create replicas
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.23.3

In our analogy, this yaml file plays the role of a Custom Resource: a concrete instance of the schema, just as a Custom Resource is an instance of your Custom Resource Definition.

When you kubectl create -f the file, the kubectl client sends a POST request to the Kubernetes API server with these file contents in the request body. Once the request has been validated, the server adds the new Deployment object to the Kubernetes control plane's key-value database, etcd.

Elsewhere in the control plane, there is a Deployment Controller that has registered itself with the Kubernetes API server to listen for adds, modifications, and deletes of any Deployment in the cluster. Once it has been notified of a change, the controller runs a reconciliation loop to bring the current cluster state in line with the desired state described by the modified object. This is similar to how declarative tools like Terraform work.

Getting back to our example, once the API server has stored the new Deployment spec in etcd, the Deployment Controller will be notified about the newly created resource. This will trigger a reconciliation between the current state of the cluster (where the Deployment and its underlying ReplicaSet do not exist) and the desired state of Deployment, as defined by the spec stored in etcd.

How a Deployment Works

It's important to note that the Deployment Controller does not directly talk to etcd. All communication with the cluster data store happens through the API server. I just added the step (4) line for clarity in this initial example. All future diagrams will display the correct data flow.

Since the Deployment did not previously exist, the ReplicaSet that it is supposed to control also does not exist. So in its reconciliation loop, the Deployment Controller will create a new ReplicaSet based on the Pod template portion and the number of replicas in the Deployment spec. Once the ReplicaSet has been created, the controller's job now is to reconcile this ReplicaSet with the Deployment's spec whenever a change to the Deployment spec occurs.
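
For illustration, the generated ReplicaSet would look roughly like this; the pod-template-hash suffix is made up, and fields like the owner's uid are omitted for brevity:

---
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-7d9f8c6b5            # <deployment-name>-<pod-template-hash>
  labels:
    app: nginx
    pod-template-hash: 7d9f8c6b5
  ownerReferences:                 # ties the ReplicaSet back to its Deployment
  - apiVersion: apps/v1
    kind: Deployment
    name: nginx
    controller: true
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
      pod-template-hash: 7d9f8c6b5
  template:
    metadata:
      labels:
        app: nginx
        pod-template-hash: 7d9f8c6b5
    spec:
      containers:
      - name: nginx
        image: nginx:1.23.3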

For example, let's say that you want to increase the desired number of replicas to 2 by changing the yaml to:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2          # changed from 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.23.3

Once you kubectl apply this yaml, sending the updated manifest to the API server, the Deployment Controller will be notified about the change. It will perform another reconciliation of the underlying ReplicaSet, see that it has only 1 replica compared to the 2 desired by the Deployment spec, and modify the ReplicaSet to bring its replica count up to 2.

Composability

What makes this model so powerful is the inherent composability of Kubernetes primitives. In the above example, the Deployment Controller has one very specific job: to manage ReplicaSets. But as users, we don't typically directly interact with ReplicaSets. Instead, we're much more concerned with the state of Pods managed by a ReplicaSet, the runtimes where our applications live and breathe.

The work of actually managing Pods by scaling them up-and-down is the responsibility of the ReplicaSet controller. So the Deployment Controller delegates that work to the ReplicaSet controller, and trusts that it will do the right thing when a Deployment's desired number of replicas changes.

How a ReplicaSet Works

As operator writers, we can take advantage of this property: rely on existing Kubernetes primitives to handle single tasks, and combine various components to perform more complex automation that solves our own unique problems. Which specific components to use, and how to link them together, is up to you, the writer of the operator.

So What Is An Operator?

Now that we're familiar with how Deployments, and other Kubernetes resources, work under the hood, we can finally get to a clearer definition of a Kubernetes operator.

To work with any Kubernetes resource, we know that we need three components:

  1. An API Schema (like the Deployments API spec in the apps/v1 group)
  2. A concrete instance of that schema (a Deployment resource stored in etcd)
  3. A controller that performs reconciliation actions based on a resource's state

An operator is a combination of these three components, designed to orchestrate a custom schema that you create in a way that you write. So in operator terms, these components are:

  1. A Custom Resource Definition that defines your new resource's schema
  2. Custom Resources which are instances of your new resource type
  3. A Controller that performs automation actions based on the state of Custom Resources

Now you see how an operator is not just a single thing, but actually a combination of several pieces that work together. The idea of multiple smaller components working together to accomplish larger tasks is fundamental to not just how operators work, but also to how the entire Kubernetes ecosystem operates. By writing an operator of your own, you will not just gain additional automation benefits, but you will also grasp a deeper understanding of how Kubernetes works under the hood.

Stay Tuned!

I hope that this article has provided some clarity into what operators are and why they can be a powerful tool in your automation toolbox. The next article in this series will present a sample Custom Resource Definition and discuss how this new spec can drive your own custom automation.

If you like this content and want to follow along with the series, I'll be writing more articles under the Operators 101 tag on this blog.

K3s Traefik Ingress - configured for your homelab!

I recently purchased a used Lenovo ThinkCentre M900 (i7 with 32GB RAM) from eBay to expand my mini-homelab, which was just a single Synology DS218+ plugged into my ISP's router (yuck!). Since I've been spending a big chunk of time at work playing around with Kubernetes, I figured that I'd put my skills to the test and run a k3s node on the new server. While I was familiar with k3s before starting this project, I'd never actually run it before, opting for tools like kind (and minikube before that) to run small test clusters for my local development work.

K3s comes "batteries-included" and is packaged with all of the core components that are needed to run a cluster. For ingress, this component is Traefik, something that I'd also never had any hands-on experience with.

After a bit of trial-and-error, along with some help from the docs, I was able to figure out how to configure the Traefik ingress to expose a service on the new node to the rest of my LAN. This post will go into detail about how I did it, in case you've also installed k3s on a node in your own lab and want to know how to use the pre-bundled Traefik ingress.

Installing k3s

The k3s installation process is fairly straightforward. It just requires curling https://get.k3s.io and executing the script. The default settings all make sense to me, except for the need to chmod the bundled kubeconfig since its default filemode is inaccessible to non-root users. Luckily, k3s has a solution for that! All you need to do is export K3S_KUBECONFIG_MODE="644" before you run the installation, and the k3s install script does the rest.

Configuring Traefik

By default, host ports 80 and 443 are exposed by the bundled Traefik ingress controller. This will let you create HTTP and HTTPS ingresses on their standard ports.

But of course, I want to run a QuestDB instance on my node, which uses two additional TCP ports for Influx Line Protocol (ILP) and Pgwire communication with the database. So how can I expose these extra ports on my node and route traffic to the QuestDB container running inside of k3s?

K3s deploys Traefik via a Helm chart and lets you customize that chart's values.yaml by placing extra HelmChartConfig manifests in /var/lib/rancher/k3s/server/manifests. Here is my /var/lib/rancher/k3s/server/manifests/traefik-config.yaml:

---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      psql:
        port: 8812
        expose: true
        exposedPort: 8812
        protocol: TCP
      ilp:
        port: 9009
        expose: true
        exposedPort: 9009
        protocol: TCP

This manifest modifies the Traefik deployment to serve two extra TCP ports on the host: 8812 for Pgwire and 9009 for ILP. Along with these new ports, Traefik also ships with its default "web" and "websecure" ports that route HTTP and HTTPS traffic respectively. But this is already preconfigured for you in the Helm chart, so there's no extra work involved.

So now that we have our ports exposed, lets figure out how to route traffic from them to our QuestDB instance running inside the cluster.

IngressRoutes

While you can use traditional k8s Ingresses to configure external access to cluster resources, Traefik v2 also includes new, more flexible types of ingress that coordinate directly with the Traefik deployment. These can be configured by using Traefik-specific Custom Resources, which allow users to specify cluster ingress routes using Traefik's custom routing rules instead of the standard URI-based routing traditionally found in k8s. Using these rules, you can route requests not just based on hostnames and paths, but also by request headers, querystrings, and source IPs, with regex matching support for many of these options. This unlocks significantly more flexibility when routing traffic into your cluster as opposed to using a standard k8s ingress.
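
For comparison, a traditional Ingress for the same HTTP service would look roughly like this (assuming the bundled Traefik acts as the cluster's default ingress controller, as it does on a stock k3s install):

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: questdb
  namespace: questdb
spec:
  rules:
  - host: my-hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: questdb
            port:
              number: 9000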

Here's an example of an IngressRoute that I use to expose the QuestDB web console.

---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: questdb
  namespace: questdb
spec:
  entryPoints:
    - web
  routes:
    - kind: Rule
      match: Host(`my-hostname`)
      services:
        - kind: Service
          name: questdb
          port: 9000

This resource maps the QuestDB service's port 9000 to host port 80 through the web entrypoint which, as I mentioned above, comes pre-installed in the Traefik Helm chart. With this config, I can now access the console by navigating to http://my-hostname/ in my web browser, and Traefik will route the request to my QuestDB HTTP service.

IngressRouteTCP

It's important to note that IngressRoutes are used solely for HTTP ingress routing. Raw TCP routing is done using IngressRouteTCPs, and there are also IngressRouteUDPs available for you to use as well.

To support the TCP-based ILP and Pgwire protocols, I created 2 IngressRouteTCP resources to handle traffic on host ports 9009 and 8812.

---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRouteTCP
metadata:
  name: questdb-ilp
  namespace: questdb
spec:
  entryPoints:
    - ilp
  routes:
    - match: HostSNI(`*`)
      services:
        - name: questdb
          port: 9009
          terminationDelay: -1
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRouteTCP
metadata:
  name: questdb-psql
  namespace: questdb
spec:
  entryPoints:
    - psql
  routes:
    - match: HostSNI(`*`)
      services:
        - name: questdb
          port: 8812

Here, we use the TCP-specific HostSNI matcher to route all node traffic on ports 9009 (ilp) and 8812 (psql) to the questdb service. The ilp and psql entrypoints correspond to the ilp and psql ports that we exposed in the traefik-config.yaml.

Now we're able to access QuestDB on all 3 supported ports on my k3s node from the rest of my home network.

Conclusion

I hope this example gives you a bit more confidence when configuring Traefik on a single-node k3s "cluster". It may be a bit confusing at first, but once you have the basics down, you'll be exposing all of your hosted services in no time!

Since we're only working with one node, there's a limited amount of routing flexibility that I could include in the Traefik CRD configurations. In a more complex environment, you can really go wild with all of the possibilities that are provided by Traefik routing rules! Still, the benefit of this simple example is that you can learn the basics in a small and controlled environment, and then add complexity once it's needed.

Remember, if you're using k3s clustering mode and running multiple nodes, you'll need to route traffic to them using an external load balancer like HAProxy. Maybe I'll play around with this if I ever pick up a second M900 or other mini-desktop. But until then, I'll stick with this ingress configuration.