Concepts

Summary of Talos Linux.

When people come across Talos, they frequently want a nice, bite-sized summary of it. This is surprisingly difficult when Talos represents such a fundamentally-rethought operating system.

Not based on X distro

A useful way to summarize an operating system is to say that it is based on X, but focused on Y. For instance, Mint was originally based on Ubuntu, but focused on Gnome 2 (instead of, at the time, Unity). Or maybe something like Raspbian is based on Debian, but it is focused on the Raspberry Pi. CentOS is RHEL, but made license-free.

Talos Linux isn’t based on any other distribution. We often think of ourselves as being the second-generation of container-optimised operating systems, where things like CoreOS, Flatcar, and Rancher represent the first generation, but that implies heredity where there is none.

Talos Linux is actually a ground-up rewrite of the userspace, from PID 1. We run the Linux kernel, but everything downstream of that is our own custom code, written in Go, rigorously-tested, and published as an immutable, integrated, cohesive image. The Linux kernel launches what we call machined, for instance, not systemd. There is no systemd on our system. There are no GNU utilities, no shell, no SSH, no packages, nothing you could associate with any other distribution. We don’t even have a build toolchain in the normal sense of the word.

Not for individual use

Technically, Talos Linux installs to a computer much as other operating systems. Unlike other operating systems, Talos is not meant to run alone, on a single machine. Talos Linux comes with tooling from the very foundation to form clusters, even before Kubernetes comes into play. A design goal of Talos Linux is eliminating the management of individual nodes as much as possible. In order to do that, Talos Linux operates as a cluster of machines, with lots of checking and coordination between them, at all levels.

Break from your mind the idea of running an application on a computer. There are no individual computers. There is only a cluster. Talos is meant to do one thing: maintain a Kubernetes cluster, and it does this very, very well.

The entirety of the configuration of any machine is specified by a single, simple configuration file, which can often be the same configuration file used across many machines. Much like a biological system, if some component misbehaves, just cut it out and let a replacement grow. Rebuilds of Talos are remarkably fast, whether they be new machines, upgrades, or reinstalls. Never get hung up on an individual machine.

Control Planes are not linear replicas

People familiar with traditional relational database replication often overlook a critical design concept of the Kubernetes (and Talos) database: etcd. Unlike linear replicas, which have dedicated masters and slaves/replicas, etcd is highly dynamic. The master in an etcd cluster is entirely temporal. This means fail-overs are handled easily, and usually without any notice of operators. This also means that the operational architecture is fundamentally different.

Properly managed (which Talos Linux does), etcd should never have split brain and should never encounter noticeable down time. In order to do this, though, etcd maintains the concept of “membership” and of “quorum”. In order to perform any operation, read or write, the database requires quorum to be sustained. That is, a strict majority must agree on the current leader, and absenteeism counts as a negative. In other words, if there are three registered members (voters), at least two out of the three must be actively asserting that the current master is the master. If any two disagree or even fail to answer, the etcd database will lock itself until quorum is again achieved in order to protect itself and the integrity of the data. This is fantastically important for handling distributed systems and the various types of contention which may arise.

This design means, however, that having an incorrect number of members can be devastating. Having only two controlplane nodes, for instance, is mostly worse than having only one, because if either goes down, your entire database will lock. You would be better off just making periodic snapshots of the data and restoring it when necessary.

Another common situation occurs when replacing controlplane nodes. If you have three controlplane nodes and replace one, you will not have three members, you will have four, and one of those will never be available again. Thus, if any of your three remaining nodes goes down, your database will lock, because only two out of the four members will be available: four nodes is worse than three nodes! So it is critical that controlplane members which are replaced be removed. Luckily, the Talos API makes this easy.

Bootstrap once

In the old days, Talos Linux had the idea of an init node. The init node was a “special” controlplane node which was designated as the founder of the cluster. It was the first, was guaranteed to be the elector, and was authorised to create a cluster… even if one already existed. This made the formation of a cluster cluster really easy, but it had a lot of down sides. Mostly, these related to rebuilding or replacing that init node: you could easily end up with a split-brain scenario in which you had two different clusters: a single node one and a two-node one. Needless to say, this was an unhappy arrangement.

Fortunately, init nodes are gone, but that means that the critical operation of forming a cluster is a manual process. It’s an easy process, consisting of a single API call, but it can be a confusing one, until you understand what it does.

Every new cluster must be bootstrapped exactly and only once. This means you do NOT bootstrap each node in a cluster, not even each controlplane node. You bootstrap only a single controlplane node, because you are bootstrapping the cluster, not the node.

It doesn’t matter which controlplane node is told to bootstrap, but it must be a controlplane node, and it must be only one.

Bootstrapping is fast and sure. Even if your Kubernetes cluster fails to form for other reasons (say, a bad configuration option or unavailable container repository), if the bootstrap API call returns successfully, you do NOT need to bootstrap again: just fix the config or let Kubernetes retry.

Bootstrapping itself does not do anything with Kubernetes. Bootstrapping only tells etcd to form a cluster, so don’t judge the success of a bootstrap by the failure of Kubernetes to start. Kubernetes relies on etcd, so bootstrapping is required, but it is not sufficient for Kubernetes to start.

Last modified May 11, 2022: docs: fix typos, edited for clarity (9885bbe17)