1 - Advanced Networking

How to configure advanced networking options on Talos Linux.

Static Addressing

Static addressing is configured by specifying addresses, routes (remember to add your default gateway), and the interface. Most likely you’ll also want to define the nameservers so you have properly functioning DNS.

machine:
  network:
    hostname: talos
    nameservers:
      - 10.0.0.1
    interfaces:
      - interface: eth0
        addresses:
          - 10.0.0.201/8
        mtu: 8765
        routes:
          - network: 0.0.0.0/0
            gateway: 10.0.0.1
      - interface: eth1
        ignore: true
  time:
    servers:
      - time.cloudflare.com

Additional Addresses for an Interface

In some environments you may need to set additional addresses on an interface. In the following example, we set two additional addresses on the loopback interface.

machine:
  network:
    interfaces:
      - interface: lo
        addresses:
          - 192.168.0.21/24
          - 10.2.2.2/24

Bonding

The following example shows how to create a bonded interface.

machine:
  network:
    interfaces:
      - interface: bond0
        dhcp: true
        bond:
          mode: 802.3ad
          lacpRate: fast
          xmitHashPolicy: layer3+4
          miimon: 100
          updelay: 200
          downdelay: 200
          interfaces:
            - eth0
            - eth1

Setting Up a Bridge

The following example shows how to set up a bridge between two interfaces with an assigned static address.

machine:
  network:
    interfaces:
      - interface: br0
        addresses:
          - 192.168.0.42/24
        bridge:
          stp:
            enabled: true
          interfaces:
              - eth0
              - eth1

VLANs

To set up VLANs on a specific device, use an array of VLANs to add. The master device may be configured without addressing by setting dhcp to false.

machine:
  network:
    interfaces:
      - interface: eth0
        dhcp: false
        vlans:
          - vlanId: 100
            addresses:
              - "192.168.2.10/28"
            routes:
              - network: 0.0.0.0/0
                gateway: 192.168.2.1

2 - Air-gapped Environments

Setting up Talos Linux to work in environments with no internet access.

In this guide we will create a Talos cluster running in an air-gapped environment with all the required images being pulled from an internal registry. We will use the QEMU provisioner available in talosctl to create a local cluster, but the same approach could be used to deploy Talos in bigger air-gapped networks.

Requirements

The following are requirements for this guide:

  • Docker 18.03 or greater
  • Requirements for the Talos QEMU cluster

Identifying Images

In air-gapped environments, access to the public Internet is restricted, so Talos can’t pull images from public Docker registries (docker.io, ghcr.io, etc.). We need to identify the images required to install and run Talos. The same strategy can be used for images required by custom workloads running on the cluster.

The talosctl images command provides a list of default images used by the Talos cluster (with default configuration settings). To print the list of images, run:

talosctl images

This list contains images required by a default deployment of Talos. There might be additional images required for the workloads running on this cluster, and those should be added to this list.
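
For instance, a minimal sketch of assembling a combined image list, where the nginx image stands in for your own workload images:

talosctl images > images.txt
echo "docker.io/library/nginx:1.23" >> images.txt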

Preparing the Internal Registry

As access to the public registries is restricted, we have to run an internal Docker registry. In this guide, we will launch the registry on the same machine using Docker:

$ docker run -d -p 6000:5000 --restart always --name registry-airgapped registry:2
1bf09802bee1476bc463d972c686f90a64640d87dacce1ac8485585de69c91a5

This registry will be accepting connections on port 6000 on the host IPs. The registry is empty by default, so we have to fill it with the images required by Talos.

First, we pull all the images to our local Docker daemon:

$ for image in `talosctl images`; do docker pull $image; done
v0.15.1: Pulling from coreos/flannel
Digest: sha256:9a296fbb67790659adc3701e287adde3c59803b7fcefe354f1fc482840cdb3d9
...

All images are now stored in the Docker daemon store:

$ docker images
REPOSITORY                               TAG                                        IMAGE ID       CREATED         SIZE
gcr.io/etcd-development/etcd             v3.5.3                                     604d4f022632   6 days ago      181MB
ghcr.io/siderolabs/install-cni           v1.0.0-2-gc5d3ab0                          4729e54f794d   6 days ago      76MB
...

Now we need to re-tag them so that we can push them to our local registry. We are going to replace the first component of the image name (before the first slash) with our registry endpoint 127.0.0.1:6000:

$ for image in `talosctl images`; do \
    docker tag $image `echo $image | sed -E 's#^[^/]+/#127.0.0.1:6000/#'`; \
  done

As the next step, we push images to the internal registry:

$ for image in `talosctl images`; do \
    docker push `echo $image | sed -E 's#^[^/]+/#127.0.0.1:6000/#'`; \
  done

We can now verify that the images are pushed to the registry:

$ curl http://127.0.0.1:6000/v2/_catalog
{"repositories":["coredns/coredns","coreos/flannel","etcd-development/etcd","kube-apiserver","kube-controller-manager","kube-proxy","kube-scheduler","pause","siderolabs/install-cni","siderolabs/installer","siderolabs/kubelet"]}

Note: images in the registry don’t have the registry endpoint prefix anymore.
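
The standard registry API can also be used to list the tags pushed to an individual repository, for example:

curl http://127.0.0.1:6000/v2/siderolabs/installer/tags/list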

Launching Talos in an Air-gapped Environment

For Talos to use the internal registry, we use the registry mirror feature to redirect all image pull requests to the internal registry. This means that the registry endpoint (as the first component of the image reference) gets ignored, and all pull requests are sent directly to the specified endpoint.

We are going to use a QEMU-based Talos cluster for this guide, but the same approach works with Docker-based clusters as well. Since QEMU-based clusters go through the full Talos install process, they model a real air-gapped environment more closely.

Identify all registry prefixes from talosctl images, for example:

  • docker.io
  • gcr.io
  • ghcr.io
  • k8s.gcr.io
  • quay.io

The talosctl cluster create command provides conveniences for common configuration options. The only required flag for this guide is --registry-mirror <endpoint>=http://10.5.0.1:6000, which redirects every pull request to the internal registry; this flag needs to be repeated for each of the registry prefixes identified above. The endpoint used is 10.5.0.1, as this is the default bridge interface address which is routable from the QEMU VMs (the 127.0.0.1 IP would point to the VM itself).

$ sudo --preserve-env=HOME talosctl cluster create --provisioner=qemu --install-image=ghcr.io/siderolabs/installer:v1.2.0-alpha.0 \
  --registry-mirror docker.io=http://10.5.0.1:6000 \
  --registry-mirror gcr.io=http://10.5.0.1:6000 \
  --registry-mirror ghcr.io=http://10.5.0.1:6000 \
  --registry-mirror k8s.gcr.io=http://10.5.0.1:6000 \
  --registry-mirror quay.io=http://10.5.0.1:6000
validating CIDR and reserving IPs
generating PKI and tokens
creating state directory in "/home/user/.talos/clusters/talos-default"
creating network talos-default
creating load balancer
creating dhcpd
creating master nodes
creating worker nodes
waiting for API
...

Note: --install-image should match the image which was copied into the internal registry in the previous step.

You can verify that the cluster is air-gapped by inspecting the registry logs: docker logs -f registry-airgapped.

Closing Notes

Running in an air-gapped environment might require additional configuration changes, for example using custom settings for DNS and NTP servers.
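
For example, assuming internal DNS and NTP servers at the hypothetical address 10.5.0.10, the machine configuration could include:

machine:
  network:
    nameservers:
      - 10.5.0.10
  time:
    servers:
      - 10.5.0.10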

When scaling this guide to a bare-metal environment, the following Talos config snippet can be used as an equivalent of the --registry-mirror flag above:

machine:
  ...
  registries:
    mirrors:
      docker.io:
        endpoints:
          - http://10.5.0.1:6000/
      gcr.io:
        endpoints:
          - http://10.5.0.1:6000/
      ghcr.io:
        endpoints:
          - http://10.5.0.1:6000/
      k8s.gcr.io:
        endpoints:
          - http://10.5.0.1:6000/
      quay.io:
        endpoints:
          - http://10.5.0.1:6000/
...

Other implementations of the Docker registry can be used in place of the registry image used above. If required, auth can be configured for the internal registry (and custom TLS certificates if needed).
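
As a sketch, assuming basic authentication is enabled on the internal registry, credentials can be supplied via the machine.registries.config section (the username and password are placeholders):

machine:
  registries:
    config:
      "10.5.0.1:6000":
        auth:
          username: registry-user
          password: registry-password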

3 - Customizing the Kernel

Guide on how to customize the kernel used by Talos Linux.

The installer image contains ONBUILD instructions that handle the following:

  • the decompression and unpacking of the initramfs.xz
  • the unsquashing of the rootfs
  • the copying of new rootfs files
  • the squashing of the new rootfs
  • the packing and compression of the new initramfs.xz

When used as a base image, the installer will perform the above steps automatically with the requirement that a customization stage be defined in the Dockerfile.

Build and push your own kernel:

git clone https://github.com/talos-systems/pkgs.git
cd pkgs
make kernel-menuconfig USERNAME=_your_github_user_name_

docker login ghcr.io --username _your_github_user_name_
make kernel USERNAME=_your_github_user_name_ PUSH=true

Using a multi-stage Dockerfile we can define the customization stage and build FROM the installer image:

FROM scratch AS customization
COPY --from=<custom kernel image> /lib/modules /lib/modules

FROM ghcr.io/siderolabs/installer:latest
COPY --from=<custom kernel image> /boot/vmlinuz /usr/install/${TARGETARCH}/vmlinuz

When building the image, the customization stage will automatically be copied into the rootfs. The customization stage is not limited to a single COPY instruction. In fact, you can do whatever you would like in this stage, but keep in mind that everything in / will be copied into the rootfs.

To build the image, run:

DOCKER_BUILDKIT=0 docker build --build-arg RM="/lib/modules" -t installer:kernel .

Note: buildkit has a bug (#816); to disable it, use DOCKER_BUILDKIT=0

Now that we have a custom installer we can build Talos for the specific platform we wish to deploy to.
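
For example, once the custom installer has been pushed to a registry reachable from a node, an existing node could be upgraded to it (the image reference here is a placeholder; see also the Proprietary Kernel Modules guide):

talosctl upgrade --image ghcr.io/your-username/installer:kernel --preserve=true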

4 - Customizing the Root Filesystem

How to add your own content to the immutable root file system of Talos Linux.

The installer image contains ONBUILD instructions that handle the following:

  • the decompression and unpacking of the initramfs.xz
  • the unsquashing of the rootfs
  • the copying of new rootfs files
  • the squashing of the new rootfs
  • the packing and compression of the new initramfs.xz

When used as a base image, the installer will perform the above steps automatically with the requirement that a customization stage be defined in the Dockerfile.

For example, say we have an image that contains the contents of a library we wish to add to the Talos rootfs. We need to define a stage with the name customization:

FROM scratch AS customization
COPY --from=<name|index> <src> <dest>

Using a multi-stage Dockerfile we can define the customization stage and build FROM the installer image:

FROM scratch AS customization
COPY --from=<name|index> <src> <dest>

FROM ghcr.io/siderolabs/installer:latest

When building the image, the customization stage will automatically be copied into the rootfs. The customization stage is not limited to a single COPY instruction. In fact, you can do whatever you would like in this stage, but keep in mind that everything in / will be copied into the rootfs.

Note: <dest> is the path relative to the rootfs that you wish to place the contents of <src>.
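
As a concrete (hypothetical) example, copying a shared library out of an image named myorg/mylib into /usr/local/lib of the rootfs could look like this:

FROM scratch AS customization
COPY --from=myorg/mylib:latest /usr/lib/libexample.so /usr/local/lib/libexample.so

FROM ghcr.io/siderolabs/installer:latest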

To build the image, run:

docker build --squash -t <organization>/installer:latest .

In the case that you need to perform some cleanup before adding additional files to the rootfs, you can specify the RM build-time variable:

docker build --squash --build-arg RM="[<path> ...]" -t <organization>/installer:latest .

This will perform a rm -rf on the specified paths relative to the rootfs.

Note: RM must be a whitespace delimited list.

The resulting image can be used to:

  • generate an image for any of the supported providers
  • perform bare-metal installs
  • perform upgrades

We will step through common customizations in the remainder of this section.

5 - Developing Talos

Learn how to set up a development environment for local testing and hacking on Talos itself!

This guide outlines steps and tricks for developing the Talos operating system and related components. The guide assumes a Linux operating system on the development host. Some steps might work under Mac OS X, but using Linux is highly advised.

Prepare

Check out the Talos repository.

Try running make help to see the available make commands. You will need Docker and buildx installed on the host.

Note: Usually it is better to install up-to-date Docker from the Docker apt repositories, e.g. following the Ubuntu instructions.

If the buildx plugin is not available with the OS docker packages, it can be installed as a plugin from its GitHub releases.

Set up a builder with access to the host network:

docker buildx create --driver docker-container --driver-opt network=host --name local1 --buildkitd-flags '--allow-insecure-entitlement security.insecure' --use

Note: network=host allows the buildx builder to access the host network, so that it can push to a local container registry (see below).

Make sure the following steps work:

  • make talosctl
  • make initramfs kernel

Set up a local docker registry:

docker run -d -p 5005:5000 \
    --restart always \
    --name local registry:2

Try to build and push an installer image to the local registry:

make installer IMAGE_REGISTRY=127.0.0.1:5005 PUSH=true

Record the image name output in the step above.

Note: it is also possible to force a stable image tag by using the TAG variable: make installer IMAGE_REGISTRY=127.0.0.1:5005 TAG=v1.0.0-alpha.1 PUSH=true.

Running a Talos Cluster

Set up local caching docker registries (this speeds up Talos cluster boot a lot); the script is in the Talos repo:

bash hack/start-registry-proxies.sh

Start your local cluster with:

sudo --preserve-env=HOME _out/talosctl-linux-amd64 cluster create \
    --provisioner=qemu \
    --cidr=172.20.0.0/24 \
    --registry-mirror docker.io=http://172.20.0.1:5000 \
    --registry-mirror k8s.gcr.io=http://172.20.0.1:5001  \
    --registry-mirror quay.io=http://172.20.0.1:5002 \
    --registry-mirror gcr.io=http://172.20.0.1:5003 \
    --registry-mirror ghcr.io=http://172.20.0.1:5004 \
    --registry-mirror 127.0.0.1:5005=http://172.20.0.1:5005 \
    --install-image=127.0.0.1:5005/siderolabs/installer:<RECORDED HASH from the build step> \
    --masters 3 \
    --workers 2 \
    --with-bootloader=false

  • --provisioner selects QEMU vs. the default Docker
  • a custom --cidr makes the QEMU cluster use a different network than the default Docker setup (optional)
  • --registry-mirror uses the caching proxies set up above to speed up boot time a lot; the last one adds your local registry (the installer image was pushed to it)
  • --install-image is the image you built with make installer above
  • --masters & --workers configure the cluster size; choose to match your resources; 3 masters give you an HA control plane; 1 master is enough, but never use 2 masters (two etcd members cannot tolerate the loss of either one)
  • --with-bootloader=false disables boot from disk (Talos will always boot from _out/vmlinuz-amd64 and _out/initramfs-amd64.xz). This speeds up the development cycle a lot: there is no need to rebuild the installer and perform an install; rebooting is enough to pick up new code.

Note: as the boot loader is not used, it’s not necessary to rebuild the installer each time (an old image is fine), but sometimes it is needed (when configuration changes are made and the old installer doesn’t validate the config).

talosctl cluster create derives the Talos machine configuration version from the install image tag, so sometimes early in the development cycle (when a new minor tag is not released yet), the machine config version can be overridden with --talos-version=v1.2.

If the --with-bootloader=false flag is not used, for a Talos cluster to pick up new changes to the code (in the initramfs), a Talos upgrade is required (so a new installer should be built). With the --with-bootloader=false flag, Talos always boots from the initramfs in the _out/ directory, so a simple reboot is enough to pick up new code changes.

If the installation flow needs to be tested, --with-bootloader=false shouldn’t be used.

Console Logs

Watching console logs is easy with tail:

tail -F ~/.talos/clusters/talos-default/talos-default-*.log

Interacting with Talos

Once talosctl cluster create finishes successfully, talosconfig and kubeconfig will be set up automatically to point to your cluster.

Start playing with talosctl:

talosctl -n 172.20.0.2 version
talosctl -n 172.20.0.3,172.20.0.4 dashboard
talosctl -n 172.20.0.4 get members

Same with kubectl:

kubectl get nodes -o wide

You can deploy some Kubernetes workloads to the cluster.

You can edit the machine config on the fly with talosctl edit mc --immediate; config patches can be applied via --config-patch flags, and many features have specific flags in talosctl cluster create.
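
For example, a patch file prepared in advance can be applied to every node at cluster creation time (my-patch.yaml is an arbitrary placeholder name):

_out/talosctl-linux-amd64 cluster create ... --config-patch @my-patch.yaml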

Quick Reboot

To reboot whole cluster quickly (e.g. to pick up a change made in the code):

for socket in ~/.talos/clusters/talos-default/talos-default-*.monitor; do echo "q" | sudo socat - unix-connect:$socket; done

Sending q to a single socket reboots a single node.

Note: this command performs an immediate reboot (as if the machine was powered down and immediately powered back up); for a normal Talos reboot use talosctl reboot.

Development Cycle

Fast development cycle:

  • bring up a cluster
  • make code changes
  • rebuild initramfs with make initramfs
  • reboot a node to pick new initramfs
  • verify code changes
  • more code changes…

Some aspects of Talos development require the bootloader to be enabled (when working on the installer itself); in that case the quick development cycle is no longer possible, and the cluster should be destroyed and recreated each time.

Running Integration Tests

If integration tests were changed (or when running them for the first time), first rebuild the integration test binary:

rm -f _out/integration-test-linux-amd64; make _out/integration-test-linux-amd64

Running short tests against a QEMU-provisioned cluster:

_out/integration-test-linux-amd64 \
    -talos.provisioner=qemu \
    -test.v \
    -talos.crashdump=false \
    -test.short \
    -talos.talosctlpath=$PWD/_out/talosctl-linux-amd64

The whole test suite can be run by removing the -test.short flag.

Specific tests can be run with -test.run=TestIntegration/api.ResetSuite.

Build Flavors

make <something> WITH_RACE=1 enables the Go race detector; Talos runs slower and uses more memory, but data races are detected.

make <something> WITH_DEBUG=1 enables Go profiling and other debug features, useful for local development.

Destroying Cluster

sudo --preserve-env=HOME ../talos/_out/talosctl-linux-amd64 cluster destroy --provisioner=qemu

This command stops QEMU and helper processes, tears down bridged network on the host, and cleans up cluster state in ~/.talos/clusters.

Note: if the host machine is rebooted, QEMU instances and helper processes won’t be started back up. In that case it’s required to clean up the files in the ~/.talos/clusters/<cluster-name> directory manually.

Optional

Set up a cross-build environment with:

docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

Note: the static qemu binaries which come with Ubuntu 21.10 seem to be broken.

Unit Tests

Unit tests can be run in buildx with make unit-tests. On Ubuntu systems some tests using loop devices will fail because Ubuntu uses low-index loop devices for snaps.

Most of the unit tests can be run standalone as well, with regular go test, or using IDE integration:

go test -v ./internal/pkg/circular/

This provides a much faster feedback loop, but some tests require either elevated privileges (running as root) or additional binaries available only in the Talos rootfs (containerd tests).

Running tests as root can be done with the -exec flag to go test, but this is risky, as the test code has root access and can potentially make undesired changes:

go test -exec sudo -v ./internal/app/machined/pkg/controllers/network/...

Go Profiling

Build initramfs with debug enabled: make initramfs WITH_DEBUG=1.

Launch a Talos cluster with the bootloader disabled, and use go tool pprof to capture the profile and show the output in your browser:

go tool pprof http://172.20.0.2:9982/debug/pprof/heap

The IP address 172.20.0.2 is the address of the Talos node, and port :9982 depends on the Go application to profile:

  • 9981: apid
  • 9982: machined
  • 9983: trustd
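
Other standard Go pprof endpoints are exposed on the same ports; for example, to capture a 30-second CPU profile of machined:

go tool pprof http://172.20.0.2:9982/debug/pprof/profile?seconds=30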

Testing Air-gapped Environments

There is a hidden talosctl debug air-gapped command which launches two components:

  • HTTP proxy capable of proxying HTTP and HTTPS requests
  • HTTPS server with a self-signed certificate

The command also writes a Talos machine configuration patch which enables the HTTP proxy and adds the self-signed certificate to the list of trusted certificates:

$ talosctl debug air-gapped --advertised-address 172.20.0.1
2022/08/04 16:43:14 writing config patch to air-gapped-patch.yaml
2022/08/04 16:43:14 starting HTTP proxy on :8002
2022/08/04 16:43:14 starting HTTPS server with self-signed cert on :8001

The --advertised-address should match the bridge IP of the Talos node.

The generated machine configuration patch looks like this:

machine:
    files:
        - content: |
            -----BEGIN CERTIFICATE-----
            MIIBijCCAS+gAwIBAgIBATAKBggqhkjOPQQDAjAUMRIwEAYDVQQKEwlUZXN0IE9u
            bHkwHhcNMjIwODA0MTI0MzE0WhcNMjIwODA1MTI0MzE0WjAUMRIwEAYDVQQKEwlU
            ZXN0IE9ubHkwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAAQfOJdaOFSOI1I+EeP1
            RlMpsDZJaXjFdoo5zYM5VYs3UkLyTAXAmdTi7JodydgLhty0pwLEWG4NUQAEvip6
            EmzTo3IwcDAOBgNVHQ8BAf8EBAMCBaAwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsG
            AQUFBwMCMA8GA1UdEwEB/wQFMAMBAf8wHQYDVR0OBBYEFCwxL+BjG0pDwaH8QgKW
            Ex0J2mVXMA8GA1UdEQQIMAaHBKwUAAEwCgYIKoZIzj0EAwIDSQAwRgIhAJoW0z0D
            JwpjFcgCmj4zT1SbBFhRBUX64PHJpAE8J+LgAiEAvfozZG8Or6hL21+Xuf1x9oh4
            /4Hx3jozbSjgDyHOLk4=
            -----END CERTIFICATE-----            
          permissions: 0o644
          path: /etc/ssl/certs/ca-certificates
          op: append
    env:
        http_proxy: http://172.20.0.1:8002
        https_proxy: http://172.20.0.1:8002
        no_proxy: 172.20.0.1/24
cluster:
    extraManifests:
        - https://172.20.0.1:8001/debug.yaml

The first section appends a self-signed certificate of the HTTPS server to the list of trusted certificates, followed by the HTTP proxy setup (in-cluster traffic is excluded from the proxy). The last section adds an extra Kubernetes manifest hosted on the HTTPS server.

The machine configuration patch can now be used to launch a test Talos cluster:

talosctl cluster create ... --config-patch @air-gapped-patch.yaml

The following lines should appear in the output of the talosctl debug air-gapped command:

  • CONNECT discovery.talos.dev:443: the HTTP proxy is used to talk to the discovery service
  • http: TLS handshake error from 172.20.0.2:53512: remote error: tls: bad certificate: an expected error on the Talos side, as the self-signed cert has not yet been written to the file
  • GET /debug.yaml: Talos fetches the extra manifest successfully

There might be more output depending on whether registry caches are being used.

6 - Disaster Recovery

Procedure for snapshotting etcd database and recovering from catastrophic control plane failure.

The etcd database backs the Kubernetes control plane state, so if the etcd service is unavailable, the Kubernetes control plane goes down, and the cluster is not recoverable until etcd is recovered with its contents. The etcd consistency model is built around the consensus protocol Raft, so for highly-available control plane clusters, loss of one control plane node doesn’t impact cluster health. In general, etcd stays up as long as a sufficient number of nodes to maintain quorum are up. For a three control plane node Talos cluster, this means that the cluster tolerates a failure of any single node, but losing more than one node at the same time leads to complete loss of service. Because of that, it is important to take routine backups of etcd state to have a snapshot to recover the cluster from in case of catastrophic failure.

Backup

Snapshotting etcd Database

Create a consistent snapshot of etcd database with talosctl etcd snapshot command:

$ talosctl -n <IP> etcd snapshot db.snapshot
etcd snapshot saved to "db.snapshot" (2015264 bytes)
snapshot info: hash c25fd181, revision 4193, total keys 1287, total size 3035136

Note: filename db.snapshot is arbitrary.

This database snapshot can be taken on any healthy control plane node (with IP address <IP> in the example above), as all etcd instances contain exactly the same data. It is recommended to configure etcd snapshots to be created on a schedule to allow point-in-time recovery using the latest snapshot.
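
A minimal sketch of such a schedule, assuming an administrative host with cron and talosctl configured (the node IP and backup path are placeholders):

0 * * * * talosctl -n 172.20.0.2 etcd snapshot /var/backups/etcd-$(date +\%Y\%m\%d\%H\%M).snapshot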

Disaster Database Snapshot

If the etcd cluster is not healthy, the talosctl etcd snapshot command might fail. In that case, copy the database snapshot directly from the control plane node:

talosctl -n <IP> cp /var/lib/etcd/member/snap/db .

This snapshot might not be fully consistent (if the etcd process is running), but it allows for disaster recovery when the latest regular snapshot is not available.

Machine Configuration

Machine configuration might be required to recover the node after a hardware failure. Back up the Talos node machine configuration with the command:

talosctl -n IP get mc v1alpha1 -o yaml | yq eval '.spec' -

Recovery

Before starting a disaster recovery procedure, make sure that the etcd cluster can’t be recovered:

  • get the etcd cluster member list on all healthy control plane nodes with the talosctl -n IP etcd members command and compare it across all members.
  • query etcd health across the control plane nodes with talosctl -n IP service etcd.

If the quorum can be restored, restoring quorum might be a better strategy than performing the full disaster recovery procedure.

Latest Etcd Snapshot

Get hold of the latest etcd database snapshot. If a snapshot is not fresh enough, create a database snapshot (see above), even if the etcd cluster is unhealthy.

Init Node

Make sure that there are no control plane nodes with machine type init:

$ talosctl -n <IP1>,<IP2>,... get machinetype
NODE         NAMESPACE   TYPE          ID             VERSION   TYPE
172.20.0.2   config      MachineType   machine-type   2         controlplane
172.20.0.4   config      MachineType   machine-type   2         controlplane
172.20.0.3   config      MachineType   machine-type   2         controlplane

Nodes with the init type are incompatible with the etcd recovery procedure. An init node can be converted to the controlplane type with the talosctl edit mc --mode=staged command, followed by a node reboot with the talosctl reboot command.

Preparing Control Plane Nodes

If some control plane nodes experienced hardware failure, replace them with new nodes. Use machine configuration backup to re-create the nodes with the same secret material and control plane settings to allow workers to join the recovered control plane.

If a control plane node is healthy but etcd isn’t, wipe the node’s EPHEMERAL partition to remove the etcd data directory (make sure a database snapshot is taken before doing this):

talosctl -n <IP> reset --graceful=false --reboot --system-labels-to-wipe=EPHEMERAL

At this point, all control plane nodes should boot up, and the etcd service should be in the Preparing state.

The Kubernetes control plane endpoint should be pointed at the new control plane nodes if there were any changes to the node addresses.

Recovering from the Backup

Make sure all etcd service instances are in Preparing state:

$ talosctl -n <IP> service etcd
NODE     172.20.0.2
ID       etcd
STATE    Preparing
HEALTH   ?
EVENTS   [Preparing]: Running pre state (17s ago)
         [Waiting]: Waiting for service "cri" to be "up", time sync (18s ago)
         [Waiting]: Waiting for service "cri" to be "up", service "networkd" to be "up", time sync (20s ago)

Execute the bootstrap command against any control plane node passing the path to the etcd database snapshot:

$ talosctl -n <IP> bootstrap --recover-from=./db.snapshot
recovering from snapshot "./db.snapshot": hash c25fd181, revision 4193, total keys 1287, total size 3035136

Note: if the database snapshot was copied out directly from the etcd data directory using talosctl cp, add the flag --recover-skip-hash-check to skip the integrity check on restore.

The Talos node should print matching information in the kernel log:

recovering etcd from snapshot: hash c25fd181, revision 4193, total keys 1287, total size 3035136
{"level":"info","msg":"restoring snapshot","path":"/var/lib/etcd.snapshot","wal-dir":"/var/lib/etcd/member/wal","data-dir":"/var/lib/etcd","snap-dir":"/var/li}
{"level":"info","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":3360}
{"level":"info","msg":"added member","cluster-id":"a3390e43eb5274e2","local-member-id":"0","added-peer-id":"eb4f6f534361855e","added-peer-peer-urls":["https:/}
{"level":"info","msg":"restored snapshot","path":"/var/lib/etcd.snapshot","wal-dir":"/var/lib/etcd/member/wal","data-dir":"/var/lib/etcd","snap-dir":"/var/lib/etcd/member/snap"}

Now the etcd service should become healthy on the bootstrap node, the Kubernetes control plane components should start, and the control plane endpoint should become available. The remaining control plane nodes join the etcd cluster once the control plane endpoint is up.

Single Control Plane Node Cluster

This guide applies to single control plane clusters as well. In fact, it is much more important to take regular snapshots of the etcd database in the single control plane node case, as loss of the control plane node might render the whole cluster irrecoverable without a backup.

7 - Extension Services

Use extension services in Talos Linux.

Talos provides a way to run additional system services early in the Talos boot process. Extension services should be included in the Talos root filesystem (e.g. using system extensions). Extension services run as privileged containers with an ephemeral root filesystem located in the Talos root filesystem.

Extension services can be used to extend the core features of Talos in a way that is not possible via static pods or Kubernetes DaemonSets.

Potential extension services use-cases:

  • storage: Open iSCSI, software RAID, etc.
  • networking: BGP FRR, etc.
  • platform integration: VMware open-vm-tools, etc.

Configuration

On boot, Talos scans the directory /usr/local/etc/containers for *.yaml files describing the extension services to run. The format of the extension service config is:

name: hello-world
container:
  entrypoint: ./hello-world
  args:
    - -f
  mounts:
    - # OCI Mount Spec
depends:
  - service: cri
  - path: /run/machined/machined.sock
  - network:
      - addresses
      - connectivity
      - hostname
      - etcfiles
  - time: true
restart: never|always|untilSuccess

name

Field name sets the service name; valid names match [a-z0-9-_]+. The service container root filesystem path is derived from the name: /usr/local/lib/containers/<name>. The extension service will be registered as a Talos service under the ext-<name> identifier.

container

  • entrypoint defines the container entrypoint relative to the container root filesystem (/usr/local/lib/containers/<name>)
  • args defines the additional arguments to pass to the entrypoint
  • mounts defines the volumes to be mounted into the container root

container.mounts

The section mounts uses the standard OCI spec:

- source: /var/log/audit
  destination: /var/log/audit
  type: bind
  options:
    - rshared
    - bind
    - ro

All requested directories will be mounted into the extension service container mount namespace. If the source directory doesn’t exist in the host filesystem, it will be created (but only for writable paths in the Talos root filesystem).

container.security

The security section follows this example:

maskedPaths:
  - "/should/be/masked"
readonlyPaths:
  - "/path/that/should/be/readonly"
  - "/another/readonly/path"
writeableRootfs: true
writeableSysfs: true

  • The rootfs is read-only by default unless writeableRootfs: true is set.
  • The sysfs is read-only by default unless writeableSysfs: true is set.
  • Masked paths, if not set, default to the containerd defaults. Masked paths will be mounted to /dev/null. To set empty masked paths use:

container:
  security:
    maskedPaths: []

  • Read-only paths, if not set, default to the containerd defaults. Read-only paths will be mounted read-only. To set empty read-only paths use:

container:
  security:
    readonlyPaths: []

depends

The depends section describes extension service start dependencies: the service will not be started until all dependencies are met.

Available dependencies:

  • service: <name>: wait for the service <name> to be running and healthy
  • path: <path>: wait for the <path> to exist
  • network: [addresses, connectivity, hostname, etcfiles]: wait for the specified network readiness checks to succeed
  • time: true: wait for the NTP time sync

restart

The restart field defines the service restart policy; it allows configuring either an always-running service or a one-shot service:

  • always: restart service always
  • never: start service only once and never restart
  • untilSuccess: restart failing service, stop restarting on successful run

Example

An example layout of the Talos root filesystem contents for an extension service:

/
└── usr
    └── local
        ├── etc
        │   └── containers
        │       └── hello-world.yaml
        └── lib
            └── containers
                └── hello-world
                    ├── hello
                    └── config.ini

Talos discovers the extension service configuration in /usr/local/etc/containers/hello-world.yaml:

name: hello-world
container:
  entrypoint: ./hello
  args:
    - --config
    - config.ini
depends:
  - network:
    - addresses
restart: always

Talos starts the container for the extension service with container root filesystem at /usr/local/lib/containers/hello-world:

/
├── hello
└── config.ini

The extension service is registered as ext-hello-world in talosctl services:

$ talosctl service ext-hello-world
NODE     172.20.0.5
ID       ext-hello-world
STATE    Running
HEALTH   ?
EVENTS   [Running]: Started task ext-hello-world (PID 1100) for container ext-hello-world (2m47s ago)
         [Preparing]: Creating service runner (2m47s ago)
         [Preparing]: Running pre state (2m47s ago)
         [Waiting]: Waiting for service "containerd" to be "up" (2m48s ago)
         [Waiting]: Waiting for service "containerd" to be "up", network (2m49s ago)

An extension service can be started, restarted and stopped using talosctl service ext-hello-world start|restart|stop. Use talosctl logs ext-hello-world to get the logs of the service.

A complete example of an extension service can be found in the extensions repository.

8 - Migrating from Kubeadm

Migrating Kubeadm-based clusters to Talos.

It is possible to migrate a cluster that was created with kubeadm to Talos.

High-level steps are the following:

  1. Collect CA certificates and a bootstrap token from a control plane node.
  2. Create a Talos machine config with the CA certificates you collected.
  3. Update control plane endpoint in the machine config to point to the existing control plane (i.e. your load balancer address).
  4. Boot a new Talos machine and apply the machine config.
  5. Verify that the new control plane node is ready.
  6. Remove one of the old control plane nodes.
  7. Repeat the same steps for all control plane nodes.
  8. Verify that all control plane nodes are ready.
  9. Repeat the same steps for all worker nodes, using the machine config generated for the workers.

Remarks on kube-apiserver load balancer

While migrating to Talos, you need to make sure that your kube-apiserver load balancer is in place and keeps pointing to the correct set of control plane nodes.

This process depends on your load balancer setup.

If you are using an LB that is external to the control plane nodes (e.g. cloud provider LB, F5 BIG-IP, etc.), you need to make sure that you update the backend IPs of the load balancer to point to the control plane nodes as you add Talos nodes and remove kubeadm-based ones.

If your load balancing is done on the control plane nodes (e.g. keepalived + haproxy on the control plane nodes), you can do the following:

  1. Add Talos nodes and remove kubeadm-based ones while updating the haproxy backends to point to the newly added nodes except the last kubeadm-based control plane node.
  2. Turn off keepalived to drop the virtual IP used by the kubeadm-based nodes (introduces kube-apiserver downtime).
  3. Set up a virtual-IP based new load balancer on the new set of Talos control plane nodes. Use the previous LB IP as the LB virtual IP.
  4. Verify apiserver connectivity over the Talos-managed virtual IP.
  5. Migrate the last control-plane node.

Prerequisites

  • Admin access to the kubeadm-based cluster
  • Access to the /etc/kubernetes/pki directory (e.g. SSH & root permissions) on the control plane nodes of the kubeadm-based cluster
  • Access to kube-apiserver load-balancer configuration

Step-by-step guide

  1. Download /etc/kubernetes/pki directory from a control plane node of the kubeadm-based cluster.

  2. Create a new join token for the new control plane nodes:

    # inside a control plane node
    kubeadm token create
    
  3. Create Talos secrets from the PKI directory you downloaded in step 1 and the token you generated in step 2:

    talosctl gen secrets --kubernetes-bootstrap-token <TOKEN> --from-kubernetes-pki <PKI_DIR>
    
  4. Create a new Talos config from the secrets:

    talosctl gen config --with-secrets secrets.yaml <CLUSTER_NAME> https://<EXISTING_CLUSTER_LB_IP>
    
  5. Collect the information about the kubeadm-based cluster from the kubeadm configmap:

    kubectl get configmap -n kube-system kubeadm-config -oyaml
    

    Take note of the following information in the ClusterConfiguration:

    • .controlPlaneEndpoint
    • .networking.dnsDomain
    • .networking.podSubnet
    • .networking.serviceSubnet
  6. Replace the following information in the generated controlplane.yaml (a sketch of the result is shown after this step list):

    • .cluster.network.cni.name with none
    • .cluster.network.podSubnets[0] with the value of the networking.podSubnet from the previous step
    • .cluster.network.serviceSubnets[0] with the value of the networking.serviceSubnet from the previous step
    • .cluster.network.dnsDomain with the value of the networking.dnsDomain from the previous step
  7. Go through the rest of controlplane.yaml and worker.yaml to customize them according to your needs.

  8. Bring up a Talos node to be the initial Talos control plane node.

  9. Apply the generated controlplane.yaml to the Talos control plane node:

    talosctl --nodes <TALOS_NODE_IP> apply-config --insecure --file controlplane.yaml
    
  10. Wait until the new control plane node joins the cluster and is ready.

    kubectl get node -owide --watch
    
  11. Update your load balancer to point to the new control plane node.

  12. Drain the old control plane node you are replacing:

    kubectl drain <OLD_NODE> --delete-emptydir-data --force --ignore-daemonsets --timeout=10m
    
  13. Remove the old control plane node from the cluster:

    kubectl delete node <OLD_NODE>
    
  14. Destroy the old node:

    # inside the node
    sudo kubeadm reset --force
    
  15. Repeat the same steps, starting from step 7, for all control plane nodes.

  16. Repeat the same steps, starting from step 7, for all worker nodes while applying the worker.yaml instead and skipping the LB step:

    talosctl --nodes <TALOS_NODE_IP> apply-config --insecure --file worker.yaml
    
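As a reference for step 6, here is a hypothetical sketch of the resulting section of controlplane.yaml; the subnet and domain values shown are common kubeadm defaults and must be replaced with the values collected in step 5:

cluster:
  network:
    cni:
      name: none
    dnsDomain: cluster.local
    podSubnets:
      - 10.244.0.0/16
    serviceSubnets:
      - 10.96.0.0/12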

9 - Proprietary Kernel Modules

Adding a proprietary kernel module to Talos Linux.

  1. Patching and building the kernel image

    1. Clone the pkgs repository from Github and check out the revision corresponding to your version of Talos Linux

      git clone https://github.com/talos-systems/pkgs pkgs && cd pkgs
      git checkout v0.8.0
      
    2. Clone the Linux kernel and check out the revision that pkgs uses (this can be found in kernel/kernel-prepare/pkg.yaml and it will be something like the following: https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-x.xx.x.tar.xz)

      git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git && cd linux
      git checkout v5.15
      
    3. Your module will need to be converted to an in-tree module. The steps for this differ depending on the complexity of the module being ported, but generally they involve moving the module source code into the drivers tree and creating a new Makefile and Kconfig.

    4. Stage your changes in Git with git add -A.

    5. Run git diff --cached --no-prefix > foobar.patch to generate a patch from your changes.

    6. Copy this patch to kernel/kernel/patches in the pkgs repo.

    7. Add a patch line in the prepare segment of kernel/kernel/pkg.yaml:

      patch -p0 < /pkg/patches/foobar.patch
      
    8. Build the kernel image. Make sure you are logged in to ghcr.io before running this command, and you can change or omit PLATFORM depending on what you want to target.

      make kernel PLATFORM=linux/amd64 USERNAME=your-username PUSH=true
      
    9. Make a note of the image name the make command outputs.

  2. Building the installer image

    1. Copy the following into a new Dockerfile:

      FROM scratch AS customization
      COPY --from=ghcr.io/your-username/kernel:<kernel version> /lib/modules /lib/modules
      
      FROM ghcr.io/siderolabs/installer:<talos version>
      COPY --from=ghcr.io/your-username/kernel:<kernel version> /boot/vmlinuz /usr/install/${TARGETARCH}/vmlinuz
      
    2. Run to build and push the installer:

      INSTALLER_VERSION=<talos version>
      IMAGE_NAME="ghcr.io/your-username/talos-installer:$INSTALLER_VERSION"
      DOCKER_BUILDKIT=0 docker build --build-arg RM="/lib/modules" -t "$IMAGE_NAME" . && docker push "$IMAGE_NAME"
      
  3. Deploying to your cluster

    talosctl upgrade --image ghcr.io/your-username/talos-installer:<talos version> --preserve=true
    

10 - Static Pods

Using Talos Linux to set up static pods in Kubernetes.

Static Pods

Static pods are run directly by the kubelet, bypassing the Kubernetes API server checks and validations. Most of the time a DaemonSet is a better alternative to static pods, but some workloads need to run before the Kubernetes API server is available or might need to bypass security restrictions imposed by the API server.

See Kubernetes documentation for more information on static pods.

Configuration

Static pod definitions are specified in the Talos machine configuration:

machine:
  pods:
    - apiVersion: v1
      kind: Pod
      metadata:
        name: nginx
      spec:
        containers:
          - name: nginx
            image: nginx

Talos renders static pod definitions to the kubelet manifest directory (/etc/kubernetes/manifests); the kubelet picks up the definition and launches the pod.

Talos accepts changes to the static pod configuration without a reboot.
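
To confirm that a definition has been rendered, the kubelet manifest directory can be listed directly:

talosctl -n <IP> ls /etc/kubernetes/manifests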

Usage

The kubelet mirrors pod definitions to the API server state, so static pods can be inspected with kubectl get pods, logs can be retrieved with kubectl logs, etc.

$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
nginx-talos-default-master-2   1/1     Running   0          17s

If the API server is not available, the status of the static pod can also be inspected with talosctl containers --kubernetes:

$ talosctl containers --kubernetes
NODE         NAMESPACE   ID                                                                                      IMAGE                                                         PID    STATUS
172.20.0.3   k8s.io      default/nginx-talos-default-master-2                                                    k8s.gcr.io/pause:3.6                                          4886   SANDBOX_READY
172.20.0.3   k8s.io      └─ default/nginx-talos-default-master-2:nginx                                           docker.io/library/nginx:latest
...

Logs of static pods can be retrieved with talosctl logs --kubernetes:

$ talosctl logs --kubernetes default/nginx-talos-default-master-2:nginx
172.20.0.3: 2022-02-10T15:26:01.289208227Z stderr F 2022/02/10 15:26:01 [notice] 1#1: using the "epoll" event method
172.20.0.3: 2022-02-10T15:26:01.2892466Z stderr F 2022/02/10 15:26:01 [notice] 1#1: nginx/1.21.6
172.20.0.3: 2022-02-10T15:26:01.28925723Z stderr F 2022/02/10 15:26:01 [notice] 1#1: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)

Troubleshooting

Talos doesn’t perform any validation on the static pod definitions. If the pod isn’t running, use kubelet logs (talosctl logs kubelet) to find the problem:

$ talosctl logs kubelet
172.20.0.2: {"ts":1644505520281.427,"caller":"config/file.go:187","msg":"Could not process manifest file","path":"/etc/kubernetes/manifests/talos-default-nginx-gvisor.yaml","err":"invalid pod: [spec.containers: Required value]"}

Resource Definitions

Static pod definitions are available as StaticPod resources combined with Talos-generated control plane static pods:

$ talosctl get staticpods
NODE         NAMESPACE   TYPE        ID                        VERSION
172.20.0.3   k8s         StaticPod   default-nginx             1
172.20.0.3   k8s         StaticPod   kube-apiserver            1
172.20.0.3   k8s         StaticPod   kube-controller-manager   1
172.20.0.3   k8s         StaticPod   kube-scheduler            1

Talos assigns ID <namespace>-<name> to the static pods specified in the machine configuration.

On control plane nodes, the status of the running static pods is available in the StaticPodStatus resource:

$ talosctl get staticpodstatus
NODE         NAMESPACE   TYPE              ID                                                           VERSION   READY
172.20.0.3   k8s         StaticPodStatus   default/nginx-talos-default-master-2                         2         True
172.20.0.3   k8s         StaticPodStatus   kube-system/kube-apiserver-talos-default-master-2            2         True
172.20.0.3   k8s         StaticPodStatus   kube-system/kube-controller-manager-talos-default-master-2   3         True
172.20.0.3   k8s         StaticPodStatus   kube-system/kube-scheduler-talos-default-master-2            3         True

11 - Talos API access from Kubernetes

How to access Talos API from within Kubernetes.

In this guide, we will enable the Talos feature to access the Talos API from within Kubernetes.

Enabling the Feature

Edit the machine configuration to enable the feature, specifying the Kubernetes namespaces from which Talos API can be accessed and the allowed Talos API roles.

talosctl -n 172.20.0.2 edit machineconfig

Configure kubernetesTalosAPIAccess as follows:

spec:
  machine:
    features:
      kubernetesTalosAPIAccess:
        enabled: true
        allowedRoles:
          - os:reader
        allowedKubernetesNamespaces:
          - default

Injecting Talos ServiceAccount into manifests

Create the following manifest file deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: talos-api-access
spec:
  selector:
    matchLabels:
      app: talos-api-access
  template:
    metadata:
      labels:
        app: talos-api-access
    spec:
      containers:
        - name: talos-api-access
          image: alpine:3
          command:
            - sh
            - -c
            - |
              wget -O /usr/local/bin/talosctl https://github.com/siderolabs/talos/releases/download/<talos version>/talosctl-linux-amd64
              chmod +x /usr/local/bin/talosctl
              while true; do talosctl -n 172.20.0.2 version; sleep 1; done              

Note: make sure that you replace the IP 172.20.0.2 with a valid Talos node IP.

Use talosctl inject serviceaccount command to inject the Talos ServiceAccount into the manifest.

talosctl inject serviceaccount -f deployment.yaml > deployment-injected.yaml

Inspect the generated manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  name: talos-api-access
spec:
  selector:
    matchLabels:
      app: talos-api-access
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: talos-api-access
    spec:
      containers:
      - command:
        - sh
        - -c
        - |
          wget -O /usr/local/bin/talosctl https://github.com/siderolabs/talos/releases/download/<talos version>/talosctl-linux-amd64
          chmod +x /usr/local/bin/talosctl
          while true; do talosctl -n 172.20.0.2 version; sleep 1; done          
        image: alpine:3
        name: talos-api-access
        resources: {}
        volumeMounts:
        - mountPath: /var/run/secrets/talos.dev
          name: talos-secrets
      tolerations:
      - operator: Exists
      volumes:
      - name: talos-secrets
        secret:
          secretName: talos-api-access-talos-secrets
status: {}
---
apiVersion: talos.dev/v1alpha1
kind: ServiceAccount
metadata:
    name: talos-api-access-talos-secrets
spec:
    roles:
        - os:reader
---

As you can see, your deployment manifest is now injected with the Talos ServiceAccount.

Testing API Access

Apply the new manifest into the default namespace:

kubectl apply -n default -f deployment-injected.yaml

Follow the logs of the pods belonging to the deployment:

kubectl logs -n default -f -l app=talos-api-access

You’ll see a repeating output similar to the following:

Client:
    Tag:         <talos version>
    SHA:         ....
    Built:
    Go version:  go1.18.4
    OS/Arch:     linux/amd64
Server:
    NODE:        172.20.0.2
    Tag:         <talos version>
    SHA:         ...
    Built:
    Go version:  go1.18.4
    OS/Arch:     linux/amd64
    Enabled:     RBAC

This means that the pod can successfully talk to the Talos API of node 172.20.0.2.

12 - Troubleshooting Control Plane

Troubleshoot control plane failures for running cluster and bootstrap process.

This guide is written as a series of topics with detailed answers for each topic. It starts with the basics of the control plane and goes into Talos specifics.

In this guide we assume that the Talos client config is available and that Talos API access works. The Kubernetes client configuration can be pulled from control plane nodes with talosctl -n <IP> kubeconfig (this command works before Kubernetes is fully booted).

What is a control plane node?

A control plane node is a node which:

  • runs etcd, the Kubernetes database
  • runs the Kubernetes control plane
    • kube-apiserver
    • kube-controller-manager
    • kube-scheduler
  • serves as an administrative proxy to the worker nodes

These nodes are critical to the operation of your cluster. Without control plane nodes, Kubernetes will not respond to changes in the system, and certain central services may not be available.

Talos nodes which have .machine.type of controlplane are control plane nodes.

Control plane nodes are tainted by default to prevent workloads from being scheduled to control plane nodes.

How many control plane nodes should be deployed?

Because control plane nodes are so important, it is important that they be deployed with redundancy to ensure consistent, reliable operation of the cluster during upgrades, reboots, hardware failures, and other such events. This is also known as high-availability or just HA. Non-HA clusters are sometimes used as test clusters, CI clusters, or in specific scenarios which warrant the loss of redundancy, but they should almost never be used in production.

Maintaining the proper count of control plane nodes is also critical. The etcd database operates on the principles of membership and quorum, so membership should always be an odd number, and there is exponentially-increasing overhead for each additional member. Therefore, the number of control plane nodes should almost always be 3. In some particularly large or distributed clusters, the count may be 5, but this is very rare.

See this document on the topic for more information.

What is the control plane endpoint?

The Kubernetes control plane endpoint is the single canonical URL by which the Kubernetes API is accessed. Especially with high-availability (HA) control planes, this endpoint may not point to the Kubernetes API server directly, but may instead point to a load balancer or a DNS name which may have multiple A and AAAA records.

Like Talos’ own API, the Kubernetes API is constructed with mutual TLS, client certs, and a common Certificate Authority (CA). Unlike general-purpose websites, there is no need for an upstream CA, so tools such as cert-manager, services such as Let’s Encrypt, or purchased products such as validated TLS certificates are not required. Encryption, however, is required, and hence the URL scheme will always be https://.

By default, the Kubernetes API server in Talos runs on port 6443. As such, the control plane endpoint URLs for Talos will almost always be of the form https://endpoint:6443, noting that the port, since it is not the https default of 443, is required. The endpoint above may be a DNS name or IP address, but it should ultimately be directed to the set of all controlplane nodes, as opposed to a single one.

As mentioned above, this can be achieved by a number of strategies, including:

  • an external load balancer
  • DNS records
  • Talos-builtin shared IP (VIP)
  • BGP peering of a shared IP (such as with kube-vip)

Using a DNS name here is usually a good idea: it is the most flexible option, since it can be combined with any of the other strategies while offering a layer of abstraction. It allows the underlying IP addresses to change over time without impacting the canonical URL.

Unlike most services in Kubernetes, the API server runs with host networking, meaning that it shares the network namespace with the host. This means you can use the IP address(es) of the host to refer to the Kubernetes API server.

For availability of the API, it is important that any load balancer be aware of the health of the backend API servers. This makes a load balancer-based system valuable to minimize disruptions during common node lifecycle operations like reboots and upgrades.

It is critical that the control plane endpoint works correctly during the cluster bootstrap phase, as nodes discover each other using the control plane endpoint.
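
As a quick reachability check, the /version path (served without authentication on clusters with default RBAC) can be queried; -k skips verification of the cluster-internal CA:

curl -k https://<endpoint>:6443/version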

kubelet is not running on control plane node

The kubelet service should be running on control plane nodes as soon as networking is configured:

$ talosctl -n <IP> service kubelet
NODE     172.20.0.2
ID       kubelet
STATE    Running
HEALTH   OK
EVENTS   [Running]: Health check successful (2m54s ago)
         [Running]: Health check failed: Get "http://127.0.0.1:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused (3m4s ago)
         [Running]: Started task kubelet (PID 2334) for container kubelet (3m6s ago)
         [Preparing]: Creating service runner (3m6s ago)
         [Preparing]: Running pre state (3m15s ago)
         [Waiting]: Waiting for service "timed" to be "up" (3m15s ago)
         [Waiting]: Waiting for service "cri" to be "up", service "timed" to be "up" (3m16s ago)
         [Waiting]: Waiting for service "cri" to be "up", service "networkd" to be "up", service "timed" to be "up" (3m18s ago)

If the kubelet is not running, it may be due to invalid configuration. Check kubelet logs with the talosctl logs command:

$ talosctl -n <IP> logs kubelet
172.20.0.2: I0305 20:45:07.756948    2334 controller.go:101] kubelet config controller: starting controller
172.20.0.2: I0305 20:45:07.756995    2334 controller.go:267] kubelet config controller: ensuring filesystem is set up correctly
172.20.0.2: I0305 20:45:07.757000    2334 fsstore.go:59] kubelet config controller: initializing config checkpoints directory "/etc/kubernetes/kubelet/store"

etcd is not running

By far the most likely cause of etcd not running is that the cluster has not yet been bootstrapped or that bootstrapping is currently in progress. The talosctl bootstrap command must be run manually and only once per cluster, and this step is commonly missed. Once a node is bootstrapped, it will start etcd and, over the course of a minute or two (depending on the download speed of the control plane nodes), the other control plane nodes should discover it and join themselves to the cluster.

Also, etcd will only run on control plane nodes. If a node is designated as a worker node, you should not expect etcd to be running on it.

When a node boots for the first time, the etcd data directory (/var/lib/etcd) is empty; it is only populated once etcd is launched.
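
Whether the data directory has been populated can be checked with talosctl ls:

# an empty listing means etcd has not been launched on this node yet
talosctl -n <IP> ls /var/lib/etcd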

If etcd is not running, check the etcd service state:

$ talosctl -n <IP> service etcd
NODE     172.20.0.2
ID       etcd
STATE    Running
HEALTH   OK
EVENTS   [Running]: Health check successful (3m21s ago)
         [Running]: Started task etcd (PID 2343) for container etcd (3m26s ago)
         [Preparing]: Creating service runner (3m26s ago)
         [Preparing]: Running pre state (3m26s ago)
         [Waiting]: Waiting for service "cri" to be "up", service "networkd" to be "up", service "timed" to be "up" (3m26s ago)

If the service is stuck in the Preparing state on the bootstrap node, it might be related to a slow network: at this stage, Talos pulls the etcd image from the container registry.

If the etcd service is crashing and restarting, check its logs with talosctl -n <IP> logs etcd. The most common reasons for crashes are:

  • wrong arguments passed via extraArgs in the configuration;
  • booting Talos on a non-empty disk with a previous Talos installation, so that /var/lib/etcd contains data from the old cluster (see the reset sketch below).
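
In the latter case, one option is to wipe the node and let it rejoin cleanly. A sketch using talosctl reset; note this is destructive and wipes the node’s state and ephemeral partitions:

# wipe the node's partitions and reboot it; it will need to be reconfigured
talosctl -n <IP> reset --graceful=false --reboot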

etcd is not running on non-bootstrap control plane node

The etcd service on control plane nodes which were not the target of the cluster bootstrap will wait until the bootstrapped control plane node has completed bootstrapping. The bootstrap and discovery processes may take a few minutes to complete. As soon as the bootstrapped node starts its Kubernetes control plane components, kubectl get endpoints will return the IP of the bootstrapped control plane node. At this point, the other control plane nodes will start their etcd services, join the cluster, and then start their own Kubernetes control plane components.

Kubernetes static pod definitions are not generated

Talos should write the static pod definitions for the Kubernetes control plane in /etc/kubernetes/manifests:

$ talosctl -n <IP> ls /etc/kubernetes/manifests
NODE         NAME
172.20.0.2   .
172.20.0.2   talos-kube-apiserver.yaml
172.20.0.2   talos-kube-controller-manager.yaml
172.20.0.2   talos-kube-scheduler.yaml

If the static pod definitions are not rendered, check etcd and kubelet service health (see above) and the controller runtime logs (talosctl logs controller-runtime).

Talos prints error an error on the server ("") has prevented the request from succeeding

This is expected during initial cluster bootstrap and sometimes after a reboot:

[   70.093289] [talos] task labelNodeAsMaster (1/1): starting
[   80.094038] [talos] retrying error: an error on the server ("") has prevented the request from succeeding (get nodes talos-default-master-1)

Initially, the kube-apiserver component is not running yet; it takes some time for it to come fully up during bootstrap (its image has to be pulled from the Internet, etc.). Once the control plane endpoint is up, Talos should continue with its boot process.

If Talos doesn’t proceed, it may be due to a configuration issue.

In any case, the status of the control plane components on each control plane node can be checked with talosctl containers -k:

$ talosctl -n <IP> containers --kubernetes
NODE         NAMESPACE   ID                                                                                      IMAGE                                        PID    STATUS
172.20.0.2   k8s.io      kube-system/kube-apiserver-talos-default-master-1                                       k8s.gcr.io/pause:3.2                         2539   SANDBOX_READY
172.20.0.2   k8s.io      └─ kube-system/kube-apiserver-talos-default-master-1:kube-apiserver                     k8s.gcr.io/kube-apiserver:v1.24.3            2572   CONTAINER_RUNNING

If kube-apiserver shows as CONTAINER_EXITED, it might have exited due to a configuration error. Logs can be checked with talosctl logs --kubernetes (or with -k as a shorthand):

$ talosctl -n <IP> logs -k kube-system/kube-apiserver-talos-default-master-1:kube-apiserver
172.20.0.2: 2021-03-05T20:46:13.133902064Z stderr F 2021/03/05 20:46:13 Running command:
172.20.0.2: 2021-03-05T20:46:13.133933824Z stderr F Command env: (log-file=, also-stdout=false, redirect-stderr=true)
172.20.0.2: 2021-03-05T20:46:13.133938524Z stderr F Run from directory:
172.20.0.2: 2021-03-05T20:46:13.13394154Z stderr F Executable path: /usr/local/bin/kube-apiserver
...

Talos prints error nodes "talos-default-master-1" not found

This error means that kube-apiserver is up and the control plane endpoint is healthy, but the kubelet hasn’t received its client certificate yet, so it has not been able to register itself with Kubernetes. The Kubernetes controller manager (kube-controller-manager) is responsible for monitoring the certificate signing requests (CSRs) and issuing certificates for each of them. The kubelet is responsible for generating and submitting the CSRs for its associated node.

For the kubelet to get its client certificate, then, the Kubernetes control plane must be healthy:

  • the API server is running and available at the Kubernetes control plane endpoint URL
  • the controller manager is running and a leader has been elected (a way to check this is sketched below)
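
In recent Kubernetes versions, controller manager leader election is recorded in a Lease object, so the second condition can be checked with kubectl; the lease name and namespace below follow the upstream Kubernetes defaults:

# the HOLDER column shows the current kube-controller-manager leader
kubectl -n kube-system get lease kube-controller-manager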

The states of any CSRs can be checked with kubectl get csr:

$ kubectl get csr
NAME        AGE   SIGNERNAME                                    REQUESTOR                 CONDITION
csr-jcn9j   14m   kubernetes.io/kube-apiserver-client-kubelet   system:bootstrap:q9pyzr   Approved,Issued
csr-p6b9q   14m   kubernetes.io/kube-apiserver-client-kubelet   system:bootstrap:q9pyzr   Approved,Issued
csr-sw6rm   14m   kubernetes.io/kube-apiserver-client-kubelet   system:bootstrap:q9pyzr   Approved,Issued
csr-vlghg   14m   kubernetes.io/kube-apiserver-client-kubelet   system:bootstrap:q9pyzr   Approved,Issued
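
Talos normally approves kubelet client CSRs automatically via its bootstrap manifests; if a CSR is stuck in the Pending condition, it can be approved manually (a sketch, with csr-jcn9j standing in for the stuck CSR):

# manually approve a pending kubelet client certificate request
kubectl certificate approve csr-jcn9j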

Talos prints error node not ready

A Node in Kubernetes is marked as Ready only once its CNI is up. It takes a minute or two for the CNI images to be pulled and for the CNI to start. If the node is stuck in this state for too long, check CNI pods and logs with kubectl. Usually, CNI-related resources are created in kube-system namespace.

For example, for Talos default Flannel CNI:

$ kubectl -n kube-system get pods
NAME                                             READY   STATUS    RESTARTS   AGE
...
kube-flannel-25drx                               1/1     Running   0          23m
kube-flannel-8lmb6                               1/1     Running   0          23m
kube-flannel-gl7nx                               1/1     Running   0          23m
kube-flannel-jknt9                               1/1     Running   0          23m
...
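
The logs of a specific CNI pod can then be inspected with kubectl, using a pod name from the listing above:

# inspect the logs of one flannel pod from the listing above
kubectl -n kube-system logs kube-flannel-25drx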

Talos prints error x509: certificate signed by unknown authority

The full error might look like:

x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

Usually, this occurs because the control plane endpoint points to a different cluster than the client certificate was generated for. If a node was recycled between clusters, make sure it was properly wiped between uses. If a client has multiple client configurations, make sure you are matching the correct talosconfig with the correct cluster.
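
A quick way to see which client configuration is active is via the talosctl config subcommands (a sketch; <context-name> is a placeholder for one of your contexts):

# list the known talosctl contexts and see which one is current
talosctl config contexts
# switch to the context matching the cluster being debugged
talosctl config context <context-name>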

etcd is running on bootstrap node, but stuck in pre state on non-bootstrap nodes

Please see the question etcd is not running on non-bootstrap control plane node above.

Checking kube-controller-manager and kube-scheduler

If the control plane endpoint is up, the status of the pods can be ascertained with kubectl:

$ kubectl get pods -n kube-system -l k8s-app=kube-controller-manager
NAME                                             READY   STATUS    RESTARTS   AGE
kube-controller-manager-talos-default-master-1   1/1     Running   0          28m
kube-controller-manager-talos-default-master-2   1/1     Running   0          28m
kube-controller-manager-talos-default-master-3   1/1     Running   0          28m
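
The same check works for the scheduler, assuming the matching k8s-app label on its static pods:

kubectl get pods -n kube-system -l k8s-app=kube-scheduler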

If the control plane endpoint is not yet up, the container status of the control plane components can be queried with talosctl containers --kubernetes:

$ talosctl -n <IP> c -k
NODE         NAMESPACE   ID                                                                                      IMAGE                                        PID    STATUS
...
172.20.0.2   k8s.io      kube-system/kube-controller-manager-talos-default-master-1                              k8s.gcr.io/pause:3.2                         2547   SANDBOX_READY
172.20.0.2   k8s.io      └─ kube-system/kube-controller-manager-talos-default-master-1:kube-controller-manager   k8s.gcr.io/kube-controller-manager:v1.24.3   2580   CONTAINER_RUNNING
172.20.0.2   k8s.io      kube-system/kube-scheduler-talos-default-master-1                                       k8s.gcr.io/pause:3.2                         2638   SANDBOX_READY
172.20.0.2   k8s.io      └─ kube-system/kube-scheduler-talos-default-master-1:kube-scheduler                     k8s.gcr.io/kube-scheduler:v1.24.3            2670   CONTAINER_RUNNING
...

If some of the containers are not running, it could be that the image is still being pulled. Otherwise, the process might be crashing. The logs can be checked with talosctl logs --kubernetes <containerID>:

$ talosctl -n <IP> logs -k kube-system/kube-controller-manager-talos-default-master-1:kube-controller-manager
172.20.0.3: 2021-03-09T13:59:34.291667526Z stderr F 2021/03/09 13:59:34 Running command:
172.20.0.3: 2021-03-09T13:59:34.291702262Z stderr F Command env: (log-file=, also-stdout=false, redirect-stderr=true)
172.20.0.3: 2021-03-09T13:59:34.291707121Z stderr F Run from directory:
172.20.0.3: 2021-03-09T13:59:34.291710908Z stderr F Executable path: /usr/local/bin/kube-controller-manager
172.20.0.3: 2021-03-09T13:59:34.291719163Z stderr F Args (comma-delimited): /usr/local/bin/kube-controller-manager,--allocate-node-cidrs=true,--cloud-provider=,--cluster-cidr=10.244.0.0/16,--service-cluster-ip-range=10.96.0.0/12,--cluster-signing-cert-file=/system/secrets/kubernetes/kube-controller-manager/ca.crt,--cluster-signing-key-file=/system/secrets/kubernetes/kube-controller-manager/ca.key,--configure-cloud-routes=false,--kubeconfig=/system/secrets/kubernetes/kube-controller-manager/kubeconfig,--leader-elect=true,--root-ca-file=/system/secrets/kubernetes/kube-controller-manager/ca.crt,--service-account-private-key-file=/system/secrets/kubernetes/kube-controller-manager/service-account.key,--profiling=false
172.20.0.3: 2021-03-09T13:59:34.293870359Z stderr F 2021/03/09 13:59:34 Now listening for interrupts
172.20.0.3: 2021-03-09T13:59:34.761113762Z stdout F I0309 13:59:34.760982      10 serving.go:331] Generated self-signed cert in-memory
...

Checking controller runtime logs

Talos runs a set of controllers which operate on resources to build and support the Kubernetes control plane.

Some debugging information can be queried from the controller logs with talosctl logs controller-runtime:

$ talosctl -n <IP> logs controller-runtime
172.20.0.2: 2021/03/09 13:57:11  secrets.EtcdController: controller starting
172.20.0.2: 2021/03/09 13:57:11  config.MachineTypeController: controller starting
172.20.0.2: 2021/03/09 13:57:11  k8s.ManifestApplyController: controller starting
172.20.0.2: 2021/03/09 13:57:11  v1alpha1.BootstrapStatusController: controller starting
172.20.0.2: 2021/03/09 13:57:11  v1alpha1.TimeStatusController: controller starting
...

Controllers continuously run a reconcile loop, so at any time, they may be starting, failing, or restarting. This is expected behavior.

Things to look for:

v1alpha1.BootstrapStatusController: bootkube initialized status not found: control plane is not self-hosted, running with static pods.

k8s.KubeletStaticPodController: writing static pod "/etc/kubernetes/manifests/talos-kube-apiserver.yaml": static pod definitions were rendered successfully.

k8s.ManifestApplyController: controller failed: error creating mapping for object /v1/Secret/bootstrap-token-q9pyzr: an error on the server ("") has prevented the request from succeeding: control plane endpoint is not up yet, bootstrap manifests can’t be injected, controller is going to retry.

k8s.KubeletStaticPodController: controller failed: error refreshing pod status: error fetching pod status: an error on the server ("Authorization error (user=apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)") has prevented the request from succeeding: kubelet hasn’t been able to contact kube-apiserver yet to push pod status, controller is going to retry.

k8s.ManifestApplyController: created rbac.authorization.k8s.io/v1/ClusterRole/psp:privileged: one of the bootstrap manifests got successfully applied.

secrets.KubernetesController: controller failed: missing cluster.aggregatorCA secret: Talos is running with 0.8 configuration, if the cluster was upgraded from 0.8, this is expected, and conversion process will fix machine config automatically. If this cluster was bootstrapped with version 0.9, machine configuration should be regenerated with 0.9 talosctl.

If there are no new messages in the controller-runtime log, it means that the controllers have successfully finished reconciling, and that the current system state is the desired system state.

Checking static pod definitions

Talos generates static pod definitions for the kube-apiserver, kube-controller-manager, and kube-scheduler components based on its machine configuration. These definitions can be checked as resources with talosctl get staticpods:

$ talosctl -n <IP> get staticpods -o yaml
node: 172.20.0.2
metadata:
    namespace: controlplane
    type: StaticPods.kubernetes.talos.dev
    id: kube-apiserver
    version: 2
    phase: running
    finalizers:
        - k8s.StaticPodStatus("kube-apiserver")
spec:
    apiVersion: v1
    kind: Pod
    metadata:
        annotations:
            talos.dev/config-version: "1"
            talos.dev/secrets-version: "1"
        creationTimestamp: null
        labels:
            k8s-app: kube-apiserver
            tier: control-plane
        name: kube-apiserver
        namespace: kube-system
...

The status of the static pods can be queried with talosctl get staticpodstatus:

$ talosctl -n <IP> get staticpodstatus
NODE         NAMESPACE      TYPE              ID                                                           VERSION   READY
172.20.0.2   controlplane   StaticPodStatus   kube-system/kube-apiserver-talos-default-master-1            1         True
172.20.0.2   controlplane   StaticPodStatus   kube-system/kube-controller-manager-talos-default-master-1   1         True
172.20.0.2   controlplane   StaticPodStatus   kube-system/kube-scheduler-talos-default-master-1            1         True

The most important status field is READY, which is the last column printed. The complete status can be fetched by adding the -o yaml flag.

Checking bootstrap manifests

As part of the bootstrap process, Talos injects bootstrap manifests into the Kubernetes API server. There are two kinds of these manifests: system manifests built into Talos, and extra manifests downloaded from URLs (a custom CNI, extra manifests listed in the machine config):

$ talosctl -n <IP> get manifests
NODE         NAMESPACE      TYPE       ID                               VERSION
172.20.0.2   controlplane   Manifest   00-kubelet-bootstrapping-token   1
172.20.0.2   controlplane   Manifest   01-csr-approver-role-binding     1
172.20.0.2   controlplane   Manifest   01-csr-node-bootstrap            1
172.20.0.2   controlplane   Manifest   01-csr-renewal-role-binding      1
172.20.0.2   controlplane   Manifest   02-kube-system-sa-role-binding   1
172.20.0.2   controlplane   Manifest   03-default-pod-security-policy   1
172.20.0.2   controlplane   Manifest   05-https://docs.projectcalico.org/manifests/calico.yaml   1
172.20.0.2   controlplane   Manifest   10-kube-proxy                    1
172.20.0.2   controlplane   Manifest   11-core-dns                      1
172.20.0.2   controlplane   Manifest   11-core-dns-svc                  1
172.20.0.2   controlplane   Manifest   11-kube-config-in-cluster        1
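
The 05-https://… entry above is an example of an extra manifest; such entries come from the cluster.extraManifests list in the machine configuration. A minimal sketch:

cluster:
  extraManifests:
    # each URL is downloaded and applied during bootstrap
    - https://docs.projectcalico.org/manifests/calico.yaml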

Details of each manifest can be queried by adding -o yaml:

$ talosctl -n <IP> get manifests 01-csr-approver-role-binding --namespace=controlplane -o yaml
node: 172.20.0.2
metadata:
    namespace: controlplane
    type: Manifests.kubernetes.talos.dev
    id: 01-csr-approver-role-binding
    version: 1
    phase: running
spec:
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: system-bootstrap-approve-node-client-csr
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: ClusterRole
        name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
      subjects:
        - apiGroup: rbac.authorization.k8s.io
          kind: Group
          name: system:bootstrappers

Worker node is stuck with apid health check failures

Control plane nodes have enough secret material to generate apid server certificates, but worker nodes depend on the control plane trustd service to issue their certificates. A worker node first waits for its kubelet to join the cluster; then apid on the worker queries the Kubernetes endpoints via the control plane endpoint to find the trustd endpoints, and finally uses trustd to request and receive its certificate.

So if apid health checks are failing on a worker node, check the following (a sketch of both checks follows the list):

  • make sure control plane endpoint is healthy
  • check that worker node kubelet joined the cluster
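
A sketch of both checks; <CP-IP> is a placeholder for a control plane node IP:

# verify the worker's kubelet has registered with Kubernetes
kubectl get nodes
# verify trustd is healthy on a control plane node
talosctl -n <CP-IP> service trustd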