1 - Advanced Networking

Static Addressing

Static addressing is comprised of specifying cidr, routes ( remember to add your default gateway ), and interface. Most likely you’ll also want to define the nameservers so you have properly functioning DNS.

machine:
  network:
    hostname: talos
    nameservers:
      - 10.0.0.1
    time:
      servers:
        - time.cloudflare.com
    interfaces:
      - interface: eth0
        cidr: 10.0.0.201/8
        mtu: 8765
        routes:
          - network: 0.0.0.0/0
            gateway: 10.0.0.1
      - interface: eth1
        ignore: true

Additional Addresses for an Interface

In some environments you may need to set additional addresses on an interface. In the following example, we set two additional addresses on the loopback interface.

machine:
  network:
    interfaces:
      - interface: lo0
        cidr: 192.168.0.21/24
      - interface: lo0
        cidr: 10.2.2.2/24

Bonding

The following example shows how to create a bonded interface.

machine:
  network:
    interfaces:
      - interface: bond0
        dhcp: true
        bond:
          mode: 802.3ad
          lacpRate: fast
          xmitHashPolicy: layer3+4
          miimon: 100
          updelay: 200
          downdelay: 200
          interfaces:
            - eth0
            - eth1

VLANs

To setup vlans on a specific device use an array of VLANs to add. The master device may be configured without addressing by setting dhcp to false.

machine:
  network:
    interfaces:
      - interface: eth0
        dhcp: false
        vlans:
          - vlanId: 100
            cidr: "192.168.2.10/28"
            routes:
              - network: 0.0.0.0/0
                gateway: 192.168.2.1

2 - Configuring Containerd

The base containerd configuration expects to merge in any additional configs present in /var/cri/conf.d/*.toml.

An example of exposing metrics

Into each machine config, add the following:

machine:
  ...
  files:
    - content: |
        [metrics]
          address = "0.0.0.0:11234"        
      path: /var/cri/conf.d/metrics.toml
      op: create

Create cluster like normal and see that metrics are now present on this port:

$ curl 127.0.0.1:11234/v1/metrics
# HELP container_blkio_io_service_bytes_recursive_bytes The blkio io service bytes recursive
# TYPE container_blkio_io_service_bytes_recursive_bytes gauge
container_blkio_io_service_bytes_recursive_bytes{container_id="0677d73196f5f4be1d408aab1c4125cf9e6c458a4bea39e590ac779709ffbe14",device="/dev/dm-0",major="253",minor="0",namespace="k8s.io",op="Async"} 0
container_blkio_io_service_bytes_recursive_bytes{container_id="0677d73196f5f4be1d408aab1c4125cf9e6c458a4bea39e590ac779709ffbe14",device="/dev/dm-0",major="253",minor="0",namespace="k8s.io",op="Discard"} 0
...
...

3 - Configuring Corporate Proxies

Appending the Certificate Authority of MITM Proxies

Put into each machine the PEM encoded certificate:

machine:
  ...
  files:
    - content: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----        
      permissions: 0644
      path: /etc/ssl/certs/ca-certificates
      op: append

Configuring a Machine to Use the Proxy

To make use of a proxy:

machine:
  env:
    http_proxy: <http proxy>
    https_proxy: <https proxy>
    no_proxy: <no proxy>

Additionally, configure the DNS nameservers, and NTP servers:

machine:
  env:
  ...
  time:
    servers:
      - <server 1>
      - <server ...>
      - <server n>
  ...
  network:
    nameservers:
      - <ip 1>
      - <ip ...>
      - <ip n>

4 - Configuring Pull Through Cache

In this guide we will create a set of local caching Docker registry proxies to minimize local cluster startup time.

When running Talos locally, pulling images from Docker registries might take a significant amount of time. We spin up local caching pass-through registries to cache images and configure a local Talos cluster to use those proxies. A similar approach might be used to run Talos in production in air-gapped environments. It can be also used to verify that all the images are available in local registries.

Video Walkthrough

To see a live demo of this writeup, see the video below:

Requirements

The follow are requirements for creating the set of caching proxies:

  • Docker 18.03 or greater
  • Local cluster requirements for either docker or fireracker.

Launch the Caching Docker Registry Proxies

Talos pulls from docker.io, k8s.gcr.io, gcr.io and quay.io by default. If your configuration is different, you might need to modify the commands below:

docker run -d -p 5000:5000 \
    -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
    --restart always \
    --name registry-docker.io registry:2

docker run -d -p 5001:5000 \
    -e REGISTRY_PROXY_REMOTEURL=https://k8s.gcr.io \
    --restart always \
    --name registry-k8s.gcr.io registry:2

docker run -d -p 5002:5000 \
    -e REGISTRY_PROXY_REMOTEURL=https://quay.io \
    --restart always \
    --name registry-quay.io registry:2.5

docker run -d -p 5003:5000 \
    -e REGISTRY_PROXY_REMOTEURL=https://gcr.io \
    --restart always \
    --name registry-gcr.io registry:2

Note: Proxies are started as docker containers, and they’re automatically configured to start with Docker daemon. Please note that quay.io proxy doesn’t support recent Docker image schema, so we run older registry image version (2.5).

As a registry container can only handle a single upstream Docker registry, we launch a container per upstream, each on its own host port (5000, 5001, 5002).

Using Caching Registries with firecracker Local Cluster

With a firecracker local cluster, a bridge interface is created on the host. As registry containers expose their ports on the host, we can use bridge IP to direct proxy requests.

sudo talosctl cluster create --provisioner firecracker \
    --registry-mirror docker.io=http://10.5.0.1:5000 \
    --registry-mirror k8s.gcr.io=http://10.5.0.1:5001 \
    --registry-mirror quay.io=http://10.5.0.1:5002 \
    --registry-mirror gcr.io=http://10.5.0.1:5003

The Talos local cluster should now start pulling via caching registries. This can be verified via registry logs, e.g. docker logs -f registry-docker.io. The first time cluster boots, images are pulled and cached, so next cluster boot should be much faster.

Note: 10.5.0.1 is a bridge IP with default network (10.5.0.0/24), if using custom --cidr, value should be adjusted accordingly.

Using Caching Registries with docker Local Cluster

With a docker local cluster we can use docker bridge IP, default value for that IP is 172.17.0.1. On Linux, the docker bridge address can be inspected with ip addr show docker0.

talosctl cluster create --provisioner docker \
    --registry-mirror docker.io=http://172.17.0.1:5000 \
    --registry-mirror k8s.gcr.io=http://172.17.0.1:5001 \
    --registry-mirror quay.io=http://172.17.0.1:5002 \
    --registry-mirror gcr.io=http://172.17.0.1:5003

Cleaning Up

To cleanup, run:

docker rm -f registry-docker.io
docker rm -f registry-k8s.gcr.io
docker rm -f registry-quay.io
docker rm -f registry-gcr.io

Note: Removing docker registry containers also removes the image cache. So if you plan to use caching registries, keep the containers running.

5 - Configuring the Cluster Endpoint

In this section, we will step through the configuration of a Talos based Kubernetes cluster. There are three major components we will configure:

  • apid and talosctl
  • the master nodes
  • the worker nodes

Talos enforces a high level of security by using mutual TLS for authentication and authorization.

We recommend that the configuration of Talos be performed by a cluster owner. A cluster owner should be a person of authority within an organization, perhaps a director, manager, or senior member of a team. They are responsible for storing the root CA, and distributing the PKI for authorized cluster administrators.

Talos runs great out of the box, but if you tweak some minor settings it will make your life a lot easier in the future. This is not a requirement, but rather a document to explain some key settings.

Endpoint

To configure the talosctl endpoint, it is recommended you use a resolvable DNS name. This way, if you decide to upgrade to a multi-controlplane cluster you only have to add the ip adres to the hostname configuration. The configuration can either be done on a Loadbalancer, or simply trough DNS.

For example:

This is in the config file for the cluster e.g. init.yaml, controlplane.yaml and join.yaml. for more details, please see: v1alpha1 endpoint configuration

.....
cluster:
  controlPlane:
    endpoint: https://endpoint.example.local:6443
.....

If you have a DNS name as the endpoint, you can upgrade your talos cluster with multiple controlplanes in the future (if you don’t have a multi-controlplane setup from the start) Using a DNS name generates the corresponding Certificates (Kubernetes and Talos) for the correct hostname.

6 - Customizing the Kernel

FROM scratch AS customization
COPY --from=<custom kernel image> /lib/modules /lib/modules

FROM docker.io/andrewrynhard/installer:latest
COPY --from=<custom kernel image> /boot/vmlinuz /usr/install/vmlinuz
docker build --build-arg RM="/lib/modules" -t talos-installer .

Note: You can use the --squash flag to create smaller images.

Now that we have a custom installer we can build Talos for the specific platform we wish to deploy to.

7 - Customizing the Root Filesystem

The installer image contains ONBUILD instructions that handle the following:

  • the decompression, and unpacking of the initramfs.xz
  • the unsquashing of the rootfs
  • the copying of new rootfs files
  • the squashing of the new rootfs
  • and the packing, and compression of the new initramfs.xz

When used as a base image, the installer will perform the above steps automatically with the requirement that a customization stage be defined in the Dockerfile.

For example, say we have an image that contains the contents of a library we wish to add to the Talos rootfs. We need to define a stage with the name customization:

FROM scratch AS customization
COPY --from=<name|index> <src> <dest>

Using a multi-stage Dockerfile we can define the customization stage and build FROM the installer image:

FROM scratch AS customization
COPY --from=<name|index> <src> <dest>

FROM docker.io/autonomy/installer:latest

When building the image, the customization stage will automatically be copied into the rootfs. The customization stage is not limited to a single COPY instruction. In fact, you can do whatever you would like in this stage, but keep in mind that everything in / will be copied into the rootfs.

Note: <dest> is the path relative to the rootfs that you wish to place the contents of <src>.

To build the image, run:

docker build --squash -t <organization>/installer:latest .

In the case that you need to perform some cleanup before adding additional files to the rootfs, you can specify the RM build-time variable:

docker build --squash --build-arg RM="[<path> ...]" -t <organization>/installer:latest .

This will perform a rm -rf on the specified paths relative to the rootfs.

Note: RM must be a whitespace delimited list.

The resulting image can be used to:

  • generate an image for any of the supported providers
  • perform bare-metall installs
  • perform upgrades

We will step through common customizations in the remainder of this section.

8 - Managing PKI

Generating an Administrator Key Pair

In order to create a key pair, you will need the root CA.

Save the CA public key, and CA private key as ca.crt, and ca.key respectively. Now, run the following commands to generate a certificate:

talosctl gen key --name admin
talosctl gen csr --key admin.key --ip 127.0.0.1
talosctl gen crt --ca ca --csr admin.csr --name admin

Now, base64 encode admin.crt, and admin.key:

cat admin.crt | base64
cat admin.key | base64

You can now set the crt and key fields in the talosconfig to the base64 encoded strings.

Renewing an Expired Administrator Certificate

In order to renew the certificate, you will need the root CA, and the admin private key. The base64 encoded key can be found in any one of the control plane node’s configuration file. Where it is exactly will depend on the specific version of the configuration file you are using.

Save the CA public key, CA private key, and admin private key as ca.crt, ca.key, and admin.key respectively. Now, run the following commands to generate a certificate:

talosctl gen csr --key admin.key --ip 127.0.0.1
talosctl gen crt --ca ca --csr admin.csr --name admin

You should see admin.crt in your current directory. Now, base64 encode admin.crt:

cat admin.crt | base64

You can now set the certificate in the talosconfig to the base64 encoded string.

9 - Resetting a Machine

From time to time, it may be beneficial to reset a Talos machine to its “original” state. Bear in mind that this is a destructive action for the given machine. Doing this means removing the machine from Kubernetes, Etcd (if applicable), and clears any data on the machine that would normally persist a reboot.

The API command for doing this is talosctl reset. There are a couple of flags as part of this command:

Flags:
      --graceful   if true, attempt to cordon/drain node and leave etcd (if applicable) (default true)
      --reboot     if true, reboot the node after resetting instead of shutting down

The graceful flag is especially important when considering HA vs. non-HA Talos clusters. If the machine is part of an HA cluster, a normal, graceful reset should work just fine right out of the box as long as the cluster is in a good state. However, if this is a single node cluster being used for testing purposes, a graceful reset is not an option since Etcd cannot be “left” if there is only a single member. In this case, reset should be used with --graceful=false to skip performing checks that would normally block the reset.

10 - Upgrading Kubernetes

Video Walkthrough

To see a live demo of this writeup, see the video below:

Kubelet Image

In Kubernetes 1.19, the official hyperkube image was removed. This means that in order to upgrade Kubernetes, Talos users will have to change the command, and image fields of each control plane component. The kubelet image will also have to be updated, if you wish to specify the kubelet image explicitly. The default used by Talos is sufficient in most cases.

Kubeconfig

In order to edit the control plane, we will need a working kubectl config. If you don’t already have one, you can get one by running:

talosctl --nodes <master node> kubeconfig

Automated Kubernetes Upgrade

In Talos v0.6.1 we introduced the upgrade-k8s command in talosctl. This command can be used to automate the Kubernetes upgrade process. For example, to upgrade from Kubernetes v1.18.6 to v1.19.0 run:

$ talosctl --nodes <master node> upgrade-k8s --from 1.18.6 --to 1.19.0
updating pod-checkpointer grace period to "0m"
sleeping 5m0s to let the pod-checkpointer self-checkpoint be updated
temporarily taking "kube-apiserver" out of pod-checkpointer control
updating daemonset "kube-apiserver" to version "1.19.0"
updating daemonset "kube-controller-manager" to version "1.19.0"
updating daemonset "kube-scheduler" to version "1.19.0"
updating daemonset "kube-proxy" to version "1.19.0"
updating pod-checkpointer grace period to "5m0s"

Manual Kubernetes Upgrade

Kubernetes can be upgraded manually as well by following the steps outlined below. They are equivalent to the steps performed by the talosctl upgrade-k8s command.

pod-checkpointer

Talos runs pod-checkpointer component which helps to recover control plane components (specifically, API server) if control plane is not healthy.

However, the way checkpoints interact with API server upgrade may make an upgrade take a lot longer due to a race condition on API server listen port.

In order to speed up upgrades, first lower pod-checkpointer grace period to zero (kubectl -n kube-system edit daemonset pod-checkpointer), change:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: pod-checkpointer
        command:
        ...
        - --checkpoint-grace-period=5m0s

to:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: pod-checkpointer
        command:
        ...
        - --checkpoint-grace-period=0s

Wait for 5 minutes to let pod-checkpointer update self-checkpoint to the new grace period.

API Server

In the API server’s DaemonSet, change:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
        - name: kube-apiserver
          image: ...
          command:
            - ./hyperkube
            - kube-apiserver

to:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
        - name: kube-apiserver
          image: k8s.gcr.io/kube-apiserver:v1.19.0
          command:
            - /go-runner
            - /usr/local/bin/kube-apiserver

To edit the DaemonSet, run:

kubectl edit daemonsets -n kube-system kube-apiserver

Controller Manager

In the controller manager’s DaemonSet, change:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
        - name: kube-controller-manager
          image: ...
          command:
            - ./hyperkube
            - kube-controller-manager

to:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
        - name: kube-controller-manager
          image: k8s.gcr.io/kube-controller-manager:v1.19.0
          command:
            - /go-runner
            - /usr/local/bin/kube-controller-manager

To edit the DaemonSet, run:

kubectl edit daemonsets -n kube-system kube-controller-manager

Scheduler

In the scheduler’s DaemonSet, change:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
        - name: kube-scheduler
          image: ...
          command:
            - ./hyperkube
            - kube-scheduler

to:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
        - name: kube-sceduler
          image: k8s.gcr.io/kube-scheduler:v1.19.0
          command:
            - /go-runner
            - /usr/local/bin/kube-scheduler

To edit the DaemonSet, run:

kubectl edit daemonsets -n kube-system kube-scheduler

Restoring pod-checkpointer

Restore grace period of 5 minutes (kubectl -n kube-system edit daemonset pod-checkpointer), change:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: pod-checkpointer
        command:
        ...
        - --checkpoint-grace-period=0s

to:

kind: DaemonSet
...
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: pod-checkpointer
        command:
        ...
        - --checkpoint-grace-period=5m0s

Kubelet

The Talos team now maintains an image for the kubelet that should be used starting with Kubernetes 1.19. The image for this release is docker.io/autonomy/kubelet:v1.19.0. To explicitly set the image, we can use the official documentation. For example:

machine:
  ...
  kubelet:
    image: docker.io/autonomy/kubelet:v1.19.0

11 - Upgrading Talos

Video Walkthrough

To see a live demo of this writeup, see the video below:

Talos

In an effort to create more production ready clusters, Talos will now taint control plane nodes as unschedulable. This means that any application you might have deployed must tolerate this taint if you intend on running the application on control plane nodes.

Another feature you will notice is the automatic uncordoning of nodes that have been upgraded. Talos will now uncordon a node if the cordon was initiated by the upgrade process.

Talosctl

The talosctl CLI now requires an explicit set of nodes. This can be configured with talos config nodes or set on the fly with talos --nodes.