Converting Control Plane

Talos version 0.9 runs Kubernetes control plane in a new way: static pods managed by Talos. Talos version 0.8 and below runs self-hosted control plane. After Talos OS upgrade to version 0.9 Kubernetes control plane should be converted to run as static pods.

This guide describes automated conversion script and also shows detailed manual conversion process.

Video Walkthrough

To see a live demo of this writeup, see the video below:

Automated Conversion

First, make sure all nodes are updated to Talos 0.9:

$ kubectl get nodes -o wide
NAME                     STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION   CONTAINER-RUNTIME
talos-default-master-1   Ready    control-plane,master   58m   v1.20.4   172.20.0.2    <none>        Talos (v0.9.0)   5.10.19-talos    containerd://1.4.4
talos-default-master-2   Ready    control-plane,master   58m   v1.20.4   172.20.0.3    <none>        Talos (v0.9.0)   5.10.19-talos    containerd://1.4.4
talos-default-master-3   Ready    control-plane,master   58m   v1.20.4   172.20.0.4    <none>        Talos (v0.9.0)   5.10.19-talos    containerd://1.4.4
talos-default-worker-1   Ready    <none>                 58m   v1.20.4   172.20.0.5    <none>        Talos (v0.9.0)   5.10.19-talos    containerd://1.4.4

Start the conversion script:

$ talosctl -n <IP> convert-k8s
discovered master nodes ["172.20.0.2" "172.20.0.3" "172.20.0.4"]
current self-hosted status: true
gathering control plane configuration
aggregator CA key can't be recovered from bootkube-boostrapped control plane, generating new CA
patching master node "172.20.0.2" configuration
patching master node "172.20.0.3" configuration
patching master node "172.20.0.4" configuration
waiting for static pod definitions to be generated
waiting for manifests to be generated
Talos generated control plane static pod definitions and bootstrap manifests, please verify them with commands:
    talosctl -n <master node IP> get StaticPods.kubernetes.talos.dev
    talosctl -n <master node IP> get Manifests.kubernetes.talos.dev

in order to remove self-hosted control plane, pod-checkpointer component needs to be disabled
once pod-checkpointer is disabled, the cluster shouldn't be rebooted until the entire conversion process is complete
confirm disabling pod-checkpointer to proceed with control plane update [yes/no]:

Script stops at this point waiting for confirmation. Talos still runs self-hosted control plane, and static pods were not rendered yet.

As instructed by the script, please verify that static pod definitions are correct:

$ talosctl -n <IP> get staticpods -o yaml
node: 172.20.0.2
metadata:
    namespace: controlplane
    type: StaticPods.kubernetes.talos.dev
    id: kube-apiserver
    version: 1
    phase: running
spec:
    apiVersion: v1
    kind: Pod
    metadata:
        annotations:
            talos.dev/config-version: "2"
            talos.dev/secrets-version: "1"
        creationTimestamp: null
        labels:
            k8s-app: kube-apiserver
            tier: control-plane
        name: kube-apiserver
        namespace: kube-system
    spec:
        containers:
            - command:
...

Static pod definitions are generated from the machine configuration and should match pod template as generated by Talos on bootstrap of self-hosted control plane unless there were some manual changes applied to the daemonset specs after bootstrap. Talos patches the machine configuration with the container image versions scraped from the daemonset definition, fetches the service account key from Kubernetes secrets.

Aggregator CA can't be recovered from the self-hosted control plane, so new CA gets generated. This is generally harmless and not visible from outside the cluster. The Aggregator CA is not the same CA as is used by Talos or Kubernetes standard API. It is a special PKI used for aggregating API extension services inside your cluster. If you have non-standard apiserver aggregations (fairly rare, and you should know if you do), then you may need to restart these services after the new CA is in place.

Verify that bootstrap manifests are correct:

$ talosctl -n <IP> get manifests --namespace controlplane
NODE         NAMESPACE      TYPE       ID                               VERSION
172.20.0.2   controlplane   Manifest   00-kubelet-bootstrapping-token   1
172.20.0.2   controlplane   Manifest   01-csr-approver-role-binding     1
172.20.0.2   controlplane   Manifest   01-csr-node-bootstrap            1
172.20.0.2   controlplane   Manifest   01-csr-renewal-role-binding      1
172.20.0.2   controlplane   Manifest   02-kube-system-sa-role-binding   1
172.20.0.2   controlplane   Manifest   03-default-pod-security-policy   1
172.20.0.2   controlplane   Manifest   10-kube-proxy                    1
172.20.0.2   controlplane   Manifest   11-core-dns                      1
172.20.0.2   controlplane   Manifest   11-core-dns-svc                  1
172.20.0.2   controlplane   Manifest   11-kube-config-in-cluster        1
$ talosctl -n <IP> get manifests --namespace=extras
NODE         NAMESPACE   TYPE       ID                                                        VERSION
172.20.0.2   extras      Manifest   05-https://docs.projectcalico.org/manifests/calico.yaml   1

Make sure that manifests and static pods are correct across all control plane nodes, as each node reconciles control plane state on its own. For example, CNI configuration in machine config should be in sync across all the nodes. Talos nodes try to create any missing Kubernetes resources from the manifests, but it never updates or deletes existing resources.

If something looks wrong, script can be aborted and machine configuration should be updated to fix the problem. Once configuration is updated, the script can be restarted.

If static pod definitions and manifests look good, confirm next step to disable pod-checkpointer:

$ talosctl -n <IP> convert-k8s
...
confirm disabling pod-checkpointer to proceed with control plane update [yes/no]: yes
disabling pod-checkpointer
deleting daemonset "pod-checkpointer"
checking for active pod checkpoints
2021/03/09 23:37:25 retrying error: found 3 active pod checkpoints: [pod-checkpointer-655gc-talos-default-master-3 pod-checkpointer-pw6mv-talos-default-master-1 pod-checkpointer-zdw9z-talos-default-master-2]
2021/03/09 23:42:25 retrying error: found 1 active pod checkpoints: [pod-checkpointer-pw6mv-talos-default-master-1]
confirm applying static pod definitions and manifests [yes/no]:

Self-hosted control plane runs pod-checkpointer to work around issues with control plane availability. It should be disabled before conversion starts to allow self-hosted control plane to be removed. It takes around 5 minutes for the pod-checkpointer to be fully disabled. Script verifies that all checkpoints are removed before proceeding.

This last confirmation before proceeding is at the point when there is no way to keep running self-hosted control plane: static pods are released, bootstrap manifests are applied, self-hosted control plane is removed.

$ talosctl -n <IP> convert-k8s
...
confirm applying static pod definitions and manifests [yes/no]: yes
removing self-hosted initialized key
waiting for static pods for "kube-apiserver" to be present in the API server state
waiting for static pods for "kube-controller-manager" to be present in the API server state
waiting for static pods for "kube-scheduler" to be present in the API server state
deleting daemonset "kube-apiserver"
waiting for static pods for "kube-apiserver" to be present in the API server state
deleting daemonset "kube-controller-manager"
waiting for static pods for "kube-controller-manager" to be present in the API server state
deleting daemonset "kube-scheduler"
waiting for static pods for "kube-scheduler" to be present in the API server state
conversion process completed successfully

As soon as the control plane static pods are rendered, the kubelet starts the control plane static pods. It is expected that the pods for kube-apiserver will crash initially. Only one kube-apiserver can be bound to the host Node's port 6443 at a time. Eventually, the old kube-apiserver will be killed, and the new one will be able to start. This is all handled automatically. The script will continue by removing each self-hosted daemonset and verifying that static pods are ready and healthy.

Manual Conversion

Check that Talos runs self-hosted control plane:

$ talosctl -n <CONTROL_PLANE_IP> get bs
NODE         NAMESPACE   TYPE              ID              VERSION   SELF HOSTED
172.20.0.2   runtime     BootstrapStatus   control-plane   2         true

Talos machine configuration need to be updated to the 0.9 format; there are two new required machine configuration settings:

  • .cluster.serviceAccount is the service account PEM-encoded private key.
  • .cluster.aggregatorCA is the aggregator CA for kube-apiserver (certficiate and private key).

Current service account can be fetched from the Kubernetes secrets:

$ kubectl -n kube-system get secrets kube-controller-manager -o jsonpath='{.data.service\-account\.key}'
LS0tLS1CRUdJTiBSU0EgUFJJVkFURS...

All control plane node machine configurations should be patched with the service account key:

$ talosctl -n <CONTROL_PLANE_IP1>,<CONTROL_PLANE_IP2>,... patch mc --immediate -p '[{"op": "add", "path": "/cluster/serviceAccount", "value": {"key": "LS0tLS1CRUdJTiBSU0EgUFJJVkFURS..."}}]'
patched mc at the node 172.20.0.2

Aggregator CA can be generated using OpenSSL or any other certificate generation tools: RSA or ECDSA certificate with CN front-proxy valid for 10 years. PEM-encoded CA certificate and key should be base64-encoded and patched into the machine config at path /cluster/aggregatorCA:

$ talosctl -n <CONTROL_PLANE_IP1>,<CONTROL_PLANE_IP2>,... patch mc --immediate -p '[{"op": "add", "path": "/cluster/aggregatorCA", "value": {"crt": "S0tLS1CRUdJTiBDRVJUSUZJQ...", "key": "LS0tLS1CRUdJTiBFQy..."}}]'
patched mc at the node 172.20.0.2

At this point static pod definitions and bootstrap manifests should be rendered, please see "Automated Conversion" on how to verify generated objects. Feel free to continue to refine your machine configuration until the generated static pod definitions and bootstrap manifests look good.

If static pod definitions are not generated, check logs with talosctl -n <IP> logs controller-runtime.

Disable pod-checkpointer with:

$ kubectl -n kube-system delete ds pod-checkpointer
daemonset.apps "pod-checkpointer" deleted

Wait for all pod checkpoints to be removed:

$ kubectl -n kube-system get pods
NAME                                            READY   STATUS    RESTARTS   AGE
...
pod-checkpointer-8q2lh-talos-default-master-2   1/1     Running   0          3m34s
pod-checkpointer-nnm5w-talos-default-master-3   1/1     Running   0          3m24s
pod-checkpointer-qnmdt-talos-default-master-1   1/1     Running   0          2m21s

Pod checkpoints have annotation checkpointer.alpha.coreos.com/checkpoint-of.

Once all the pod checkpoints are removed (it takes 5 minutes for the checkpoints to be removed), proceed by removing self-hosted initialized key:

talosctl -n <CONTROL_PLANE_IP> convert-k8s --remove-initialized-key

Talos controllers will now render static pod definitions, and the kubelet will launch any resulting static pods.

Once static pods are visible in kubectl get pods -n kube-system output, proceed by removing each of the self-hosted daemonsets:

$ kubectl -n kube-system delete daemonset kube-apiserver
daemonset.apps "kube-apiserver" deleted

Make sure static pods for kube-apiserver got started successfully, pods are running and ready.

Proceed by deleting kube-controller-manager and kube-scheduler daemonsets, verifying that static pods are running between each step:

$ kubectl -n kube-system delete daemonset kube-controller-manager
daemonset.apps "kube-controller-manager" deleted
$ kubectl -n kube-system delete daemonset kube-scheduler
daemonset.apps "kube-scheduler" deleted