Pod Security

Enabling Pod Security Admission plugin to configure Pod Security Standards.

Kubernetes deprecated Pod Security Policy as of v1.21, and it is going to be removed in v1.25. Pod Security Policy was replaced with Pod Security Admission. Pod Security Admission is alpha in v1.22 (requires a feature gate) and beta in v1.23 (enabled by default).

In this guide we are going to enable and configure Pod Security Admission in Talos.

Configuration

Prepare the following machine configuration patch and store it in the pod-security-patch.yaml:

- op: add
  path: /cluster/apiServer/admissionControl
  value:
    - name: PodSecurity
      configuration:
        apiVersion: pod-security.admission.config.k8s.io/v1alpha1
        kind: PodSecurityConfiguration
        defaults:
            enforce: "baseline"
            enforce-version: "latest"
            audit: "restricted"
            audit-version: "latest"
            warn: "restricted"
            warn-version: "latest"
        exemptions:
            usernames: []
            runtimeClasses: []
            namespaces: [kube-system]

This is a cluster-wide configuration for the Pod Security Admission plugin:

  • by default baseline Pod Security Standard profile is enforced
  • more strict restricted profile is not enforced, but API server warns about found issues

Generate Talos machine configuration applying the patch above:

talosctl gen config cluster1 https://<IP>:6443/ --config-patch-control-plane @../pod-security-patch.yaml

Deploy Talos using the generated machine configuration.

Verify current admission plugin configuration with:

$ talosctl get kubernetescontrolplaneconfigs apiserver-admission-control -o yaml
node: 172.20.0.2
metadata:
    namespace: config
    type: KubernetesControlPlaneConfigs.config.talos.dev
    id: apiserver-admission-control
    version: 1
    owner: config.K8sControlPlaneController
    phase: running
    created: 2022-02-22T20:28:21Z
    updated: 2022-02-22T20:28:21Z
spec:
    config:
        - name: PodSecurity
          configuration:
            apiVersion: pod-security.admission.config.k8s.io/v1alpha1
            defaults:
                audit: restricted
                audit-version: latest
                enforce: baseline
                enforce-version: latest
                warn: restricted
                warn-version: latest
            exemptions:
                namespaces:
                    - kube-system
                runtimeClasses: []
                usernames: []
            kind: PodSecurityConfiguration

Usage

Create a deployment that satisfies the baseline policy but gives warnings on restricted policy:

$ kubectl create deployment nginx --image=nginx
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "nginx" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "nginx" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "nginx" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "nginx" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/nginx created
$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-85b98978db-j68l8   1/1     Running   0          2m3s

Create a daemonset which fails to meet requirements of the baseline policy:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: debug-container
  name: debug-container
  namespace: default
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: debug-container
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: debug-container
    spec:
      containers:
      - args:
        - "360000"
        command:
        - /bin/sleep
        image: ubuntu:latest
        imagePullPolicy: IfNotPresent
        name: debug-container
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirstWithHostNet
      hostIPC: true
      hostPID: true
      hostNetwork: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
$ kubectl apply -f debug.yaml
Warning: would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true, hostPID=true, hostIPC=true), privileged (container "debug-container" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "debug-container" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "debug-container" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "debug-container" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "debug-container" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
daemonset.apps/debug-container created

Daemonset debug-container gets created, but no pods are scheduled:

$ kubectl get ds
NAME              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
debug-container   0         0         0       0            0           <none>          34s

Pod Security Admission plugin errors are in the daemonset events:

$ kubectl describe ds debug-container
...
  Warning  FailedCreate  92s                daemonset-controller  Error creating: pods "debug-container-kwzdj" is forbidden: violates PodSecurity "baseline:latest": host namespaces (hostNetwork=true, hostPID=true, hostIPC=true), privileged (container "debug-container" must not set securityContext.privileged=true)

Pod Security Admission configuration can also be overridden on a namespace level:

$ kubectl label ns default pod-security.kubernetes.io/enforce=privileged
namespace/default labeled
$ kubectl get ds
NAME              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
debug-container   2         2         0       2            0           <none>          4s

As enforce policy was updated to the privileged for the default namespace, debug-container is now successfully running.