Kubernetes has become the standard deployment platform for containerized applications in defense, from DoD Platform One to national defense cloud programs across NATO member states. Its adoption in defense environments is driven by the same operational advantages that drove commercial adoption: consistent deployment across environments, automated scaling, self-healing workloads, and declarative infrastructure management. But a default Kubernetes installation is designed for ease of use, not for security — and the security delta between a default installation and a hardened, defense-grade deployment is significant.
NSA and CISA published the Kubernetes Hardening Guide (v1.2, August 2022) specifically to address this gap. The guide covers control plane hardening, network security, pod security, audit logging, and authentication and authorization — providing a practical starting point for defense Kubernetes deployments. The CIS Kubernetes Benchmark provides a complementary, more granular set of scored configuration checks. Together, these two documents define what "hardened" means for Kubernetes in a defense context.
NSA/CISA Kubernetes Hardening Guide: Key Recommendations
The NSA/CISA guide organizes its recommendations into five categories, each of which is operationally significant for defense deployments:
Kubernetes pod security. Pods should use non-root containers, read-only root filesystems, and have capabilities dropped to the minimum required. Privileged containers — which have full access to the host system — must not be permitted in production workloads.
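These pod-security requirements map directly onto Kubernetes securityContext fields. A minimal sketch (the pod name, image, registry, and UID are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app                  # illustrative name
spec:
  securityContext:
    runAsNonRoot: true                # reject images that run as UID 0
    runAsUser: 10001                  # illustrative non-root UID
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: registry.example.mil/app:1.0   # illustrative approved registry
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]                 # drop everything not explicitly required
```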
Network separation and hardening. All inter-service traffic should be encrypted using TLS (service mesh with mTLS). Network Policies should restrict pod-to-pod communication to explicitly permitted paths. The Kubernetes API server should not be directly accessible from the internet or from untrusted network segments.
Authentication and authorization. The API server must not enable anonymous authentication or insecure ports. Role-Based Access Control (RBAC) must be enabled and configured following least-privilege principles. Service account tokens should not be automatically mounted into pods that do not require API server access.
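The token-mounting and least-privilege points above can be sketched as a ServiceAccount with automounting disabled plus a narrowly scoped Role (names and namespace are illustrative):

```yaml
# ServiceAccount for workloads that never call the API server
apiVersion: v1
kind: ServiceAccount
metadata:
  name: worker-sa                    # illustrative
  namespace: mission-app
automountServiceAccountToken: false
---
# Least-privilege Role for a workload that only reads ConfigMaps
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader
  namespace: mission-app
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list"]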
Audit logging. API server audit logging must be enabled with a policy that captures at least create, update, delete, and get operations on sensitive resource types. Audit logs must be forwarded to a central SIEM or log management system where they cannot be modified by cluster administrators.
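A sketch of such an audit policy, passed to the API server via the --audit-policy-file flag (the exact resource list is illustrative; note that secrets are kept at Metadata level so their payloads never land in the logs):

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Never log secret payloads; record metadata only
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Full request/response bodies for writes to sensitive resource types
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["configmaps", "serviceaccounts"]
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Everything else at metadata level, including get/list/watch
  - level: Metadata
```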
Upgrade frequency. Each Kubernetes minor version receives security patches for approximately 14 months after release (roughly 12 months of standard support plus a two-month maintenance window). Running unsupported Kubernetes versions is a significant security risk that is unacceptable in defense deployments.
Pod Security Standards: The Restricted Profile
Kubernetes Pod Security Standards (PSS) define three policy profiles — Privileged, Baseline, and Restricted — that represent increasing levels of restriction on pod configuration. The Restricted profile is the appropriate baseline for defense workloads: it enforces the most security-relevant pod configuration constraints.
The Restricted profile disallows: privileged containers (the privileged flag set to true), containers running as root (runAsNonRoot: true is required), host network access (hostNetwork must be false), sharing of host PID and IPC namespaces (hostPID and hostIPC must be false), mounting host paths as volumes, and adding any Linux capability other than NET_BIND_SERVICE (all capabilities must first be dropped via drop: ["ALL"]).
Implementing the Restricted profile in an existing Kubernetes environment often requires application changes: applications that were written assuming root access, write access to the filesystem, or host network access must be refactored to work within the Restricted profile. For new defense application development, Restricted profile compliance should be a design requirement from the start — retrofitting it after deployment is significantly more expensive.
Pod Security Standards are enforced through the built-in PodSecurity admission controller (enabled by default since Kubernetes 1.23 and stable as of 1.25, the release that also removed the deprecated PodSecurityPolicy). Enforcement modes are Enforce (pods violating the policy are rejected), Audit (violations are recorded but pods are allowed), and Warn (violations generate API warnings but pods are allowed). Defense deployments should use Enforce mode for the Restricted profile in all production namespaces.
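Enforcement is configured per namespace through standard labels. A sketch combining Enforce for the Restricted profile with audit and warn modes (the namespace name is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mission-app                                 # illustrative
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

Setting audit and warn alongside enforce means violations surface in audit logs and kubectl output as well as being rejected, which helps developers diagnose why a deployment was blocked.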
Network Policies: Microsegmentation with Calico or Cilium
Kubernetes Network Policies define which pods can communicate with which other pods at the IP/port level. Without Network Policies, all pods in a cluster can communicate with all other pods — a flat network topology that is architecturally incompatible with zero-trust principles. Network Policies implement the microsegmentation layer at the container network level.
Calico is the most widely deployed Kubernetes network policy implementation, supporting both standard Kubernetes NetworkPolicy resources and Calico-specific GlobalNetworkPolicy and NetworkPolicy resources with additional capabilities. Calico can be deployed in several modes (BGP routing, VXLAN overlay, or eBPF dataplane) and integrates with external firewalls through BGP route advertisement. For air-gapped defense environments, Calico's on-premises deployment model and lack of cloud control plane dependencies make it operationally suitable.
Cilium uses eBPF (Extended Berkeley Packet Filter) for network policy enforcement in the Linux kernel, providing higher performance than iptables-based solutions and supporting Layer 7 (application-layer) network policies — for example, allowing HTTP GET requests but blocking POST requests on a specific API path. Cilium's Hubble observability component provides detailed visibility into network flows, supporting both security monitoring and troubleshooting. Cilium's integration with SPIFFE/SPIRE for workload identity provides a path toward mTLS-based microsegmentation without a full service mesh deployment.
The key principle for defense Network Policy design is default-deny: new namespaces should have a default-deny NetworkPolicy that blocks all ingress and egress until explicit allow rules are created. This ensures that new workloads are isolated until their network access requirements are explicitly documented and approved, rather than inheriting the permissive default.
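A default-deny posture can be sketched with two standard NetworkPolicy resources: one that blocks everything, and one explicit allow (here, DNS egress to kube-system, which nearly every workload needs). Namespace and names are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: mission-app        # illustrative
spec:
  podSelector: {}               # empty selector: applies to every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: mission-app
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
```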
Admission Controllers: Policy-as-Code with OPA/Gatekeeper and Kyverno
Admission controllers are plugins that intercept API server requests before they are persisted to etcd, allowing policies to be enforced at the cluster API level. OPA/Gatekeeper and Kyverno are the two dominant policy-as-code frameworks for Kubernetes admission control.
OPA/Gatekeeper uses the OPA (Open Policy Agent) policy engine with Rego policy language. Gatekeeper registers as a ValidatingAdmissionWebhook that calls the OPA policy engine for each relevant API request. Constraint Templates define the policy structure; Constraints instantiate the template for specific resources. The OPA/Gatekeeper ecosystem has a large library of pre-built policies covering common security requirements, and custom policies can be written in Rego for organization-specific requirements.
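The ConstraintTemplate/Constraint split can be sketched with a required-labels policy modeled on the widely used example from the Gatekeeper policy library (the label name and match scope are illustrative):

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg}] {
          required := input.parameters.labels[_]
          not input.review.object.metadata.labels[required]
          msg := sprintf("missing required label: %v", [required])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-owner-label        # illustrative
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    labels: ["owner"]              # illustrative required label
```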
Kyverno uses Kubernetes-native YAML to express policies, making it more accessible to teams familiar with Kubernetes resource definitions but not comfortable with Rego. Kyverno supports both validation (blocking non-compliant resources) and mutation (automatically adding required labels or security context fields to resources that are missing them). Kyverno's mutating admission webhook capability is particularly useful for applying security defaults automatically, reducing the burden on application developers.
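A sketch of Kyverno's mutation capability: a ClusterPolicy that defaults runAsNonRoot on pods that do not already set it, using Kyverno's +() add-if-absent anchor (the policy and rule names are illustrative):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-securitycontext   # illustrative
spec:
  rules:
  - name: set-runasnonroot
    match:
      any:
      - resources:
          kinds: ["Pod"]
    mutate:
      patchStrategicMerge:
        spec:
          securityContext:
            # +() anchor: add the field only if the pod spec omits it
            +(runAsNonRoot): true
```

Because the default is injected at admission time, application teams get a compliant configuration without editing every manifest, while teams that need a different value can still set the field explicitly.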
For defense deployments, admission control policies should enforce at minimum: image provenance (images must come from approved registries), image signing (images must have valid cosign signatures), security context requirements (non-root, no privileged, no host namespaces), required labels (for asset tracking and compliance reporting), and resource limits (all containers must have CPU and memory limits defined).
Runtime Security: Falco, seccomp, and AppArmor
Falco (CNCF-graduated) is the standard runtime security tool for Kubernetes: it monitors kernel system calls in real time and generates alerts when behavior matches suspicious patterns. Falco rules cover process execution (unexpected executables running inside containers), file access (writes to system directories, reads of sensitive files), network activity (unexpected outbound connections from containers), and Kubernetes API activity (unauthorized API calls, credential theft attempts). Falco integrates with SIEM systems via its syslog or HTTP webhook outputs, feeding container runtime events into the broader security monitoring infrastructure.
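A sketch of a Falco rule in the file-access category, detecting writes under /etc from inside a container. It assumes the stock `container` macro from Falco's default ruleset; the rule name and tags are illustrative:

```yaml
- rule: Write below /etc in container        # illustrative rule name
  desc: Detect a process inside a container opening a file for writing under /etc
  condition: >
    container and evt.type in (open, openat, openat2)
    and evt.is_open_write=true and fd.name startswith /etc
  output: >
    File opened for writing below /etc in a container
    (user=%user.name command=%proc.cmdline file=%fd.name container=%container.name)
  priority: WARNING
  tags: [filesystem, container]
```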
seccomp (secure computing mode) profiles restrict the system calls available to container processes. A process in a container running with a seccomp profile can only make the system calls explicitly allowed by that profile — all others are blocked. Kubernetes provides a default seccomp profile (RuntimeDefault) that blocks the most dangerous system calls while allowing normal application operation. Defense workloads should use the RuntimeDefault profile at minimum; high-risk workloads should use custom profiles that are even more restrictive.
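The RuntimeDefault profile is requested through seccompProfile.type in the pod securityContext; a custom, more restrictive profile uses the Localhost type instead. A fragment of a pod spec (the profile path is illustrative, and the JSON profile must exist under the kubelet's seccomp directory on each node, typically /var/lib/kubelet/seccomp):

```yaml
# Fragment of a pod spec: custom seccomp profile for a high-risk workload
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      # Path relative to the kubelet's seccomp directory; name illustrative
      localhostProfile: profiles/defense-restricted.json
```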
AppArmor (on Linux distributions that support it) provides a Mandatory Access Control layer that restricts what files, capabilities, and network operations each process can access. AppArmor profiles for Kubernetes containers define what the container process is allowed to do, adding a defense-in-depth layer beneath the container runtime and above the kernel.
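AppArmor profiles are attached per container. Since Kubernetes 1.30 this is a first-class securityContext field; on older clusters it is set via a beta annotation. A sketch (the pod, image, and profile names are illustrative, and the named profile must already be loaded on the node):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-demo                    # illustrative
  # On clusters older than v1.30, use the annotation form instead:
  # container.apparmor.security.beta.kubernetes.io/app: localhost/k8s-restricted
spec:
  containers:
  - name: app
    image: registry.example.mil/app:1.0  # illustrative
    securityContext:
      appArmorProfile:                   # field form, GA in Kubernetes v1.30
        type: Localhost
        localhostProfile: k8s-restricted # illustrative profile name
```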
Key insight: Kubernetes hardening is not a one-time configuration activity — it is an ongoing posture management discipline. Cluster configurations drift over time (manual changes, Helm chart upgrades that introduce new resource types, new application deployments with non-compliant configurations). Continuous compliance scanning (using kube-bench for CIS benchmark checks, Polaris for best-practice policy checks, or Trivy's misconfiguration scanning for both) must be integrated into the operational workflow to detect and remediate configuration drift before it becomes a security incident.