Configuring the Kubewarden stack for production
Kubewarden provides features for reliability and correct scheduling of its components in a Kubernetes cluster. Some of the hints on this page come from Kubewarden community members using Kubewarden at scale.
If you want to see a real example of running Kubewarden at scale, check out the Kubewarden in a Large-Scale Environment documentation page.
Configuring Tolerations and Affinity/Anti-Affinity
By using the tolerations
and affinity
fields, operators can fine-tune the
scheduling and reliability of the Kubewarden stack to meet their specific
deployment needs and constraints. For more details on the exact fields and
their configurations, refer to the Kubernetes documentation on Taints and
Tolerations
and Affinity and
Anti-Affinity.
Starting from version 1.15 of the Kubewarden stack, the Kubewarden Helm charts ship with two new values:
.global.tolerations
.global.affinity
These Helm chart values allow users to define Kubernetes tolerations and
affinity/anti-affinity settings for the Kubewarden stack, including the
controller deployment, audit scanner cronjob, and the default PolicyServer
custom
resource.
Tolerations
The tolerations
value is an array where users can specify Kubernetes
tolerations for the Kubewarden components. Tolerations allow pods to be
scheduled on nodes with matching taints. This is useful for managing where pods
can be scheduled, especially in scenarios involving node maintenance, dedicated
workloads, or specific hardware requirements:
global:
  tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
    - key: "key2"
      operator: "Equal"
      value: "value2"
      effect: "NoExecute"
In this example, the tolerations defined are applied to the controller
deployment, audit scanner cronjob, and the default PolicyServer
custom resource.
Affinity/Anti-Affinity
The affinity
value allows users to define Kubernetes affinity and
anti-affinity rules for the Kubewarden components. Affinity rules constrain
pods to specific nodes, while anti-affinity rules prevent pods from being
scheduled on certain nodes or in close proximity to other pods. These settings
are useful for ensuring high availability, fault tolerance, and optimized
resource usage in a cluster.
global:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: security
                operator: In
                values:
                  - S1
          topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: security
                  operator: In
                  values:
                    - S2
            topologyKey: topology.kubernetes.io/zone
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                  - linux
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: label-1
                operator: In
                values:
                  - key-1
        - weight: 50
          preference:
            matchExpressions:
              - key: label-2
                operator: In
                values:
                  - key-2
In this example, the affinity rules will be applied to the controller
deployment, audit scanner cronjob, and the default PolicyServer
custom resource.
The previous affinity configuration available in the kubewarden-defaults Helm chart, which was used to define the affinity settings for the PolicyServer only, has been removed in favor of the global affinity value. This change simplifies configuration by providing a single place to define affinity and anti-affinity rules for all Kubewarden components.
The affinity value in the kubewarden-defaults Helm chart has been removed. Users should now use the .global.affinity field to configure affinity and anti-affinity settings for the entire Kubewarden stack.
Configuring PriorityClasses
By using PriorityClasses, operators can enforce a scheduling priority for the workload pods of the Kubewarden stack. This ensures the Kubewarden workload takes precedence over other workloads, preventing eviction and ensuring service reliability. For more information, refer to the Kubernetes documentation on PriorityClasses.
Starting from version 1.25 of the Kubewarden stack, the Kubewarden Helm charts ship with a new value:
.global.priorityClassName
The PriorityClass named in this value is applied to the controller deployment pods and to the pods of the default PolicyServer custom resource.
The .global.priorityClassName value expects the name of an existing PriorityClass.
As an example, we could use:
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: kubewarden-high-priority
value: 1000000
globalDefault: false
description: "This priority class should be used for XYZ service pods only."
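For example, a minimal Helm values snippet that references the class above through the global value described earlier:
global:
  priorityClassName: kubewarden-high-priority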
Kubernetes already ships with two PriorityClasses that are good candidates:
system-cluster-critical
and system-node-critical
. These are common classes
and are used to ensure that critical components are always scheduled
first.
If you delete a PriorityClass, existing Pods that reference it remain unchanged, but Kubernetes will not create new Pods that reference the deleted PriorityClass.
PolicyServer production configuration
PolicyServers are critical to the cluster. Their reliability is important, as they process admission requests destined for the Kubernetes API via the Validating and Mutating Webhooks.
As with other Dynamic Admission Controllers, this process happens before requests reach the Kubernetes API server. Latency or service delays by the Dynamic Admission Controller may introduce cluster inconsistency, Denial of Service, or deadlock.
Kubewarden provides several ways to increase the reliability of PolicyServers.
Production deployments can vary a great deal; it is up to the operator to configure the deployment for their needs.
PodDisruptionBudgets
The Kubewarden controller can create a
PodDisruptionBudget
(PDB) for the PolicyServer
Pods. This controls the range of PolicyServer
Pod replicas associated with the PolicyServer
, ensuring high availability
and controlled eviction in case of node maintenance, scaling operations or
cluster upgrades.
This is achieved by setting spec.minAvailable
, or spec.maxUnavailable
of the
PolicyServer
resource:
- minAvailable: specifies the minimum number of PolicyServer Pods that must be available at all times. Can be an integer or a percentage. Useful for maintaining the operational integrity of the PolicyServer, ensuring that policies are continuously enforced without interruption.
- maxUnavailable: specifies the maximum number of PolicyServer Pods that can be unavailable at any given time. Can be an integer or a percentage. Useful for performing rolling updates or partial maintenance without fully halting the policy enforcement mechanism.
You can specify only one of maxUnavailable
and minAvailable
.
Configuring minAvailable or maxUnavailable
Examples:
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  minAvailable: 2 # ensure at least two policy-server Pods are available at all times
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  maxUnavailable: "30%" # ensure no more than 30% of policy-server Pods are unavailable at any time
Affinity / Anti-affinity
The Kubewarden controller can set the affinity of PolicyServer Pods. This allows constraining Pods to specific nodes, or scheduling Pods relative to other Pods. For more information on affinity, see the Kubernetes docs.
Kubernetes affinity configuration allows constraining Pods to nodes (via
spec.affinity.nodeAffinity
) or constraining Pods with regards to other Pods
(via spec.affinity.podAffinity
). Affinity can be set as a soft constraint
(with preferredDuringSchedulingIgnoredDuringExecution
) or a hard one (with
requiredDuringSchedulingIgnoredDuringExecution
).
Affinity / anti-affinity matches against specific labels, be they node labels (e.g., topology.kubernetes.io/zone set to antarctica-east1) or Pod labels.
Pods created from PolicyServer definitions have a label kubewarden/policy-server set to the name of the PolicyServer (e.g., kubewarden/policy-server: default).
Inter-pod affinity/anti-affinity require substantial amounts of processing and are not recommended in clusters larger than several hundred nodes.
To configure affinity for a PolicyServer
, set its spec.affinity
field. This
field accepts a YAML object matching the contents of a Pod's spec.affinity
.
Configuring Affinity / Anti-affinity
Example: Spread the PolicyServer
Pods across zones and hostnames
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: kubewarden/policy-server
                operator: In
                values:
                  - your-policy-server
          topologyKey: topology.kubernetes.io/zone
        - labelSelector:
            matchExpressions:
              - key: kubewarden/policy-server
                operator: In
                values:
                  - your-policy-server
          topologyKey: kubernetes.io/hostname
Example: Only schedule PolicyServer pods on control-plane nodes
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: kubewarden/policy-server
                operator: In
                values:
                  - your-policy-server
          topologyKey: node-role.kubernetes.io/control-plane
Limits and Requests
The Kubewarden controller can set the resource limits and requests of
PolicyServer
Pods. This specifies how much of each resource each of the
containers associated with the PolicyServer
Pods needs. For PolicyServers
,
only cpu
and memory
resources are relevant. See the Kubernetes
docs
on resource units for more information.
This is achieved by setting the following PolicyServer
resource fields:
- spec.limits: Limits on resources, enforced by the container runtime. Different runtimes can have different ways to implement the restrictions.
- spec.requests: Amount of resources to reserve for each container. It is possible and allowed for a container to use more resources than its request. If omitted, it defaults to spec.limits if that is set (unless spec.requests of containers is set to some defaults via an admission mechanism).
Allocating too few resources to PolicyServers may cause reliability issues in the cluster.
Configuring Limits and Requests
Example: Set hard limits for each PolicyServer
container
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  limits:
    cpu: 500m
    memory: 1Gi
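The spec.requests field can be set alongside spec.limits in the same way; the values below are a sketch and should be tuned to your observed usage:
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  requests:
    cpu: 250m # reserved for each PolicyServer container (illustrative values)
    memory: 512Mi
  limits:
    cpu: 500m # hard ceiling enforced by the container runtime
    memory: 1Gi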
PriorityClasses
The Kubewarden controller can set the PriorityClass used for the pods of
PolicyServers
. This means PolicyServer
workloads are scheduled with priority,
preventing eviction and ensuring service reliability. See the Kubernetes docs
for more
information.
If you delete a PriorityClass, existing Pods that reference it remain unchanged, but Kubernetes will not create new Pods that reference the deleted PriorityClass.
Configuring PriorityClasses
Example: Using the built-in system-cluster-critical PriorityClass:
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  priorityClassName: system-cluster-critical
Isolate Policy Workloads
To ensure stability and high performance at scale, users can run separate
PolicyServer
deployments to isolate different workloads.
- Dedicate one PolicyServer to Context-Aware Policies: These policies are more resource-intensive because they query the Kubernetes API server or other external services such as Sigstore and OCI registries. Isolating them prevents a slow policy from creating a bottleneck for other, faster policies.
- Use another PolicyServer for All Other Policies: Run regular, self-contained policies on a separate server to ensure low latency for the most common admission requests.
You can also consider splitting the workload even further. For example, if some policies are slow and require a larger execution timeout, consider moving them to a dedicated PolicyServer. This ensures those policies do not block the workers from evaluating other requests.
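A minimal sketch of such a split, assuming two PolicyServer names chosen for this example (context-aware and standard) and an illustrative policy module URL; a policy is bound to a specific PolicyServer through its spec.policyServer field:
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: context-aware # example name: dedicated to context-aware policies
spec:
  image: ghcr.io/kubewarden/policy-server:latest # pin the tag matching your Kubewarden release
  replicas: 3
---
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: standard # example name: runs the regular, self-contained policies
spec:
  image: ghcr.io/kubewarden/policy-server:latest
  replicas: 5
---
apiVersion: policies.kubewarden.io/v1
kind: ClusterAdmissionPolicy
metadata:
  name: example-context-aware-policy
spec:
  policyServer: context-aware # run this policy on the dedicated PolicyServer
  module: registry://ghcr.io/kubewarden/policies/example-policy:v1.0.0 # illustrative module URL
  rules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      resources: ["pods"]
      operations: ["CREATE"]
  mutating: false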
Resource Allocation and Scaling
To handle high traffic and ensure availability, provide sufficient resources and scale your replicas.
- Allocate Sufficient Resources: In high-traffic environments, allocate generous resources to each replica. Do not starve the PolicyServers, as insufficient CPU or memory is a primary cause of request timeouts. Remember that PolicyServers receive requests from both the control plane and the Kubewarden audit scanner.
- Scale for High Availability: For deployments handling hundreds of requests per second, run a high number of replicas. This distributes the load effectively and ensures that the failure of a few pods does not impact the cluster's operation.
Start with a baseline of 3-5 replicas and monitor CPU and memory usage. Scale the replica count as needed.
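As a starting point, the suggestions above can be combined on the PolicyServer resource; the replica count and resource values below are illustrative baselines to adjust after monitoring:
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  replicas: 5 # baseline for high-traffic clusters; scale after observing load
  requests:
    cpu: "1" # illustrative values; size to the observed CPU and memory usage
    memory: 1Gi
  limits:
    cpu: "2"
    memory: 2Gi
  minAvailable: 3 # keep a majority of replicas available during disruptions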
Effective Auditing at Scale
To run audits efficiently on large clusters, fine-tune the audit scanner for performance and parallelism.
- Schedule Audits Periodically: Running scans on a periodic schedule strikes a good balance between catching configuration drift and minimizing load on the API server.
- Tune Parallelism Aggressively: The key to fast audits is parallelization. With high-parallelism settings, you can reduce audit times on massive clusters to just over an hour.
It's important to remember that the audit scanner sends requests to PolicyServers. Therefore, its parallelism can impact PolicyServer performance. If you use aggressive parallelism to reduce scan times in big clusters, you may need to increase the resources available to the PolicyServer to avoid degrading admission controller performance.
Set disableStore: true
to reduce load if you consume audit results from logs
and do not require PolicyReport
custom resources in the cluster.
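As a sketch of where these settings live, the audit scanner is tuned through the kubewarden-controller Helm chart values; the value names below are assumptions, so verify them against the chart's values.yaml for your Kubewarden version:
# values.yaml for the kubewarden-controller chart (value names assumed; check your chart version)
auditScanner:
  cronJob:
    schedule: "0 */6 * * *" # assumed value: run a scan every six hours
  disableStore: true # assumed value: skip PolicyReport resources if you only consume results from logs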