Workload Labeling by Node

The RBLNClusterPolicy is a single cluster-scoped custom resource (CR) that controls the operator's deployment behavior across all nodes with NPUs. Two node labels let you override its settings for a specific node, or exclude that node from deployment, without editing the policy itself:

  • rebellions.ai/npu.deploy.skip: exclude a specific node from operator component deployment entirely.
  • rebellions.ai/npu.workload.config: set container or vm-passthrough mode for a specific node, overriding the cluster default.

The operator acts on changes to either label during its next reconciliation loop.
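
To see which nodes currently carry either label, one option is kubectl's -L (--label-columns) flag, which prints the named labels as extra columns. This is a minimal sketch; the function name list_npu_overrides is hypothetical and not part of the operator:

```shell
# Hypothetical helper: list all nodes, showing both override labels as columns.
# An empty column means the label is not set on that node.
list_npu_overrides() {
  kubectl get nodes \
    -L rebellions.ai/npu.deploy.skip,rebellions.ai/npu.workload.config
}
```

Call list_npu_overrides before and after labeling a node to confirm that the label value is what you intended.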

Excluding a Node from Component Deployment

rebellions.ai/npu.deploy.skip=true keeps the node in the cluster and visible to NFD, but prevents the operator from deploying any of its components on that node.

Use this in the following cases:

  • When a debug or maintenance node should not run daemons managed by the operator.
  • When a node is managed by a separate tool and should not be managed by the operator.
  • When you need temporary isolation while investigating a hardware or driver issue.

# Exclude
$ kubectl label node <node-name> rebellions.ai/npu.deploy.skip=true

# Re-include
$ kubectl label node <node-name> rebellions.ai/npu.deploy.skip-

The label value must be exactly true; any other value is ignored.

When the label is applied:

  • Every rebellions.ai/npu.deploy.<component> label on the node is removed, which causes the operator's DaemonSets to remove their pods from that node.
  • The rebellions.ai/npu-driver-upgrade-enabled annotation is cleared, so the node is also excluded from driver upgrades managed by the operator. In other words, npu.deploy.skip does everything described in Skipping Driver Upgrades and additionally excludes the node from component deployment.
  • NFD labels (rebellions.ai/npu.present, rebellions.ai/npu.product, rebellions.ai/npu.family, etc.) are preserved. The device itself remains discoverable.
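
One way to check these effects on a node is to list its rebellions.ai labels after the operator has reconciled. The sketch below relies on kubectl label --list, which prints one key=value pair per line; the function name npu_labels_for_node is hypothetical:

```shell
# Hypothetical helper: print only the rebellions.ai/* labels of one node.
# $1: node name
npu_labels_for_node() {
  kubectl label node "$1" --list | grep '^rebellions\.ai/'
}
```

On a skipped node, the output should still contain the NFD labels such as rebellions.ai/npu.present, together with rebellions.ai/npu.deploy.skip=true, but no other rebellions.ai/npu.deploy.<component> entries.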

Selecting a Node's Workload Mode

rebellions.ai/npu.workload.config overrides RBLNClusterPolicy.spec.workloadType for a single node. Allowed values are container and vm-passthrough; unknown values fall back to the spec default.

Use this for hybrid deployments, where most nodes serve container workloads while a few nodes host KubeVirt VMs with NPU passthrough.
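
The way the label value resolves to a mode can be sketched as a small shell function. It only illustrates the rule stated in this section and is not the operator's implementation; the name resolve_workload_mode is hypothetical:

```shell
# Sketch of the fallback rule (illustrative only, not the operator's code).
# $1: value of rebellions.ai/npu.workload.config ("" if the label is absent)
# $2: RBLNClusterPolicy.spec.workloadType (the cluster default)
resolve_workload_mode() {
  case "$1" in
    container|vm-passthrough) printf '%s\n' "$1" ;;  # allowed value: use it
    *)                        printf '%s\n' "$2" ;;  # absent or unknown: use the default
  esac
}

resolve_workload_mode vm-passthrough container   # prints vm-passthrough
resolve_workload_mode passthru container         # prints container (unknown value)
resolve_workload_mode "" container               # prints container (label absent)
```

A misspelled label value therefore does not cause an error. The node simply keeps following the cluster default, so it is worth checking the spelling when an override appears to have no effect.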

Required chart configuration

For the node override to take effect, both vfioManager and sandboxDevicePlugin must be enabled across the cluster. The CRD intentionally allows the hybrid pattern below, where container is the default while both VM-passthrough components are also enabled:

spec:
  workloadType: container          # cluster default, applied to nodes without the override label
  vfioManager:
    enabled: true                  # also deploy the VM-passthrough component set
  sandboxDevicePlugin:
    enabled: true                  # also deploy the VM-passthrough component set

Label Configuration

# Switch a node to vm-passthrough
$ kubectl label node <vm-node> rebellions.ai/npu.workload.config=vm-passthrough

# Switch back to container, or remove the label entirely to fall back to the spec default
$ kubectl label node <vm-node> rebellions.ai/npu.workload.config=container --overwrite
$ kubectl label node <vm-node> rebellions.ai/npu.workload.config-

Each node with the label receives only the components for its mode:

  • Container nodes: driver, Device Plugin, Container Toolkit, Metrics Exporter, NPU Feature Discovery, Validator (optionally rbln-daemon and the DRA kubelet plugin).
  • vm-passthrough nodes: only vfio-manager and sandbox-device-plugin. The host driver is intentionally absent because the device is bound to vfio-pci for the guest VM.
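
To confirm which components actually landed on a node, you can list the pods scheduled there with a field selector on spec.nodeName. This is a minimal sketch; the function name pods_on_node is hypothetical, and the exact pod names depend on your installation:

```shell
# Hypothetical helper: list every pod scheduled on one node, across all namespaces.
# $1: node name
pods_on_node() {
  kubectl get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}
```

For example, pods_on_node <vm-node> on a vm-passthrough node should show the vfio-manager and sandbox-device-plugin pods and none of the container-mode components.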

Switching the mode of a node currently in use

Always drain the node before changing its workload mode. During the mode switch, the device is rebound between the kernel driver and vfio-pci, so any container that is still using the NPU may be disconnected abruptly:

$ kubectl cordon worker-3
$ kubectl drain worker-3 --ignore-daemonsets --delete-emptydir-data
$ kubectl label node worker-3 rebellions.ai/npu.workload.config=vm-passthrough --overwrite
$ kubectl uncordon worker-3

Label Precedence

If both labels are set on the same node, rebellions.ai/npu.deploy.skip=true takes precedence. Every npu.deploy.<component> label is removed regardless of workload.config.
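
Because the skip label wins, a workload.config label on a skipped node has no effect. To find such nodes, a label selector can combine an equality match on the skip label with an existence match on the workload label. This is a sketch only; the function name nodes_with_shadowed_mode is hypothetical:

```shell
# Hypothetical helper: list nodes where npu.deploy.skip=true is set AND a
# workload.config label exists (with any value), showing that value as a column.
nodes_with_shadowed_mode() {
  kubectl get nodes \
    -l 'rebellions.ai/npu.deploy.skip=true,rebellions.ai/npu.workload.config' \
    -L rebellions.ai/npu.workload.config
}
```

An empty result means no node currently has both labels set.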