Configure Kernel Parameters for RBLN NPUs on OpenShift (Supermicro AMD)

This guide describes how to configure kernel boot parameters required for RBLN NPUs to function correctly under the RBLN NPU Operator on OpenShift.

These parameters are required for stable device operation and must be applied to nodes where NPU workloads are scheduled. Configuration is performed using the OpenShift MachineConfig Operator (MCO).

Without these settings, NPUs may operate incorrectly.


Target Hardware

This guide has been validated on the following configuration:

| Item   | Value              |
|--------|--------------------|
| Vendor | Supermicro         |
| CPU    | AMD                |
| NPU    | RBLN-CA25          |
| Driver | RBLN Driver v3.0.0 |

Kernel Parameters

The following kernel parameters are required for optimal NPU operation:

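| Parameter | Description |
|-----------|-------------|
| transparent_hugepage=madvise | Sets THP to madvise mode. THP is applied only to regions explicitly requested by the application via madvise(MADV_HUGEPAGE). |
| pcie_aspm=force | Force-enables PCIe Active State Power Management (ASPM), even on devices that do not advertise support for it. |
| pci=pcie_bus_perf | Selects the PCIe bus performance mode: each device's Max Payload Size (MPS) is set to the largest value its parent bus allows, and its Max Read Request Size (MRRS) to the largest supported value. |
| pci=bfsort | Sorts PCI devices in breadth-first order. |
| iommu.strict=1 | Enables IOMMU strict mode (immediate IOTLB flush on DMA unmap). |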

Prerequisites

  • OpenShift Container Platform 4.x cluster
  • cluster-admin privileges
  • OpenShift CLI (oc) installed and logged in
  • (Optional) Node Feature Discovery Operator v0.16+, needed only if you use automatic hardware identification

Hardware Identification

Before applying MachineConfig, verify that the target node is a Supermicro AMD server.

Manual Verification

# Check system manufacturer
oc debug node/<NODE_NAME> -- chroot /host dmidecode -s system-manufacturer

# Check CPU model
oc debug node/<NODE_NAME> -- chroot /host bash -c "grep 'model name' /proc/cpuinfo | head -1"

Expected output:

Supermicro
model name : AMD EPYC ...
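
When several nodes need to be checked, the two manual checks above can be combined into one test. The following is a minimal sketch, not part of the RBLN tooling: the function name is_target_hardware is hypothetical, and it assumes the manufacturer string starts with "Supermicro" and the CPU model line contains "AMD", as in the expected output.

```shell
# is_target_hardware MANUFACTURER CPU_MODEL_LINE
# Succeeds (exit status 0) only for a Supermicro system with an AMD CPU.
is_target_hardware() {
  # $1: output of `dmidecode -s system-manufacturer`
  case "$1" in
    Supermicro*) ;;
    *) return 1 ;;
  esac
  # $2: first "model name" line from /proc/cpuinfo
  case "$2" in
    *AMD*) return 0 ;;
    *) return 1 ;;
  esac
}
```

To use it, capture the output of the two oc debug commands above into variables and pass them as the first and second argument.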

(Optional) Automatic Identification via NFD NodeFeatureRule

With NFD v0.16+, the system.dmiid.sys_vendor feature can be used to read DMI vendor information from nodes. A NodeFeatureRule can be used to automatically assign custom labels to matching nodes.

Note

NFD does not automatically create labels for system.dmiid. A NodeFeatureRule must be defined.

References:

- [NFD v0.16 Release Notes](https://github.com/kubernetes-sigs/node-feature-discovery/releases/tag/v0.16.0)
- [NFD Customization Guide — system.dmiid](https://kubernetes-sigs.github.io/node-feature-discovery/v0.16/usage/customization-guide.html)

Create a NodeFeatureRule

apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: smci-amd-detection
spec:
  rules:
    - name: "detect-supermicro"
      labels:
        hardware.io/vendor-supermicro: "true"
      matchFeatures:
        - feature: system.dmiid
          matchExpressions:
            sys_vendor: {op: In, value: ["Supermicro"]}
    - name: "detect-amd-cpu"
      labels:
        hardware.io/cpu-amd: "true"
      matchFeatures:
        - feature: cpu.model
          matchExpressions:
            vendor_id: {op: In, value: ["AMD"]}

Apply the rule:

oc apply -f <NFR_YAML_FILE>

Verify Labels

After applying, confirm that the following labels are automatically assigned to the target node:

oc get node <NODE_NAME> -o jsonpath='{.metadata.labels}' | python3 -m json.tool | grep hardware.io

Expected output:

"hardware.io/vendor-supermicro": "true",
"hardware.io/cpu-amd": "true",

Use NFD Labels as MCP nodeSelector

Once NFD labels are active, they can be used directly in the MCP nodeSelector without manual node-role labeling. In that case, update the nodeSelector in the MCP YAML in Create Custom MachineConfigPool as follows:

  nodeSelector:
    matchLabels:
      hardware.io/vendor-supermicro: "true"
      hardware.io/cpu-amd: "true"

With this configuration, Supermicro AMD nodes added to the cluster are automatically included in the MCP, and kernel parameters are applied without manual labeling.


Applying Kernel Arguments

Warning

If the MachineConfig's machineconfiguration.openshift.io/role label is set to worker, the MachineConfig is applied to the default worker MCP. Every worker node in the cluster then receives the configuration and is rebooted.

To limit the scope, create a custom MCP that targets only specific nodes (for example, Supermicro AMD servers), and set the MachineConfig role label to that MCP name.

In the steps below, replace __MCP_NAME__ with a name appropriate for your environment (e.g., smci-ca25-amd).
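
One way to avoid editing the placeholder by hand in each manifest is to substitute it at apply time. The sketch below is illustrative only: the function name render_manifest and the idea of keeping the manifests as template files are assumptions, not part of the original procedure. It restricts the name to lowercase letters, digits, and hyphens so the value is safe to use inside the sed expression.

```shell
# render_manifest MCP_NAME TEMPLATE_FILE
# Prints TEMPLATE_FILE with every __MCP_NAME__ placeholder replaced by MCP_NAME.
render_manifest() {
  # Reject empty names and names with characters outside [a-z0-9-].
  case "$1" in
    ""|*[!a-z0-9-]*) echo "invalid MCP name: $1" >&2; return 1 ;;
  esac
  sed "s/__MCP_NAME__/$1/g" "$2"
}
```

For example, `render_manifest smci-ca25-amd mcp-template.yaml | oc apply -f -` would apply the rendered manifest directly, where mcp-template.yaml is a hypothetical file holding the MCP YAML shown in the next section.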

Pre-check

# Check node status
oc get nodes

# Check target node labels
oc get node <NODE_NAME> --show-labels

# List existing MCPs
oc get mcp

# Check current kernel parameters (record baseline before applying)
oc debug node/<NODE_NAME> -- chroot /host cat /proc/cmdline

Create Custom MachineConfigPool

A custom MCP allows kernel parameters to be applied only to selected nodes.

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: __MCP_NAME__
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values:
          - worker
          - __MCP_NAME__
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/__MCP_NAME__: ""

Note

If using NFD-based node identification, you can replace the nodeSelector with NFD-generated labels (see Use NFD Labels as MCP nodeSelector).

# Create MCP
oc apply -f <MCP_YAML_FILE>

# Assign role label to target node (skip if using NFD automatic identification)
oc label node <NODE_NAME> node-role.kubernetes.io/__MCP_NAME__=""

Verify MCP status:

oc get mcp __MCP_NAME__
# Confirm UPDATED=True, DEGRADED=False

Apply MachineConfig

MachineConfig Name Convention (99- prefix)

MCO sorts MachineConfigs lexicographically by name and merges them sequentially to produce the final rendered config. The numeric prefix determines merge priority:

| Prefix | Purpose |
|--------|---------|
| 00- | OpenShift defaults (e.g., 00-worker) |
| 01- | Base runtime/kubelet settings |
| 50- | Operational custom settings |
| 99- | User custom settings (merged last, overrides existing settings) |

Using the 99- prefix ensures that kernel parameters are merged after OpenShift default MCs and are reliably reflected in the final config.
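
The ordering described above can be previewed locally. This is a rough sketch that assumes the lexicographic sort is a plain byte-order comparison, which LC_ALL=C sort reproduces; the function name mc_merge_order is hypothetical.

```shell
# mc_merge_order NAME...
# Prints the given MachineConfig names in byte-order (C locale) sequence,
# one per line, which approximates the order in which they would be merged.
mc_merge_order() {
  printf '%s\n' "$@" | LC_ALL=C sort
}
```

Called with 99-worker-ssh, 00-worker, 99-grub-kargs-amd-smci-ca25, and 01-worker-kubelet, it prints 00-worker, 01-worker-kubelet, 99-grub-kargs-amd-smci-ca25, 99-worker-ssh. Note that within the 99- group the rest of the name still decides the order.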

Idempotency

MCO applies MachineConfigs declaratively. If the same MC is re-applied, or if an MC with identical kernel parameters already exists:

  • MCO compares the current rendered config with the new rendered config.
  • No node reboot occurs if there are no changes.
  • The MCP status remains UPDATED=True.

This means re-applying the same MC when the parameters are already in place is safely ignored without unnecessary reboots.

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-grub-kargs-amd-smci-ca25
  labels:
    machineconfiguration.openshift.io/role: __MCP_NAME__
spec:
  kernelArguments:
    - transparent_hugepage=madvise
    - pcie_aspm=force
    - pci=pcie_bus_perf
    - pci=bfsort
    - iommu.strict=1

Apply the MachineConfig:

oc apply -f <MC_YAML_FILE>

Monitor Rollout

After applying MachineConfig, MCO automatically cordons → drains → reboots the target node.

# Watch MCP status (wait until UPDATING=True → False)
watch oc get mcp __MCP_NAME__

# Watch node status (SchedulingDisabled → Ready)
watch oc get nodes

# Check rendered MC
oc get mc | grep rendered-__MCP_NAME__

Verify

Verify Kernel Parameters

oc debug node/<NODE_NAME> -- chroot /host cat /proc/cmdline

Confirm that the following parameters are present in the cmdline output:

  • transparent_hugepage=madvise
  • pcie_aspm=force
  • pci=pcie_bus_perf
  • pci=bfsort
  • iommu.strict=1
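
Checking five parameters by eye is error-prone, so the comparison can be scripted. The following is a minimal sketch; the function name missing_kargs is hypothetical, and it expects the cmdline as a single string without a trailing newline (command substitution strips it).

```shell
# missing_kargs CMDLINE
# Prints each required kernel argument that is absent from CMDLINE.
# Prints nothing when all five arguments are present.
missing_kargs() {
  for karg in transparent_hugepage=madvise pcie_aspm=force \
              pci=pcie_bus_perf pci=bfsort iommu.strict=1; do
    # Pad with spaces so only whole arguments match.
    case " $1 " in
      *" $karg "*) ;;
      *) printf '%s\n' "$karg" ;;
    esac
  done
}
```

For example, `missing_kargs "$(oc debug node/<NODE_NAME> -- chroot /host cat /proc/cmdline)"` should print nothing once the rollout has finished.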

Verify THP Status

oc debug node/<NODE_NAME> -- chroot /host cat /sys/kernel/mm/transparent_hugepage/enabled

Expected output:

always [madvise] never
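
The active mode is the entry in square brackets. If this check is automated, the bracketed value can be extracted as in the sketch below; the function name thp_mode is hypothetical.

```shell
# thp_mode ENABLED_LINE
# Prints the active THP mode, i.e. the word enclosed in [ ] in the contents
# of /sys/kernel/mm/transparent_hugepage/enabled.
thp_mode() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}
```

Given the expected output above, `thp_mode "always [madvise] never"` prints madvise.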

Verify Final MCP Status

oc get mcp __MCP_NAME__

Expected status:

| Field    | Expected |
|----------|----------|
| UPDATED  | True     |
| UPDATING | False    |
| DEGRADED | False    |
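
For scripted checks, the three fields can be tested together. This sketch assumes the row printed by oc get mcp has the columns NAME, CONFIG, UPDATED, UPDATING, DEGRADED in that order and that the CONFIG column is not empty; the function name mcp_is_healthy is hypothetical.

```shell
# mcp_is_healthy ROW
# Succeeds only if ROW (one data line of `oc get mcp` output) shows
# UPDATED=True, UPDATING=False, DEGRADED=False.
mcp_is_healthy() {
  printf '%s\n' "$1" |
    awk '{ exit !($3 == "True" && $4 == "False" && $5 == "False") }'
}
```

A possible invocation is `mcp_is_healthy "$(oc get mcp __MCP_NAME__ --no-headers)" && echo healthy`.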

Rollback

If issues arise, deleting the MachineConfig restores the original kernel parameters.

Warning

Follow the deletion order below strictly. Incorrect order may leave nodes in a Degraded state. (Recovery: see Degraded Recovery)

Rollback Procedure

Deletion order:

Delete MC → Wait for reboot to complete → Remove node label → Delete MCP

Step 1. Delete MachineConfig

oc delete mc 99-grub-kargs-amd-smci-ca25

Step 2. Wait for rollout to complete

The MCO re-renders the configuration without the kernel arguments and reboots the node.

Wait until the rollout is fully complete before proceeding:

watch oc get mcp __MCP_NAME__
# Confirm UPDATED=True, UPDATING=False, DEGRADED=False

Step 3. Confirm kernel parameters are removed

Confirm that the previously applied kernel parameters are no longer present:

oc debug node/<NODE_NAME> -- chroot /host cat /proc/cmdline
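
If the baseline cmdline recorded during the pre-check was saved, the rollback can be confirmed by comparing against it. The sketch below is one possible approach with a hypothetical function name, added_kargs; it splits the current cmdline on whitespace and reports every argument not found in the baseline. Any argument that legitimately changes between boots would also be listed and has to be judged by hand.

```shell
# added_kargs BASELINE CURRENT
# Prints every argument in CURRENT that does not appear in BASELINE.
# Prints nothing when CURRENT contains no new arguments.
added_kargs() {
  for arg in $2; do   # unquoted on purpose: split the cmdline on whitespace
    case " $1 " in
      *" $arg "*) ;;
      *) printf '%s\n' "$arg" ;;
    esac
  done
}
```

After a successful rollback, none of the five RBLN parameters should appear in the output.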

Cleanup

After rollback is complete, remove the node label and delete the MCP.

Step 4. Remove node label

oc label node <NODE_NAME> node-role.kubernetes.io/__MCP_NAME__-

Step 5. Delete MCP

oc delete mcp __MCP_NAME__

Troubleshooting

Degraded Recovery

Symptom: If the MCP or node label is deleted before the MC, the node's currentConfig annotation references an already-deleted rendered MachineConfig, leaving the node in a Degraded state.

Node <NODE_NAME> is reporting: "missing MachineConfig rendered-<MCP>-xxxxx"

Cause: When deleting an MC, MCO only re-renders for nodes belonging to that MCP. If the node has left the MCP before the MC is deleted, MCO has no target node to process, and the kernel parameters remain in place.

Recovery Procedure:

# 1. Recreate MCP
oc apply -f <MCP_YAML_FILE>

# 2. Re-assign label to node
oc label node <NODE_NAME> node-role.kubernetes.io/__MCP_NAME__=""

# 3. Get new rendered config name
oc get mcp __MCP_NAME__ -o jsonpath='{.spec.configuration.name}'

# 4. Force-update node's currentConfig annotation to new rendered config
oc annotate node <NODE_NAME> \
  machineconfiguration.openshift.io/currentConfig=<NEW_RENDERED_CONFIG> \
  machineconfiguration.openshift.io/desiredConfig=<NEW_RENDERED_CONFIG> \
  --overwrite

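# 5. Restart MCD pod to reset stale state
#    (select the pod by label and node name instead of grep, so that
#    nodes with similar names cannot match by accident)
MCD_POD=$(oc get pods -n openshift-machine-config-operator \
  -l k8s-app=machine-config-daemon \
  --field-selector spec.nodeName=<NODE_NAME> -o name)
oc delete "$MCD_POD" -n openshift-machine-config-operator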

# 6. Confirm MCP is healthy
watch oc get mcp __MCP_NAME__
# Confirm UPDATED=True, DEGRADED=False

# 7. After recovery, clean up in correct order (Rollback Procedure → Cleanup)