
Device Monitoring (rbln-smi)

rbln-smi is a command-line interface (CLI) utility for monitoring and managing RBLN NPUs. It supports:

  • NPU status monitoring (temperature, power, utilization, memory)
  • Context/process inspection
  • System topology inspection
  • RSD group management and group-level settings

rbln-smi is included in the RBLN Driver package. For the full, version-specific option reference, run rbln-smi --help.

Note

rbln-stat is deprecated and replaced by rbln-smi. Existing scripts using rbln-stat may still work, but new users should use rbln-smi.

Quick Start

$ rbln-smi

Expected output (example)

+-------------------------------------------------------------------------------------------------+
|                                 Device Information KMD ver: N/A                                 |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| NPU |    Name   | Device  |   PCI BUS ID  | Temp |  Power  | Perf |  Memory(used/total) |  Util |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| 0   | RBLN-CA12 | rbln0   |  0000:51:00.0 |  38C |  43.9W  | P2   |   2.4GB / 15.7GiB   |  98.7 |
| 1   | RBLN-CA12 | rbln1   |  0000:d8:00.0 |  25C |   6.1W  | P14  |    0.0B / 15.7GiB   |   0.0 |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------+
|                                       Context Information                                       |
+-----+---------------------+--------------+-----+----------+------+---------------------+--------+
| NPU | Process             |     PID      | CTX | Priority | PTID |            Memalloc | Status |
+-----+---------------------+--------------+-----+----------+------+---------------------+--------+
| 0   | python3             |   2928727    |  1  |   min    |  0   |              1.9GiB |  run   |
| 0   | python3             |   2930166    |  2  |   min    |  1   |            468.0MiB |  idle  |
| 0   | python3             |   2934705    |  3  |   min    |  2   |             88.0MiB |  idle  |
+-----+---------------------+--------------+-----+----------+------+---------------------+--------+

Key Concepts and Terminology

Device selection

  • Use -d, --device <ids> to target specific NPUs (comma-separated list or range).
  • The output identifies each NPU by its device label (the Device column), such as rbln0 or rbln1.
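
For scripted use, the device filter can be assembled programmatically. The following is a minimal sketch, not an official API: `build_rbln_smi_cmd` is a hypothetical helper name, and it assumes `-d` and `-j` can be combined in one invocation (both are listed as global options, but check `rbln-smi --help` for your version). Range syntax is left out because its exact form is version-specific.

```python
import shlex


def build_rbln_smi_cmd(device_ids=None, as_json=False):
    """Build an argv list for rbln-smi (hypothetical helper, not part of the tool)."""
    cmd = ["rbln-smi"]
    if device_ids:
        # -d takes a comma-separated list of NPU indices, for example "0,1".
        cmd += ["-d", ",".join(str(i) for i in device_ids)]
    if as_json:
        # -j switches the output to JSON.
        cmd.append("-j")
    return cmd


# Print the command line that would be executed.
print(shlex.join(build_rbln_smi_cmd([0, 1], as_json=True)))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.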

Output formats

  • Table (default): human-readable summary for devices and contexts.
  • JSON (-j): machine-readable output.
  • Query (-q): space-separated (CSV-like) output suitable for scripts.

Common columns and performance state

The default monitor output typically includes:

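| Column   | Meaning                                     |
|----------|---------------------------------------------|
| Name     | NPU product name (for example: RBLN-CA25).  |
| Power    | Power consumption.                          |
| Perf     | Performance state (P-state).                |
| Temp     | Temperature.                                |
| Util     | Utilization.                                |
| PID      | Process ID.                                 |
| CTX      | Context ID.                                 |
| Memalloc | Allocated memory.                           |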

For the authoritative product/family mapping and supported card list, see Support Matrix.

The Perf column reports a P-state. A common interpretation is:

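| P-state | Clock (Neural Engine) | PCIe        | Note                |
|---------|-----------------------|-------------|---------------------|
| P2      | Nominal               | Gen5        | -                   |
| P4      | Nominal               | Gen4        | -                   |
| P6      | Half                  | Gen4        | -                   |
| P10     | Half                  | (No update) | Thermal Throttling  |
| P12     | Minimal               | (No update) | System Abort (Hang) |
| P14     | Off                   | (No update) | Idle                |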
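
A monitoring script might encode the table above as a lookup. This is a sketch under the assumption that the table applies to your driver version; `PSTATE_TABLE`, `describe_pstate`, and `needs_attention` are illustrative names, and treating P10 and P12 as the states worth alerting on is a design choice made here, not something the tool prescribes.

```python
# P-state -> (clock, PCIe, note), transcribed from the table above.
# None stands for "(No update)" in the PCIe column or "-" in the Note column.
PSTATE_TABLE = {
    "P2": ("Nominal", "Gen5", None),
    "P4": ("Nominal", "Gen4", None),
    "P6": ("Half", "Gen4", None),
    "P10": ("Half", None, "Thermal Throttling"),
    "P12": ("Minimal", None, "System Abort (Hang)"),
    "P14": ("Off", None, "Idle"),
}


def describe_pstate(pstate):
    """Return a one-line description of a value from the Perf column."""
    key = pstate.strip().upper()
    if key not in PSTATE_TABLE:
        return f"{key}: unknown P-state"
    clock, pcie, note = PSTATE_TABLE[key]
    parts = [f"clock={clock}"]
    if pcie:
        parts.append(f"PCIe={pcie}")
    if note:
        parts.append(note)
    return f"{key}: " + ", ".join(parts)


def needs_attention(pstate):
    """Flag the throttling and hang states (an alerting policy assumed here)."""
    return pstate.strip().upper() in {"P10", "P12"}


for state in ("P2", "P10", "P14"):
    print(describe_pstate(state), "| attention:", needs_attention(state))
```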

Command Reference

Note

Some operations require sudo, including the group, tdr, timeout, and sort subcommands.

General usage

The basic invocation pattern is as follows:

$ rbln-smi [global options]
$ rbln-smi --topo [options]
$ sudo rbln-smi <subcommand> [arguments]

Global options

The following options are available across all command modes:

| Option                         | Description                                                     |
|--------------------------------|-----------------------------------------------------------------|
| -h, --help                     | Display help information. Subcommand help requires sudo.        |
| -b, --byte-format              | Display values in raw units instead of human-readable units.    |
| -j, --json                     | Render the result as JSON.                                      |
| -q, --query                    | Print data in a space-separated (CSV-like) format.              |
| -qd, --query-device <columns>  | Select specific device columns when using query mode.           |
| -qc, --query-context <columns> | Select specific context columns when using query mode.          |
| -t, --topo                     | Show device/system topology (kernel 6.2 or later recommended).  |
| -L, --list                     | List NPUs and their UUIDs.                                      |
| -d, --device <ids>             | Choose NPUs by comma-separated list or range.                   |
| -g, --group                    | Display output organized by RSD groups.                         |
| -v, --version                  | Print version information and exit.                             |

CLI Examples

Basic commands

Summary

Default view. Shows a snapshot of device and context information.

Command

$ rbln-smi [options]

Output (excerpt)

Monitor (excerpt)
Mon Nov 17 14:15:26 2025
+-------------------------------------------------------------------------------------------------+
|                        Device Information KMD ver: 2.1.0~dev.107+gafec0b9                       |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| NPU |    Name   | Device  |   PCI BUS ID  | Temp |  Power  | Perf |  Memory(used/total) |  Util |
+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0   | RBLN-CA25 | rbln0   |  0000:47:00.0 |  37C |  75.9W  |  P2  |  90.0MiB / 15.7GiB  |  50.0 |
| 1   |           | rbln1   |  0000:48:00.0 |  26C |         | P14  |    0.0B / 15.7GiB   |   0.0 |
+-------------------------------------------------------------------------------------------------+
|                                       Context Information                                       |
+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| NPU | Process             |     PID      |    CTX    | Priority | PTID |      Memalloc | Status |
+=====+=====================+==============+===========+==========+======+===============+========+
| 0   | command_submission  |    257082    |   10001   |   min    |  0   |       90.0MiB |  run   |

Summary

Prints the result in JSON format.

Command

$ rbln-smi -j

Output (excerpt)

JSON output (excerpt)
{
  "KMD_version": "2.1.0~dev.107+gafec0b9",
  "devices": [
    {
      "npu": 0,
      "name": "RBLN-CA25",
      "sid": "SAMPLE_SID_0000",
      "uuid": "11111111-2222-3333-4444-555555555555",
      "device": "rbln0",
      "temperature": "24C",
      "card_power": "41448540uW",
      "pstate": "P14",
      "memory": { "used": "0", "total": "16877879296" },
      "util": "0.0"
    }
  ],
  "contexts": []
}
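
The JSON form is the easiest to consume from a script. The sketch below parses the excerpt above and converts the string fields to numbers. It assumes the field names and unit suffixes (`uW`, `C`, byte counts as strings) match the excerpt, which may vary between driver versions; `strip_unit` and `summarize_devices` are illustrative names.

```python
import json

# Sample copied from the JSON excerpt above; real output may carry more fields.
SAMPLE_JSON = """
{
  "KMD_version": "2.1.0~dev.107+gafec0b9",
  "devices": [
    {
      "npu": 0,
      "name": "RBLN-CA25",
      "sid": "SAMPLE_SID_0000",
      "uuid": "11111111-2222-3333-4444-555555555555",
      "device": "rbln0",
      "temperature": "24C",
      "card_power": "41448540uW",
      "pstate": "P14",
      "memory": { "used": "0", "total": "16877879296" },
      "util": "0.0"
    }
  ],
  "contexts": []
}
"""


def strip_unit(text, unit):
    """Remove a trailing unit suffix, failing loudly if the format differs."""
    if not text.endswith(unit):
        raise ValueError(f"expected a value ending in {unit!r}, got {text!r}")
    return text[: -len(unit)]


def summarize_devices(report):
    """Convert the per-device string fields into numeric values."""
    summary = []
    for dev in report.get("devices", []):
        mem = dev["memory"]
        summary.append({
            "device": dev["device"],
            "pstate": dev["pstate"],
            "temp_c": int(strip_unit(dev["temperature"], "C")),
            # card_power is reported in microwatts in the excerpt.
            "power_w": int(strip_unit(dev["card_power"], "uW")) / 1_000_000,
            "mem_used_gib": int(mem["used"]) / 2**30,
            "mem_total_gib": int(mem["total"]) / 2**30,
            "util": float(dev["util"]),
        })
    return summary


for entry in summarize_devices(json.loads(SAMPLE_JSON)):
    print(entry)
```

In a live script, the sample string would be replaced by the standard output of `rbln-smi -j`, for example via `subprocess.run(["rbln-smi", "-j"], capture_output=True, text=True, check=True).stdout`.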

Summary

Prints the result in a space-separated (CSV-like) format.

Command

$ rbln-smi -q

Output (excerpt)

Query output (excerpt)
driver_version:
  2.1.0~dev.107+gafec0b9
devices:
 npu      name        sid             uuid                                  device status ...
   0 RBLN-CA25 SAMPLE_SID_0000 11111111-2222-3333-4444-555555555555          rbln0 normal ...
   1 RBLN-CA25 SAMPLE_SID_0001 66666666-7777-8888-9999-AAAAAAAAAAAA          rbln1 normal ...
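
Query output can be split on whitespace. The sketch below is a rough illustration only: it assumes a `devices:` line followed by one header row and one row per NPU, as in the excerpt above, and that no field contains spaces. The sample is cut to the six columns visible in the excerpt, because the remaining columns are elided there. `parse_query_devices` is an illustrative name.

```python
# Sample based on the query excerpt above, limited to the visible columns.
SAMPLE_QUERY = """\
driver_version:
  2.1.0~dev.107+gafec0b9
devices:
 npu      name        sid             uuid                                  device status
   0 RBLN-CA25 SAMPLE_SID_0000 11111111-2222-3333-4444-555555555555          rbln0 normal
   1 RBLN-CA25 SAMPLE_SID_0001 66666666-7777-8888-9999-AAAAAAAAAAAA          rbln1 normal
"""


def parse_query_devices(text):
    """Return one dict per device row found after the 'devices:' line."""
    lines = text.splitlines()
    try:
        start = [line.strip() for line in lines].index("devices:")
    except ValueError:
        return []
    if start + 1 >= len(lines):
        return []
    header = lines[start + 1].split()
    rows = []
    for line in lines[start + 2:]:
        fields = line.split()
        # Stop at the first line that does not match the header width,
        # such as a blank line or the start of another section.
        if len(fields) != len(header):
            break
        rows.append(dict(zip(header, fields)))
    return rows


for row in parse_query_devices(SAMPLE_QUERY):
    print(row["npu"], row["device"], row["status"])
```

Where the exact layout matters, the JSON output (-j) is likely the more robust choice, since it does not depend on column alignment.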

Summary

Restricts output to the specified NPUs only.

Command

$ rbln-smi -d 0,1

Output (excerpt)

Device filter (excerpt)
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| NPU |    Name   | Device  |   PCI BUS ID  | Temp |  Power  | Perf |  Memory(used/total) |  Util |
+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0   | RBLN-CA25 | rbln0   |  0000:05:00.0 |  24C |  41.6W  | P14  |    0.0B / 15.7GiB   |   0.0 |
| 1   |           | rbln1   |  0000:06:00.0 |  25C |         | P14  |    0.0B / 15.7GiB   |   0.0 |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+

Summary

Prints device/system topology information (distance matrix and related details).

Command

$ rbln-smi --topo [--device <ids>]

Output (excerpt)

Topology (example)
Hardware Topology
Device Distance  n0  n1  n2  n3
rbln0        n0   0   4   4   4
rbln1        n1   4   0   4   4
rbln2        n2   4   4   0   4
rbln3        n3   4   4   4   0
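
The distance matrix can be loaded into a nested dictionary for lookups. This sketch assumes the layout of the example above (a `Device Distance` header row, then one row per device) and that columns appear in the same order as rows; the meaning of the distance values is not interpreted here. `parse_topology` is an illustrative name.

```python
# Sample copied from the topology example above.
SAMPLE_TOPO = """\
Hardware Topology
Device Distance  n0  n1  n2  n3
rbln0        n0   0   4   4   4
rbln1        n1   4   0   4   4
rbln2        n2   4   4   0   4
rbln3        n3   4   4   4   0
"""


def parse_topology(text):
    """Return {source_device: {target_device: distance}} from --topo output."""
    lines = [line for line in text.splitlines() if line.strip()]
    header_idx = None
    for i, line in enumerate(lines):
        if line.split()[:2] == ["Device", "Distance"]:
            header_idx = i
            break
    if header_idx is None:
        return {}
    node_count = len(lines[header_idx].split()) - 2
    devices, matrix = [], []
    for line in lines[header_idx + 1:]:
        fields = line.split()
        # Each row is: device label, node label, then one distance per node.
        if len(fields) != node_count + 2:
            break
        devices.append(fields[0])
        matrix.append([int(value) for value in fields[2:]])
    # Assumption: column order matches row order (n0 is rbln0, and so on).
    return {
        src: dict(zip(devices, row))
        for src, row in zip(devices, matrix)
    }


topology = parse_topology(SAMPLE_TOPO)
print(topology["rbln0"]["rbln1"])
```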

Summary

Lists NPUs and their UUIDs.

Command

$ rbln-smi -L

Output (excerpt)

-L (excerpt)
NPU 0: UUID: 11111111-2222-3333-4444-555555555555 (example)
NPU 1: UUID: 66666666-7777-8888-9999-AAAAAAAAAAAA (example)
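
To map NPU indices to UUIDs in a script, the listing can be matched line by line. The sketch assumes each line has the `NPU <n>: UUID: <uuid>` shape shown above; the trailing `(example)` marker in the excerpt is ignored. `parse_npu_list` is an illustrative name.

```python
import re

# Sample copied from the -L excerpt above.
SAMPLE_LIST = """\
NPU 0: UUID: 11111111-2222-3333-4444-555555555555 (example)
NPU 1: UUID: 66666666-7777-8888-9999-AAAAAAAAAAAA (example)
"""

# An NPU index followed by a 36-character UUID (hex digits and hyphens).
LINE_RE = re.compile(r"^NPU\s+(\d+):\s+UUID:\s+([0-9A-Fa-f-]{36})")


def parse_npu_list(text):
    """Return {npu_index: uuid} for every line that matches the pattern."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            result[int(match.group(1))] = match.group(2)
    return result


print(parse_npu_list(SAMPLE_LIST))
```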

Subcommands (sudo)

Summary

Create or destroy RSD groups.

Command

$ sudo rbln-smi group [-c <group_id> -a <npu_ids>] [-d <group_id>]

Options

| Argument                 | Description                                                                                       |
|--------------------------|---------------------------------------------------------------------------------------------------|
| -c, --create <group_id>  | Create a new RSD group (use with -a). Specifying all assigns one group per device.                |
| -a, --attach <npu_ids>   | Attach NPUs (comma-separated) to the new group.                                                   |
| -d, --destroy <group_id> | Remove an RSD group. Specifying all removes all groups and merges them into the default group 0.  |

Example session

Group workflow (example)
$ sudo rbln-smi group -h
usage: rbln-smi group [-h] [-c GROUP_ID -a NPU_ID[,NPU_ID...]] [-d GROUP_ID]

$ sudo rbln-smi group -c 1 -a 0,1
# (no output on success)

$ rbln-smi --group
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU |    Name   | Device  |   PCI BUS ID  | Temp |  Power  | Perf |  Memory(used/total) |  Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 1   | 0   | RBLN-CA25 | rbln0   |  0000:05:00.0 |  23C |  41.0W  | P14  |    0.0B / 15.7GiB   |   0.0 |
| 1   | 1   |           | rbln1   |  0000:06:00.0 |  25C |         | P14  |    0.0B / 15.7GiB   |   0.0 |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+

$ sudo rbln-smi group -d 1
# (no output on success)

$ sudo rbln-smi group -c all
# Assigns one group per device

$ sudo rbln-smi group -d all
# Removes all groups and merges them into the default group 0
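
When group setup is automated, it can help to build the commands first and review them before running anything with sudo. The sketch below only constructs argument lists and never executes them. `group_create_cmd` and `group_destroy_cmd` are hypothetical helper names, and the validation rules (for example, rejecting `-a` together with `-c all`) are inferred from the session above rather than confirmed behavior of the tool.

```python
def group_create_cmd(group_id, npu_ids=None):
    """Build the argv for creating an RSD group (dry run, nothing is executed)."""
    cmd = ["sudo", "rbln-smi", "group", "-c", str(group_id)]
    if str(group_id) == "all":
        # "-c all" assigns one group per device and is shown without -a.
        if npu_ids:
            raise ValueError("do not pass NPU ids together with 'all'")
        return cmd
    if not npu_ids:
        raise ValueError("creating a specific group requires NPU ids for -a")
    return cmd + ["-a", ",".join(str(i) for i in npu_ids)]


def group_destroy_cmd(group_id):
    """Build the argv for destroying one RSD group, or every group with 'all'."""
    return ["sudo", "rbln-smi", "group", "-d", str(group_id)]


print(" ".join(group_create_cmd(1, [0, 1])))
print(" ".join(group_destroy_cmd(1)))
```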

Summary

Set the TDR value for an RSD group.

Command

$ sudo rbln-smi tdr -g <group_ids> -v <value>

Notes

On success, this command typically prints no output.

Output (example)

TDR set (example)
$ sudo rbln-smi tdr -g 1 -v 10
# (no output on success)

Summary

Adjust timeout values for an RSD group.

Command

$ sudo rbln-smi timeout -g <group_ids> -v <value>

Notes

On success, this command typically prints no output.

Output (example)

Timeout set (example)
$ sudo rbln-smi timeout -g 1 -v 10
# (no output on success)

Summary

Sort NPU devices by PCI BDF and rebind them.

Command

$ sudo rbln-smi sort

Notes

On success, this command typically prints no output.

Output (example)

Sort (example)
$ sudo rbln-smi sort
# (no output on success)

Troubleshooting

Permission denied / subcommand requires sudo

Run the command with sudo (for example: sudo rbln-smi group -h).

No devices are shown

  • Confirm the driver is installed and NPUs are detected.
  • Try rbln-smi -L to list devices.
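
The two checks above can be scripted as a quick preflight. This is a minimal sketch: `check_cli` is an illustrative name, and the assumption that a non-zero exit status or empty output from `rbln-smi -L` indicates a problem has not been verified against the tool.

```python
import shutil
import subprocess


def check_cli(binary="rbln-smi"):
    """Return a short diagnostic string describing whether NPUs can be listed."""
    path = shutil.which(binary)
    if path is None:
        return f"{binary} not found on PATH; check that the RBLN Driver package is installed"
    result = subprocess.run([path, "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: a non-zero status means the listing failed.
        return f"{binary} -L exited with status {result.returncode}: {result.stderr.strip()}"
    listing = result.stdout.strip()
    if not listing:
        return f"{binary} -L printed nothing; no NPUs appear to be detected"
    return f"ok: {len(listing.splitlines())} line(s) returned by {binary} -L"


if __name__ == "__main__":
    print(check_cli())
```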

Topology output is missing

If --topo output is unavailable or incomplete, check the kernel version (6.2 or later is recommended) and retry with specific devices via --device <ids>.

See also