Device Monitoring (rbln-smi)¶
rbln-smi is a command-line interface (CLI) utility for monitoring and managing RBLN NPUs. It supports:
- NPU status monitoring (temperature, power, utilization, memory)
- Context/process inspection
- System topology inspection
- RSD group management and group-level settings
rbln-smi is included in the RBLN Driver package. For the full, version-specific option reference, run rbln-smi --help.
Note
rbln-stat is deprecated and replaced by rbln-smi. Existing scripts using rbln-stat may still work, but new users should use rbln-smi.
Quick Start¶
Run rbln-smi with no arguments to print the default device and context tables.
Expected output (example)
+-------------------------------------------------------------------------------------------------+
| Device Information KMD ver: N/A |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| 0 | RBLN-CA12 | rbln0 | 0000:51:00.0 | 38C | 43.9W | P2 | 2.4GB / 15.7GiB | 98.7 |
| 1 | RBLN-CA12 | rbln1 | 0000:d8:00.0 | 25C | 6.1W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------+
| Context Information |
+-----+---------------------+--------------+-----+----------+------+---------------------+--------+
| NPU | Process | PID | CTX | Priority | PTID | Memalloc | Status |
+-----+---------------------+--------------+-----+----------+------+---------------------+--------+
| 0 | python3 | 2928727 | 1 | min | 0 | 1.9GiB | run |
| 0 | python3 | 2930166 | 2 | min | 1 | 468.0MiB | idle |
| 0 | python3 | 2934705 | 3 | min | 2 | 88.0MiB | idle |
+-----+---------------------+--------------+-----+----------+------+---------------------+--------+
Key Concepts and Terminology¶
Device selection¶
- Use `-d, --device <ids>` to target specific NPUs (comma-separated list or range).
- Output refers to device labels (the `Device` column) such as `rbln0`, `rbln1`.
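To make the relationship between a selection and the `Device` column concrete, the sketch below expands a selection string into device labels. The helper name `expand_device_ids` is hypothetical, the `start-end` spelling for ranges is an assumption (check `rbln-smi --help` for the syntax your version accepts), and the `rblnN` naming simply follows the sample output.

```shell
# Hypothetical helper: expand "0,2-4" into "rbln0 rbln2 rbln3 rbln4".
# Assumes ranges are written start-end; the tool's exact syntax may differ.
expand_device_ids() {
    out=""
    # Split the selection on commas, then handle each item or range.
    for part in $(echo "$1" | tr ',' ' '); do
        case $part in
            *-*)
                i=${part%-*}
                end=${part#*-}
                while [ "$i" -le "$end" ]; do
                    out="$out rbln$i"
                    i=$((i + 1))
                done
                ;;
            *)
                out="$out rbln$part"
                ;;
        esac
    done
    # Drop the leading space before printing.
    echo "${out# }"
}
```

A script could compare this list against the `Device` column to confirm that the selection matched the intended NPUs.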
Output formats¶
- Table (default): human-readable summary for devices and contexts.
- JSON (`-j`): machine-readable output.
- Query (`-q`): space-separated (CSV-like) output suitable for scripts.
Common columns and performance state¶
The default monitor output typically includes:
| Column | Meaning |
|---|---|
| `Name` | NPU product name (for example: RBLN-CA25). |
| `Power` | Power consumption. |
| `Perf` | Performance state (P-state). |
| `Temp` | Temperature. |
| `Util` | Utilization. |
| `PID` | Process ID. |
| `CTX` | Context ID. |
| `Memalloc` | Allocated memory. |
For the authoritative product/family mapping and supported card list, see Support Matrix.
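For quick checks where the table output is all that is at hand, the device rows can be split on `|`. The sketch below is one way to do that. The helper name `hot_npus` is hypothetical, and the column positions are taken from the sample output, so they may differ between versions. For anything long-lived, the JSON or query formats are likely a sturdier input.

```shell
# Hypothetical helper: read the default table on stdin and print
# "<device> <temp> <util>" for every NPU whose Util exceeds a threshold.
# Column positions follow the sample output and are an assumption.
hot_npus() {
    awk -F'|' -v min="$1" '
        # Device rows have 9 columns (11 fields) and a numeric NPU index.
        NF == 11 && $2 ~ /^ *[0-9]+ *$/ {
            gsub(/ /, "", $4); gsub(/ /, "", $6); gsub(/ /, "", $10)
            if ($10 + 0 > min + 0) print $4, $6, $10
        }
    '
}
```

Typical use would be `rbln-smi | hot_npus 50`. Context rows and border lines are skipped because they do not have nine columns with a numeric first cell.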
The Perf column reports a P-state. A common interpretation is:
| P-state | Clock (Neural Engine) | PCIe | Note |
|---|---|---|---|
| `P2` | Nominal | Gen5 | - |
| `P4` | Nominal | Gen4 | - |
| `P6` | Half | Gen4 | - |
| `P10` | Half | (No update) | Thermal Throttling |
| `P12` | Minimal | (No update) | System Abort (Hang) |
| `P14` | Off | (No update) | Idle |
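When a script only sees the raw `Perf` value, a small lookup can turn it into a readable description. The sketch below encodes the interpretation table as a `case` statement. The function name `describe_pstate` is hypothetical, and the mapping is only as authoritative as the table it copies.

```shell
# Hypothetical helper: describe a P-state using the common interpretation.
describe_pstate() {
    case $1 in
        P2)  echo "nominal clock, PCIe Gen5" ;;
        P4)  echo "nominal clock, PCIe Gen4" ;;
        P6)  echo "half clock, PCIe Gen4" ;;
        P10) echo "half clock, thermal throttling" ;;
        P12) echo "minimal clock, system abort (hang)" ;;
        P14) echo "clock off, idle" ;;
        *)   echo "unknown P-state: $1" ;;
    esac
}
```

A monitoring script might alert only on `P10` and `P12`, since those are the two states the table associates with a fault condition rather than normal load or idle.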
Command Reference¶
Note
Some operations require sudo, in particular subcommands such as group, tdr, timeout, and sort.
General usage¶
The basic invocation pattern is rbln-smi [options] for monitoring; subcommands take the form sudo rbln-smi <subcommand> [options].
Global options¶
The following options are available across all command modes:
| Option | Description |
|---|---|
| `-h, --help` | Display help information. Subcommand help requires sudo. |
| `-b, --byte-format` | Display values in raw units instead of human-readable units. |
| `-j, --json` | Render the result as JSON. |
| `-q, --query` | Print data in a space-separated (CSV-like) format. |
| `-qd, --query-device <columns>` | Select specific device columns when using query mode. |
| `-qc, --query-context <columns>` | Select specific context columns when using query mode. |
| `-t, --topo` | Show device/system topology (kernel 6.2 or later recommended). |
| `-L, --list` | List NPUs and their UUIDs. |
| `-d, --device <ids>` | Choose NPUs by comma-separated list or range. |
| `-g, --group` | Display output organized by RSD groups. |
| `-v, --version` | Print version information and exit. |
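Scripts often combine a device selection with an output format. The wrapper below is a minimal sketch of that pattern, assuming `-d` can be combined with `-j` or `-q` as the table suggests. The names `smi_report` and `RBLN_SMI` are hypothetical: `RBLN_SMI` is an illustrative override defined by this script, not a variable that rbln-smi reads.

```shell
# Hypothetical wrapper around rbln-smi for scripts.
# RBLN_SMI is an illustrative override; pointing it at "echo" prints the
# arguments instead of running the tool, which is handy for a dry run.
RBLN_SMI=${RBLN_SMI:-rbln-smi}

# smi_report FORMAT [DEVICE_IDS]
#   FORMAT: table | json | query
smi_report() {
    case $1 in
        table) fmt="" ;;
        json)  fmt="-j" ;;
        query) fmt="-q" ;;
        *) echo "unknown format: $1" >&2; return 2 ;;
    esac
    if [ -n "${2:-}" ]; then
        # $fmt is intentionally unquoted so an empty value adds no argument.
        "$RBLN_SMI" -d "$2" $fmt
    else
        "$RBLN_SMI" $fmt
    fi
}
```

For example, `smi_report json 0,1` would run `rbln-smi -d 0,1 -j`, while `RBLN_SMI=echo` turns the same call into a dry run.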
CLI Examples¶
Basic commands¶
Summary
Default view. Shows a snapshot of device and context information.
Command
$ rbln-smi
Output (excerpt)
Mon Nov 17 14:15:26 2025
+-------------------------------------------------------------------------------------------------+
| Device Information KMD ver: 2.1.0~dev.107+gafec0b9 |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0 | RBLN-CA25 | rbln0 | 0000:47:00.0 | 37C | 75.9W | P2 | 90.0MiB / 15.7GiB | 50.0 |
| 1 | | rbln1 | 0000:48:00.0 | 26C | | P14 | 0.0B / 15.7GiB | 0.0 |
+-------------------------------------------------------------------------------------------------+
| Context Information |
+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| NPU | Process | PID | CTX | Priority | PTID | Memalloc | Status |
+=====+=====================+==============+===========+==========+======+===============+========+
| 0 | command_submission | 257082 | 10001 | min | 0 | 90.0MiB | run |
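The context table can be filtered in the same spirit, for instance to list which processes currently have a running context. This is a sketch only: `contexts_with_status` is a hypothetical name, and the eight-column layout is assumed from the excerpt above.

```shell
# Hypothetical helper: read the default output on stdin and print
# "<npu> <pid> <process> <memalloc>" for contexts in the given status.
# The column order is an assumption based on the sample output.
contexts_with_status() {
    awk -F'|' -v want="$1" '
        # Context rows have 8 columns (10 fields) and a numeric NPU index.
        NF == 10 && $2 ~ /^ *[0-9]+ *$/ {
            # Trim the padding around each cell.
            for (i = 2; i <= 9; i++) gsub(/^ +| +$/, "", $i)
            if ($9 == want) print $2, $4, $3, $8
        }
    '
}
```

Typical use would be `rbln-smi | contexts_with_status run`.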
Summary
Prints the result in JSON format.
Command
$ rbln-smi -j
Output (excerpt)
Summary
Prints the result in a space-separated (CSV-like) format.
Command
$ rbln-smi -q
Output (excerpt)
Summary
Restricts output to the specified NPUs only.
Command
$ rbln-smi -d 0,1
Output (excerpt)
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0 | RBLN-CA25 | rbln0 | 0000:05:00.0 | 24C | 41.6W | P14 | 0.0B / 15.7GiB | 0.0 |
| 1 | | rbln1 | 0000:06:00.0 | 25C | | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
Summary
Prints device/system topology information (distance matrix and related details).
Command
$ rbln-smi -t
Output (excerpt)
Subcommands (sudo)¶
Summary
Create or destroy RSD groups.
Command
$ sudo rbln-smi group [-h] [-c GROUP_ID -a NPU_ID[,NPU_ID...]] [-d GROUP_ID]
Options
| Argument | Description |
|---|---|
| `-c, --create <group_id>` | Create a new RSD group (use with -a). Specifying all assigns one group per device. |
| `-a, --attach <npu_ids>` | Attach NPUs (comma-separated) to the new group. |
| `-d, --destroy <group_id>` | Remove an RSD group. Specifying all removes all groups and merges them into the default group 0. |
Example session
$ sudo rbln-smi group -h
usage: rbln-smi group [-h] [-c GROUP_ID -a NPU_ID[,NPU_ID...]] [-d GROUP_ID]
$ sudo rbln-smi group -c 1 -a 0,1
# (no output on success)
$ rbln-smi --group
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 1 | 0 | RBLN-CA25 | rbln0 | 0000:05:00.0 | 23C | 41.0W | P14 | 0.0B / 15.7GiB | 0.0 |
| 1 | 1 | | rbln1 | 0000:06:00.0 | 25C | | P14 | 0.0B / 15.7GiB | 0.0 |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
$ sudo rbln-smi group -d 1
# (no output on success)
$ sudo rbln-smi group -c all
# Assigns one group per device
$ sudo rbln-smi group -d all
# Removes all groups and merges them into the default group 0
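Because group changes need sudo and print nothing on success, it can help to validate arguments first and confirm the result afterwards. The sketch below wraps the create step from the session above. The names `create_rsd_group` and `DRY_RUN` are hypothetical, and the `all` form of `-c` is deliberately not handled here.

```shell
# Hypothetical helper: validate arguments, then create an RSD group.
# With DRY_RUN=1 (an illustrative variable) the command is printed, not run.
create_rsd_group() {
    group_id=$1
    npu_ids=$2
    case $group_id in
        ''|*[!0-9]*) echo "group id must be a non-negative integer" >&2; return 2 ;;
    esac
    case $npu_ids in
        ''|*[!0-9,]*) echo "NPU ids must be comma-separated integers" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sudo rbln-smi group -c $group_id -a $npu_ids"
    else
        # Show the grouped view afterwards, since success prints nothing.
        sudo rbln-smi group -c "$group_id" -a "$npu_ids" && rbln-smi --group
    fi
}
```

Running `DRY_RUN=1 create_rsd_group 1 0,1` prints the command that would be executed, which is a safe way to review a change before applying it.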
Summary
Set the TDR value for an RSD group.
Command
Notes
On success, this command typically prints no output.
Output (example)
Summary
Adjust timeout values for an RSD group.
Command
Notes
On success, this command typically prints no output.
Output (example)
Troubleshooting¶
Permission denied / subcommand requires sudo¶
Run the command with sudo (for example: sudo rbln-smi group -h).
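In scripts, one way to avoid this error is to decide up front whether a call needs elevation. The sketch below is illustrative: `needs_sudo` and `smi_prefix` are hypothetical names, and the subcommand list is copied from the note in the Command Reference, so it may not be exhaustive.

```shell
# Hypothetical helper: succeed if the given subcommand is known to need sudo.
needs_sudo() {
    case $1 in
        group|tdr|timeout|sort) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the command prefix to use for a subcommand or option.
smi_prefix() {
    if needs_sudo "$1" && [ "$(id -u)" -ne 0 ]; then
        echo "sudo rbln-smi"
    else
        echo "rbln-smi"
    fi
}
```

For example, `$(smi_prefix group) group -h` expands to `sudo rbln-smi group -h` for a non-root user and to `rbln-smi group -h` for root.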
No devices are shown¶
- Confirm the driver is installed and NPUs are detected.
- Try `rbln-smi -L` to list devices.
Topology output is missing¶
If --topo output is missing or incomplete, check the kernel version (6.2 or later is recommended) and try again with specific devices via --device <ids>.
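The kernel check can be scripted. The sketch below compares a release string against 6.2, the version the options table recommends for `--topo`. The function name `kernel_supports_topo` is hypothetical, and passing the check does not guarantee that topology output is available.

```shell
# Hypothetical helper: succeed if a kernel release string is at least 6.2.
# With no argument, the running kernel (uname -r) is checked.
kernel_supports_topo() {
    release=${1:-$(uname -r)}
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "${minor:-0}" -ge 2 ]; }
}
```

A script might run `kernel_supports_topo || echo "kernel older than 6.2; --topo may be incomplete"` before calling `rbln-smi -t`.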
See also¶
- rblnBandwidthLatencyTest: host-to-NPU and NPU-to-NPU performance benchmark
- rblnvs: system validation
- rbln-flash: firmware update tool