Management Tools¶
The RBLN SDK provides a set of command-line tools for managing, monitoring, and maintaining RBLN NPUs. These tools enable system administrators, developers, and operators to manage RBLN hardware reliably in production environments.
Available Tools¶
The RBLN management toolset consists of the following utilities:
| Tool | Purpose | Primary Use Cases |
|---|---|---|
| rbln-smi | System Management Interface | Device monitoring, resource tracking, performance management, process inspection |
| rblnBandwidthLatencyTest | Performance Testing | Bandwidth measurement, latency testing, topology validation, system benchmarking |
| rbln-bios | Validation Suite | GRUB/BIOS/fan speed verification |
| rbln-flash | Firmware Update Utility | CP/MCU firmware updates |
| rbln-vs | Hardware Diagnostics | Software integrity, hardware presence, PCIe bandwidth, NPU memory validation |
| RSMD | System Management Daemon | Background device monitoring, gRPC API service, event logging |
Tool Categories¶
Monitoring and Management¶
rbln-smi is the primary tool for real-time monitoring and management of RBLN NPUs. It provides device information (hardware, PCI topology), performance metrics (power, temperature, utilization), process tracking, and resource management through RSD groups. It supports multiple output formats, including human-readable tables, JSON, and CSV.
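As a minimal sketch of consuming JSON output from a script, the following Python parses a device report and lists devices above a temperature limit. The payload layout (a devices list with name, temperature_c, and utilization_pct fields) is an assumption made for illustration, not the documented rbln-smi schema; check the actual output before adapting it.

```python
import json

# Hypothetical sample payload; the real rbln-smi JSON schema may differ.
SAMPLE = """
{"devices": [
  {"name": "rbln0", "temperature_c": 52.0, "utilization_pct": 87.5},
  {"name": "rbln1", "temperature_c": 81.0, "utilization_pct": 12.0}
]}
"""


def hot_devices(report_text, limit_c):
    """Return the names of devices whose temperature exceeds limit_c."""
    report = json.loads(report_text)
    return [d["name"] for d in report["devices"] if d["temperature_c"] > limit_c]


if __name__ == "__main__":
    # Print the devices in the sample that run hotter than 75 degrees C.
    print(hot_devices(SAMPLE, limit_c=75.0))
```

Keeping the parsing in a function that takes text, rather than invoking the tool directly, makes the logic easy to test against a saved report.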
Performance and Validation¶
rblnBandwidthLatencyTest measures data transfer performance across host and device memory, reporting bandwidth for host-to-device (H2D), device-to-device (D2D), and device-to-host (D2H) transfers as well as latency. It is useful for system validation, performance benchmarking, and topology verification.
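To make the two quantities concrete, the sketch below shows how bandwidth and latency relate to a single transfer. It illustrates the arithmetic only and is not how rblnBandwidthLatencyTest computes its results; the function names and the simple latency-plus-size model are assumptions.

```python
def bandwidth_gb_per_s(num_bytes, seconds):
    """Effective bandwidth in GB/s, using 1 GB = 10**9 bytes."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / seconds / 1e9


def transfer_time_s(num_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimate transfer time as a fixed latency plus size divided by bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + num_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # 4 GB moved in 0.5 s corresponds to 8 GB/s of effective bandwidth.
    print(bandwidth_gb_per_s(4e9, 0.5))
    # Small transfers are dominated by latency, large ones by bandwidth.
    print(transfer_time_s(4096, 10e-6, 8e9))
    print(transfer_time_s(4e9, 10e-6, 8e9))
```

Under this model, a benchmark needs both small and large transfer sizes to separate the latency term from the bandwidth term.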
rbln-bios verifies system configuration including GRUB parameters, BIOS settings (IOMMU, SR-IOV, PCIe, NUMA), and fan speed.
Limitations
BIOS and fan speed validation require BMC access via the Redfish API and are currently supported on Supermicro servers with a DCMS license.
rbln-vs is a hardware and software diagnostic tool that validates NPU health through a tiered test suite (L1/L2). L1 tests check software integrity and hardware presence; L2 tests stress PCIe bandwidth, NPU memory integrity, and memory bandwidth stability.
Firmware Update¶
rbln-flash manages CP and MCU firmware updates for RBLN NPUs. It supports parallel updates across multiple NPUs with per-device status tracking. The RBLN driver must be unloaded before a firmware update is started.
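A pre-flight check for the driver requirement could look like the following sketch, which scans /proc/modules on Linux for a given kernel module name. The module name used here (rebellions) is an assumption for illustration; confirm the actual name on your system, for example with lsmod, before relying on it.

```python
from pathlib import Path

# Assumed module name for illustration only; verify it on the target system.
ASSUMED_MODULE_NAME = "rebellions"


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules content."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def driver_is_loaded(module_name, proc_modules_text=None):
    """Report whether module_name is currently loaded."""
    if proc_modules_text is None:
        # Each line of /proc/modules starts with the module name.
        proc_modules_text = Path("/proc/modules").read_text()
    return module_name in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    if driver_is_loaded(ASSUMED_MODULE_NAME):
        print("Driver is still loaded; unload it before updating firmware.")
    else:
        print("Driver is not loaded.")
```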
System Services¶
RSMD (Rebellions System Management Daemon) is a background service that provides centralized device management through a gRPC API. The daemon monitors kernel events via netlink, collects device telemetry (temperature, power, memory, utilization), and maintains event history as CSV logs.
RSMD includes:
- the rbln-smdi CLI tool for interactive device management
- the rbln_daemon systemd service for automatic startup
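To confirm from a script that the service is running, one option is to wrap the standard systemctl is-active command, as in the sketch below. The unit name rbln_daemon is taken from the list above; whether it needs a .service suffix or a different name on a given installation is an assumption to verify.

```python
import subprocess


def service_state(unit, run=subprocess.run):
    """Return the state string printed by `systemctl is-active <unit>`."""
    # The runner is injectable so the function can be tested without systemd.
    result = run(["systemctl", "is-active", unit], capture_output=True, text=True)
    return result.stdout.strip()


def service_is_active(unit, run=subprocess.run):
    """True only when systemd reports the unit as active."""
    return service_state(unit, run=run) == "active"


if __name__ == "__main__":
    # Unit name as given in this document; adjust if your installation differs.
    print(service_state("rbln_daemon"))
```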