NPU Allocation
RBLN NPUs can be used inside Docker containers for controlled deployment and resource management. On servers with multiple NPUs, each container can be configured to use only specific NPUs, so resources are allocated explicitly per container.
Rebellions Scalable Design (RSD)
Rebellions Scalable Design (RSD) is a technology that enables efficient management and utilization of multiple RBLN NPUs. RSD provides:
- Device Grouping: NPUs are organized into groups for better resource management
- Tensor Parallelism: Large language models can be distributed across multiple NPUs within the same group
- Fault Isolation: Separate groups prevent issues in one group from impacting others
- Independent Scheduling: Each group can be managed independently for optimal performance
All RBLN NPUs operate through RSD, whether used individually or in multi-NPU configurations. This means that even a single NPU container benefits from RSD's management capabilities.
Note
The rbln-smi tool for managing and monitoring RBLN NPUs is included in the RBLN driver package. When using the RBLN Container Toolkit, rbln-smi is automatically available inside containers without manual mounting.
RBLN Container Toolkit simplifies container access to RBLN NPUs using the Container Device Interface (CDI) specification.
For installation, CLI reference, Kubernetes deployment, and troubleshooting, see the RBLN Container Toolkit Guide.
Quick Setup
After installing the toolkit, configure and run:
| # Configure
$ sudo rbln-ctk cdi generate
$ sudo rbln-ctk runtime configure
$ sudo systemctl restart docker
# Run container with NPU access
$ docker run \
--device rebellions.ai/npu=all \
-it IMAGE_NAME:TAG
|
With the toolkit configured, all containers using --device rebellions.ai/npu=all will have RBLN libraries and tools (including rbln-smi) automatically available.
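As a quick check that the toolkit is wired up, the following minimal sketch starts a throwaway container with all NPUs attached and runs rbln-smi inside it. The function name `check_npu_tools` and the `DOCKER` override variable are hypothetical helpers introduced for illustration; the sketch also assumes the image does not define an entrypoint that swallows the command.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to print the command instead of running it.
# This variable is an illustrative convention, not part of the toolkit.
DOCKER="${DOCKER:-docker}"

# check_npu_tools <image>
# Runs rbln-smi in a temporary container that has every NPU attached.
# --rm removes the container once rbln-smi exits.
check_npu_tools() {
  image="$1"
  $DOCKER run --rm --device rebellions.ai/npu=all "$image" rbln-smi
}

# Usage:
#   check_npu_tools IMAGE_NAME:TAG
```

If the toolkit is configured as described, rbln-smi should run without any manual mounts; a "command not found" error would suggest the CDI spec or runtime configuration needs another look.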
Container Creation Guidelines
This section summarizes common requirements and options for using RBLN NPUs with Docker containers.
RSD Group Management
Check RSD Group Configuration
| $ rbln-smi -g
+-------------------------------------------------------------------------------------------------------+
| RSD Management Group KMD ver: 2.0.1 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0 | 0 | RBLN-CA22 | rbln0 | 0000:1b:00.0 | 37C | 19.5W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 1 | RBLN-CA22 | rbln1 | 0000:1c:00.0 | 36C | 19.2W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
| RSD Context Information |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process | PID | CTX | Priority | PTID | Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
|
By default, all NPUs are shown under group 0 (as indicated in the Grp column). For optimal container isolation and performance, create a separate RSD group for each container.
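When scripting around group assignments, it can help to reduce the table to a simple device-to-group mapping. The sketch below is one way to do that with awk. It assumes the pipe-separated layout shown above (Grp, NPU, Name, Device, ...), which may differ between driver versions, and the function name `list_rsd_groups` is hypothetical.

```shell
# list_rsd_groups
# Reads `rbln-smi -g` output on stdin and prints "<device> <group>" for each NPU.
# Assumption: columns are pipe-separated in the order Grp, NPU, Name, Device, ...
list_rsd_groups() {
  awk -F'|' '
    {
      grp = $2; dev = $5
      gsub(/[ \t]/, "", grp)
      gsub(/[ \t]/, "", dev)
      # Skip borders, headers, and the context table, whose
      # fourth column holds a PID or N/A rather than rblnN.
      if (dev !~ /^rbln[0-9]+$/) next
      # A blank Grp cell means the row belongs to the group of the row above.
      if (grp != "") current = grp
      print dev, current
    }
  '
}

# Usage on a host with the RBLN driver installed:
#   rbln-smi -g | list_rsd_groups
```

For the two-NPU output shown above, this would list rbln0 and rbln1 both under group 0.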
Create an RSD Group
Create dedicated RSD groups for your containers:
| $ sudo rbln-smi group -c <group_id> -a <device_id_list>
|
Delete an RSD Group
You can destroy a group you created with the following command:
| $ sudo rbln-smi group -d <group_id>
|
Regenerate the CDI spec after group changes
Both rbln-smi group -c and rbln-smi group -d change the host's /dev/rsd* topology, which the CDI spec captured at the last rbln-ctk cdi generate run. Rerun sudo rbln-ctk cdi generate after the change if any container will use the rebellions.ai/npu=N handle; the toolkit's automatic refresh paths react only to driver and library changes, not to rbln-smi group operations.
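Because the CDI refresh is easy to forget, the two steps can be paired in a small wrapper. The following is a minimal sketch, not part of the toolkit: the function names and the `RUN` dry-run variable are hypothetical, and only the commands already shown in this section are invoked.

```shell
# Set RUN=echo to print the commands instead of executing them (dry run).
# RUN is an illustrative convention introduced for this sketch.
RUN="${RUN:-}"

# create_group_and_refresh <group_id> <device_id_list>
# Creates an RSD group, then regenerates the CDI spec so that
# rebellions.ai/npu=N handles match the new /dev/rsd* topology.
create_group_and_refresh() {
  group_id="$1"
  device_ids="$2"
  $RUN sudo rbln-smi group -c "$group_id" -a "$device_ids" || return 1
  $RUN sudo rbln-ctk cdi generate
}

# delete_group_and_refresh <group_id>
# Destroys an RSD group, then regenerates the CDI spec.
delete_group_and_refresh() {
  group_id="$1"
  $RUN sudo rbln-smi group -d "$group_id" || return 1
  $RUN sudo rbln-ctk cdi generate
}

# Usage:
#   RUN=echo; create_group_and_refresh 1 4,5,6,7   # preview only
#   RUN=;     create_group_and_refresh 1 4,5,6,7   # actually run
```

The CDI spec is regenerated only if the group command succeeds, so a failed group change does not trigger a needless refresh.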
Accessing NPUs in Containers
Containers access NPUs through the --device rebellions.ai/npu=<handle> flag. For the available handles (=all, =N), single- and multi-NPU patterns, and runtime-specific behavior, see Device Selection.
Allocate a Single RBLN NPU
This section demonstrates how to configure a Docker container to use a single RBLN NPU with proper RSD group isolation.
Important
Before proceeding, read the Container Creation Guidelines above to understand the common requirements and options referenced in this step-by-step guide.
Step 1: Create an RSD Group
Create a separate RSD group for your single NPU container:
| $ sudo rbln-smi group -c 1 -a 1
|
This command creates group 1 and assigns rbln1 to it, removing it from the default group (group 0).
Step 2: Verify RSD Group Configuration
Run rbln-smi -g to verify that the NPU is now in its dedicated group. The output should show the NPU in its own group:
| +-------------------------------------------------------------------------------------------------------+
| RSD Management Group KMD ver: 2.0.1 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0 | 0 | RBLN-CA22 | rbln0 | 0000:1b:00.0 | 35C | 18.4W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| 1 | 0 | RBLN-CA22 | rbln1 | 0000:1c:00.0 | 34C | 18.2W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
| RSD Context Information |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process | PID | CTX | Priority | PTID | Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
|
Notice that rbln1 is now in group 1, separate from rbln0 in group 0.
Step 3: Run Docker Container with Single NPU
Run a Docker container with access to the dedicated NPU:
| $ docker run \
--device rebellions.ai/npu=1 \
-it IMAGE_NAME:TAG
|
The container now has exclusive access to rbln1 through RSD group 1, isolated from any workload running on group 0.
Allocate Multiple RBLN NPUs
This section demonstrates how to configure Docker containers to use multiple RBLN NPUs. Multiple NPUs can be used for large language models that leverage tensor parallelism.
A list of models that support RSD can be found in Optimum RBLN Multi-NPU Supported Models.
Container Allocation Strategy
You can create multiple containers, each with a dedicated NPU group. For example, with 8 RBLN NPUs, you can create 2 containers of 4 NPUs each, with each set of 4 NPUs assigned to its own RSD group.
Important
Before proceeding, read the Container Creation Guidelines above to understand the common requirements and options referenced in this step-by-step guide.
Note
A single RBLN-CA25 NPU card consists of 4 devices. Check the device list with rbln-smi and allocate all 4 devices of a card to the same container.
Step 1: Create an RSD Group
Ensure all NPUs that will work together in a single container are in the same RSD group.
For example, if you have 8 devices and want to create a container with 4 NPUs:
| $ sudo rbln-smi group -c 1 -a 4,5,6,7
|
Step 2: Verify Group Configuration
Run rbln-smi -g to verify that the NPUs are properly grouped. The output should show the NPUs organized in separate groups:
| +-------------------------------------------------------------------------------------------------------+
| RSD Management Group KMD ver: 2.0.1 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0 | 0 | RBLN-CA22 | rbln0 | 0000:1b:00.0 | 38C | 19.5W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 1 | RBLN-CA22 | rbln1 | 0000:1c:00.0 | 37C | 19.3W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 2 | RBLN-CA22 | rbln2 | 0000:1f:00.0 | 36C | 18.5W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 3 | RBLN-CA22 | rbln3 | 0000:22:00.0 | 39C | 19.6W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| 1 | 0 | RBLN-CA22 | rbln4 | 0000:41:00.0 | 36C | 17.9W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 1 | RBLN-CA22 | rbln5 | 0000:42:00.0 | 36C | 21.5W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 2 | RBLN-CA22 | rbln6 | 0000:45:00.0 | 40C | 20.6W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 3 | RBLN-CA22 | rbln7 | 0000:48:00.0 | 36C | 17.9W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
| RSD Context Information |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process | PID | CTX | Priority | PTID | Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
|
Notice that rbln4~rbln7 are now in group 1, separate from rbln0~rbln3 in group 0.
Step 3: Run Docker Containers with Multiple NPUs
Container 1: Assign rbln0~rbln3 (Group 0)
| $ docker run \
--device rebellions.ai/npu=0 \
--device rebellions.ai/npu=1 \
--device rebellions.ai/npu=2 \
--device rebellions.ai/npu=3 \
-it IMAGE_NAME:TAG
|
Container 2: Assign rbln4~rbln7 (Group 1)
| $ docker run \
--device rebellions.ai/npu=4 \
--device rebellions.ai/npu=5 \
--device rebellions.ai/npu=6 \
--device rebellions.ai/npu=7 \
-it IMAGE_NAME:TAG
|
In this setup, each Docker container can utilize 4 RBLN NPUs:
- Container 1: Uses rbln0~rbln3 through RSD group 0 (/dev/rsd0)
- Container 2: Uses rbln4~rbln7 through RSD group 1 (/dev/rsd1)
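Repeating the --device flag by hand becomes error-prone as the number of NPUs grows. As a hedged sketch, a small helper can expand a list of device IDs into the flags used above. The function name `npu_device_flags` is hypothetical, and the helper only builds the argument string; it does not check that the IDs belong to the same RSD group.

```shell
# npu_device_flags <id> [<id> ...]
# Prints one "--device rebellions.ai/npu=<id>" flag per device ID,
# space-separated on a single line. Rejects non-numeric IDs.
npu_device_flags() {
  flags=""
  for id in "$@"; do
    case "$id" in
      ''|*[!0-9]*)
        echo "invalid device id: $id" >&2
        return 1
        ;;
    esac
    # Append with a separating space only when flags is already non-empty.
    flags="${flags:+$flags }--device rebellions.ai/npu=$id"
  done
  printf '%s\n' "$flags"
}

# Usage (the substitution is left unquoted on purpose so the shell
# splits it into separate arguments):
#   docker run $(npu_device_flags 4 5 6 7) -it IMAGE_NAME:TAG
```

With the IDs 4 5 6 7, the helper produces the same four flags as the Container 2 command above.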