NPU Allocation
RBLN NPUs can be used inside Docker containers for controlled deployment and resource management. On servers with multiple NPUs, each container can be restricted to specific NPUs so that resources are allocated efficiently.
Rebellions Scalable Design (RSD)
Rebellions Scalable Design (RSD) is a technology that enables efficient management and utilization of multiple RBLN NPUs. RSD provides:
- Device Grouping: NPUs are organized into groups for better resource management
- Tensor Parallelism: Large language models can be distributed across multiple NPUs within the same group
- Fault Isolation: Separate groups prevent issues in one group from impacting others
- Independent Scheduling: Each group can be managed independently for optimal performance
All RBLN NPUs operate through RSD, whether used individually or in multi-NPU configurations. This means that even a single NPU container benefits from RSD's management capabilities.
Note
The rbln-smi tool for managing and monitoring RBLN NPUs is included in the RBLN driver package. When using the RBLN Container Toolkit, rbln-smi is automatically available inside containers without manual mounting.
The RBLN Container Toolkit simplifies container access to RBLN NPUs using the Container Device Interface (CDI) specification.
For installation, CLI reference, Kubernetes deployment, and troubleshooting, see the RBLN Container Toolkit Guide.
Quick Setup
After installing the toolkit, configure and run:
| # Configure
$ sudo rbln-ctk cdi generate
$ sudo rbln-ctk runtime configure
$ sudo systemctl restart docker
# Run container with NPU access
$ docker run --device rebellions.ai/npu=runtime -it IMAGE_NAME:TAG
|
With the toolkit configured, all containers using --device rebellions.ai/npu=runtime will have RBLN libraries and tools (including rbln-smi) automatically available.
Container Creation Guidelines
This section summarizes common requirements and options for using RBLN NPUs with Docker containers.
RSD Group Management
Check RSD Group Configuration
| $ rbln-smi -g
+-------------------------------------------------------------------------------------------------------+
| RSD Management Group KMD ver: 2.0.1 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0 | 0 | RBLN-CA22 | rbln0 | 0000:1b:00.0 | 37C | 19.5W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 1 | RBLN-CA22 | rbln1 | 0000:1c:00.0 | 36C | 19.2W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
| RSD Context Information |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process | PID | CTX | Priority | PTID | Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
|
By default, all NPUs are shown under group 0 (as indicated in the Grp column). For container isolation and predictable performance, create a separate RSD group for each container.
Create an RSD Group
Create dedicated RSD groups for your containers:
| $ sudo rbln-smi group -c <group_id> -a <device_id_list>
|
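As an illustration of how the placeholders are filled in, the sketch below assembles the command from a group ID and a list of device IDs. The helper name rsd_create_cmd is hypothetical and not part of rbln-smi; the comma-separated form of the device list is taken from the multi-NPU example later in this guide. The helper only prints the command, so the result can be reviewed before it is run.

```shell
# Hypothetical helper (not part of rbln-smi): print, but do not run, the
# group-creation command for a group ID followed by one or more device IDs.
rsd_create_cmd() (
    [ "$#" -ge 2 ] || { echo 'usage: rsd_create_cmd GROUP_ID DEVICE_ID...' >&2; exit 1; }
    group_id=$1
    shift
    ids=$(printf '%s,' "$@")    # e.g. "4,5,6,7," (trailing comma removed below)
    printf 'sudo rbln-smi group -c %s -a %s\n' "$group_id" "${ids%,}"
)

rsd_create_cmd 1 4 5 6 7
```

With the arguments shown, the helper prints sudo rbln-smi group -c 1 -a 4,5,6,7.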
Delete an RSD Group
Delete a group you created with the following command:
| $ sudo rbln-smi group -d <group_id>
|
Accessing NPUs in Containers
With the RBLN Container Toolkit installed and configured, you can run containers with full NPU access:
| $ docker run --device rebellions.ai/npu=runtime -ti IMAGE_NAME:TAG
|
This single flag provides:
- Access to all RBLN NPU devices
- RBLN libraries automatically mounted
- rbln-smi and other tools available
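To allocate specific NPUs to a container using RSD groups, specify the devices explicitly. In the command below, X is a placeholder for the host-side group ID (in /dev/rsdX) and for the host-side device index (in /dev/rblnX); the two values need not be equal: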
| $ docker run --device /dev/rsdX:/dev/rsd0 \
--device /dev/rblnX:/dev/rbln0 \
--device rebellions.ai/npu=runtime \
-ti IMAGE_NAME:TAG
|
The purpose of each option is as follows:
--device /dev/rsdX:/dev/rsd0
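- Exposes the RSD group interface to the container. The host group node /dev/rsdX appears as /dev/rsd0 inside the container.
--device /dev/rblnX:/dev/rbln0
- Exposes one RBLN NPU device to the container. Repeat the flag for each NPU in the group.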
--device rebellions.ai/npu=runtime
- Mounts RBLN libraries and tools via CDI.
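The three options can be combined mechanically. The sketch below is a hypothetical helper (rbln_device_flags is not provided by the toolkit) that prints the flags for one RSD group. It assumes, following the examples in this guide, that the group node is mapped to /dev/rsd0 and that the NPUs are renumbered from 0 inside the container.

```shell
# Hypothetical helper: print the --device flags for one RSD group.
# $1 is the host group ID; the remaining arguments are host NPU indices.
# Assumption: inside the container the group appears as /dev/rsd0 and the
# NPUs are renumbered /dev/rbln0, /dev/rbln1, ... in the order given.
rbln_device_flags() (
    group_id=$1
    shift
    flags="--device /dev/rsd${group_id}:/dev/rsd0"
    i=0
    for host_idx in "$@"; do
        flags="$flags --device /dev/rbln${host_idx}:/dev/rbln${i}"
        i=$((i + 1))
    done
    # The CDI device mounts the RBLN libraries and tools.
    printf '%s --device rebellions.ai/npu=runtime\n' "$flags"
)

rbln_device_flags 1 4 5 6 7
```

The printed flags can be passed to Docker through an unquoted command substitution, for example docker run $(rbln_device_flags 1 4 5 6 7) -ti IMAGE_NAME:TAG.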
Allocate a Single RBLN NPU
This section demonstrates how to configure a Docker container to use a single RBLN NPU with proper RSD group isolation.
Important
Before proceeding, read the Container Creation Guidelines above to understand the common requirements and options referenced in this step-by-step guide.
Step 1: Create an RSD Group
Create a separate RSD group for your single NPU container:
| $ sudo rbln-smi group -c 1 -a 1
|
This command creates group 1 and assigns rbln1 to it, removing it from the default group (group 0).
Step 2: Verify RSD Group Configuration
Run rbln-smi -g again to verify that the NPU is now in its dedicated group.
Your output should show the NPU in its own group:
| +-------------------------------------------------------------------------------------------------------+
| RSD Management Group KMD ver: 2.0.1 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0 | 0 | RBLN-CA22 | rbln0 | 0000:1b:00.0 | 35C | 18.4W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| 1 | 0 | RBLN-CA22 | rbln1 | 0000:1c:00.0 | 34C | 18.2W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
| RSD Context Information |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process | PID | CTX | Priority | PTID | Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
|
Notice that rbln1 is now in group 1, separate from rbln0 in group 0.
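Before starting the container, it can also help to confirm that the device nodes you are about to pass to Docker exist on the host. The sketch below uses a hypothetical helper, missing_nodes, that prints every path that is absent. It assumes that creating group 1 makes /dev/rsd1 available on the host, as the docker run command in the next step implies.

```shell
# Hypothetical helper: print each given path that does not exist.
# No output means every path is present.
missing_nodes() (
    for node in "$@"; do
        [ -e "$node" ] || printf '%s\n' "$node"
    done
)

# Usage on the host for the single-NPU example:
#   missing_nodes /dev/rsd1 /dev/rbln1
```

If the helper prints a path, re-check the group configuration with rbln-smi -g before running the container.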
Step 3: Run Docker Container with Single NPU
Run a Docker container with access to the dedicated NPU:
| $ docker run --device /dev/rsd1:/dev/rsd0 \
--device /dev/rbln1:/dev/rbln0 \
--device rebellions.ai/npu=runtime \
-ti IMAGE_NAME:TAG
|
The container now has exclusive access to rbln1 through RSD group 1, isolated from workloads in group 0. Inside the container, the group appears as /dev/rsd0 and the NPU as /dev/rbln0.
Allocate Multiple RBLN NPUs
This section demonstrates how to configure Docker containers to use multiple RBLN NPUs. Multiple NPUs can be used for large language models that leverage tensor parallelism.
A list of models that support RSD can be found in Optimum RBLN Multi-NPU Supported Models.
Container Allocation Strategy
You can create multiple containers, each with a dedicated NPU group. For example, with 8 RBLN NPUs, you can create 2 containers, each with 4 NPUs assigned to its own RSD group.
Important
Before proceeding, read the Container Creation Guidelines above to understand the common requirements and options referenced in this step-by-step guide.
Note
A single RBLN-CA25 NPU card consists of 4 devices. Check the device list with rbln-smi and allocate all 4 devices of a card to the same container.
Step 1: Create an RSD Group
Ensure all NPUs that will work together in a single container are in the same RSD group.
For example, if you have 8 devices and want to create a container with 4 NPUs:
| $ sudo rbln-smi group -c 1 -a 4,5,6,7
|
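The command above moves devices 4 to 7 into group 1 and leaves devices 0 to 3 in the default group 0. For other device counts, the same split can be planned with a short script. The sketch below is a hypothetical planner (plan_rsd_groups is not part of rbln-smi) that prints one creation command per additional group without running it. It assumes that device IDs are consecutive from 0 and that group IDs above 1 are accepted in the same way as group 1.

```shell
# Hypothetical planner: split device IDs 0..TOTAL-1 into consecutive groups of
# SIZE devices. The first SIZE devices stay in default group 0; a creation
# command is printed (not run) for every later, complete group.
plan_rsd_groups() (
    total=$1
    size=$2
    [ "$size" -ge 1 ] || { echo 'SIZE must be at least 1' >&2; exit 1; }
    group=1
    start=$size
    while [ $((start + size)) -le "$total" ]; do
        # Build the comma-separated ID list for this group.
        ids=$start
        n=$((start + 1))
        while [ "$n" -lt $((start + size)) ]; do
            ids="$ids,$n"
            n=$((n + 1))
        done
        printf 'sudo rbln-smi group -c %s -a %s\n' "$group" "$ids"
        group=$((group + 1))
        start=$((start + size))
    done
)

plan_rsd_groups 8 4
```

For 8 devices in groups of 4, the planner prints the single command shown in this step.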
Step 2: Verify Group Configuration
Run rbln-smi -g again to verify that the NPUs are properly grouped.
Your output should show the NPUs organized in separate groups:
| +-------------------------------------------------------------------------------------------------------+
| RSD Management Group KMD ver: 2.0.1 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU | Name | Device | PCI BUS ID | Temp | Power | Perf | Memory(used/total) | Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0 | 0 | RBLN-CA22 | rbln0 | 0000:1b:00.0 | 38C | 19.5W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 1 | RBLN-CA22 | rbln1 | 0000:1c:00.0 | 37C | 19.3W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 2 | RBLN-CA22 | rbln2 | 0000:1f:00.0 | 36C | 18.5W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 3 | RBLN-CA22 | rbln3 | 0000:22:00.0 | 39C | 19.6W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| 1 | 0 | RBLN-CA22 | rbln4 | 0000:41:00.0 | 36C | 17.9W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 1 | RBLN-CA22 | rbln5 | 0000:42:00.0 | 36C | 21.5W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 2 | RBLN-CA22 | rbln6 | 0000:45:00.0 | 40C | 20.6W | P14 | 0.0B / 15.7GiB | 0.0 |
| | 3 | RBLN-CA22 | rbln7 | 0000:48:00.0 | 36C | 17.9W | P14 | 0.0B / 15.7GiB | 0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
| RSD Context Information |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process | PID | CTX | Priority | PTID | Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
|
Notice that rbln4 to rbln7 are now in group 1, separate from rbln0 to rbln3 in group 0.
Step 3: Run Docker Containers with Multiple NPUs
Container 1: Assign rbln0 to rbln3 (Group 0)
| $ docker run \
--device /dev/rsd0 \
--device /dev/rbln0 --device /dev/rbln1 \
--device /dev/rbln2 --device /dev/rbln3 \
--device rebellions.ai/npu=runtime \
-ti IMAGE_NAME:TAG
|
Container 2: Assign rbln4 to rbln7 (Group 1)
| $ docker run \
--device /dev/rsd1:/dev/rsd0 \
--device /dev/rbln4:/dev/rbln0 --device /dev/rbln5:/dev/rbln1 \
--device /dev/rbln6:/dev/rbln2 --device /dev/rbln7:/dev/rbln3 \
--device rebellions.ai/npu=runtime \
-ti IMAGE_NAME:TAG
|
In this setup, each Docker container can utilize 4 RBLN NPUs:
- Container 1: Uses rbln0 to rbln3 through RSD group 0 (/dev/rsd0)
- Container 2: Uses rbln4 to rbln7 through RSD group 1 (/dev/rsd1)