NPU Allocation

RBLN NPUs can be used inside Docker containers for streamlined deployment and management. If a server has multiple RBLN NPUs, each container can be configured to use specific NPUs, enabling efficient resource allocation in a multi-NPU environment.

Rebellions Scalable Design (RSD)

Rebellions Scalable Design (RSD) is a technology that enables efficient management and utilization of multiple RBLN NPUs. RSD provides:

  • Device Grouping: NPUs are organized into groups for better resource management
  • Tensor Parallelism: Large language models can be distributed across multiple NPUs within the same group
  • Fault Isolation: Separate groups ensure that issues in one group don't affect others
  • Independent Scheduling: Each group can be managed independently for optimal performance

All RBLN NPUs, whether used individually or in multi-NPU configurations, operate through RSD. This means that even single-NPU containers benefit from RSD's management capabilities.

Note

The rbln-smi tool for managing and monitoring RBLN NPUs is included in the RBLN driver package. When using the RBLN Container Toolkit, rbln-smi is automatically available inside containers without manual mounting.

RBLN Container Toolkit

RBLN Container Toolkit simplifies container access to RBLN NPUs using the Container Device Interface (CDI) specification.

For installation, CLI reference, Kubernetes deployment, and troubleshooting, see the RBLN Container Toolkit Guide.

Quick Setup

After installing the toolkit, configure and run:

# Configure
$ sudo rbln-ctk cdi generate
$ sudo rbln-ctk runtime configure
$ sudo systemctl restart docker

# Run container with NPU access
$ docker run --device rebellions.ai/npu=runtime -it IMAGE_NAME:TAG

With the toolkit configured, all containers using --device rebellions.ai/npu=runtime will have RBLN libraries and tools (including rbln-smi) automatically available.

Container Creation Guidelines

This section summarizes common requirements and options when using RBLN NPUs with Docker.

RSD Group Management

Check RSD Group Configuration

$ rbln-smi -g
+-------------------------------------------------------------------------------------------------------+
|                                  RSD Management Group KMD ver: 2.0.1                                  |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU |    Name   | Device  |   PCI BUS ID  | Temp |  Power  | Perf |  Memory(used/total) |  Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0   | 0   | RBLN-CA22 | rbln0   |  0000:1b:00.0 |  37C |  19.5W  | P14  |    0.0B / 15.7GiB   |   0.0 |
|     | 1   | RBLN-CA22 | rbln1   |  0000:1c:00.0 |  36C |  19.2W  | P14  |    0.0B / 15.7GiB   |   0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
|                                        RSD Context Information                                        |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process             |     PID      |    CTX    | Priority | PTID |      Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A                 |     N/A      |    N/A    |   N/A    | N/A  |           N/A |  N/A   |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+

By default, all NPUs belong to group 0 (as shown in the Grp column). For container isolation and performance, give each container its own group.
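
If you want to script against this table, for example to list which device sits in which group, the output can be parsed with awk. The following is a minimal sketch, not an official tool: rsd_list_groups is a hypothetical helper name, and the parsing assumes the table layout shown above, which may change between driver versions.

```shell
# Print "group device" pairs from `rbln-smi -g` output read on stdin.
# Hypothetical helper; assumes the column layout shown in this guide.
rsd_list_groups() {
    awk -F'|' '
        # Device rows carry a name such as rbln0 in the Device column.
        $5 ~ /^ *rbln[0-9]+ *$/ {
            grp = $2; dev = $5
            gsub(/ /, "", grp); gsub(/ /, "", dev)
            # An empty Grp cell means the row continues the previous group.
            if (grp != "") current = grp
            print current, dev
        }
    '
}

# Demo on two rows copied from the sample output above.
rsd_list_groups <<'EOF'
| Grp | NPU |    Name   | Device  |   PCI BUS ID  |
| 0   | 0   | RBLN-CA22 | rbln0   |  0000:1b:00.0 |
|     | 1   | RBLN-CA22 | rbln1   |  0000:1c:00.0 |
EOF
```

On a real host the same function would be fed with `rbln-smi -g | rsd_list_groups`.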

Create an RSD Group

Create dedicated RSD groups for your containers:

$ sudo rbln-smi group -c <group_id> -a <device_id_list>
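
The device list is comma-separated, as in `-a 4,5,6,7`. As an illustration only, the sketch below assembles that command line from a group ID and a list of device IDs and prints it instead of running it. The name rsd_group_create_cmd is hypothetical and not part of rbln-smi.

```shell
# Print (do not run) the group-creation command for a group ID and device IDs.
# Hypothetical helper, shown only to illustrate the argument format.
rsd_group_create_cmd() (
    if [ "$#" -lt 2 ]; then
        echo "usage: rsd_group_create_cmd <group_id> <device_id>..." >&2
        exit 1
    fi
    group_id=$1
    shift
    # Join the remaining arguments with commas: 4 5 6 7 -> 4,5,6,7
    IFS=,
    printf 'sudo rbln-smi group -c %s -a %s\n' "$group_id" "$*"
)

rsd_group_create_cmd 1 4 5 6 7
```

Printing the command first makes it easy to review the group layout before changing it on the host.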

Delete an RSD Group

You can destroy a group you created with the following command:

$ sudo rbln-smi group -d <group_id>

Accessing NPUs in Containers

With the RBLN Container Toolkit installed and configured, you can run containers with full NPU access:

$ docker run --device rebellions.ai/npu=runtime -ti IMAGE_NAME:TAG

This single flag provides:

  • Access to all RBLN NPU devices
  • RBLN libraries automatically mounted
  • rbln-smi and other tools available

For allocating specific NPUs to containers (using RSD groups), you need to specify devices explicitly:

$ docker run --device /dev/rsdX:/dev/rsd0 \
  --device /dev/rblnX:/dev/rbln0 \
  --device rebellions.ai/npu=runtime \
  -ti IMAGE_NAME:TAG

The purpose of each option is as follows:

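  • --device /dev/rsdX:/dev/rsd0
    • This exposes the RSD group interface to the container. X is the group ID on the host; inside the container the group appears as /dev/rsd0.
  • --device /dev/rblnX:/dev/rbln0
    • This exposes an individual RBLN NPU device to the container. X is the device index on the host (for example, /dev/rbln1 for rbln1). Repeat the flag for each NPU in the group.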
  • --device rebellions.ai/npu=runtime
    • This mounts RBLN libraries and tools via CDI.
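
The three options can be combined mechanically. The sketch below prints (without running) a docker run command for one RSD group, renumbering the host devices from 0 inside the container, as the examples in this guide do. rsd_docker_cmd is a hypothetical helper name, and IMAGE_NAME:TAG remains a placeholder.

```shell
# Print (do not run) a docker run command that maps one RSD group and its
# NPUs into a container. Hypothetical helper for illustration only.
rsd_docker_cmd() (
    group_id=$1
    shift
    # The host group /dev/rsd<group_id> appears as /dev/rsd0 in the container.
    cmd="docker run --device /dev/rsd${group_id}:/dev/rsd0"
    idx=0
    for dev in "$@"; do
        # Host /dev/rbln<dev> becomes /dev/rbln<idx> in the container.
        cmd="$cmd --device /dev/rbln${dev}:/dev/rbln${idx}"
        idx=$((idx + 1))
    done
    printf '%s\n' "$cmd --device rebellions.ai/npu=runtime -ti IMAGE_NAME:TAG"
)

rsd_docker_cmd 1 1
```

For a multi-NPU group, pass every device index, for example `rsd_docker_cmd 1 4 5 6 7`.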

Allocate a Single RBLN NPU

This section demonstrates how to configure a Docker container to use a single RBLN NPU with proper RSD group isolation.

Important

Before proceeding, read the Container Creation Guidelines above to understand the common requirements and options referenced in this step-by-step guide.

Step 1: Create an RSD Group

Create a separate RSD group for your single-NPU container:

$ sudo rbln-smi group -c 1 -a 1

This command creates group 1 and assigns rbln1 to it, removing it from the default group 0.

Step 2: Verify RSD Group Configuration

Verify that the NPU is now in its dedicated group:

$ rbln-smi -g

Your output should show the NPU in its own group:

+-------------------------------------------------------------------------------------------------------+
|                                  RSD Management Group KMD ver: 2.0.1                                  |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU |    Name   | Device  |   PCI BUS ID  | Temp |  Power  | Perf |  Memory(used/total) |  Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0   | 0   | RBLN-CA22 | rbln0   |  0000:1b:00.0 |  35C |  18.4W  | P14  |    0.0B / 15.7GiB   |   0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| 1   | 0   | RBLN-CA22 | rbln1   |  0000:1c:00.0 |  34C |  18.2W  | P14  |    0.0B / 15.7GiB   |   0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
|                                        RSD Context Information                                        |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process             |     PID      |    CTX    | Priority | PTID |      Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A                 |     N/A      |    N/A    |   N/A    | N/A  |           N/A |  N/A   |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+

Notice that rbln1 is now in group 1, separate from rbln0 in group 0.

Step 3: Run Docker Container with Single NPU

Run a Docker container with access to the dedicated NPU:

$ docker run --device /dev/rsd1:/dev/rsd0 \
    --device /dev/rbln1:/dev/rbln0 \
    --device rebellions.ai/npu=runtime \
    -ti IMAGE_NAME:TAG

The container now accesses only rbln1, through RSD group 1, which keeps it isolated from workloads running in other groups.

Allocate Multiple RBLN NPUs

This section demonstrates how to configure Docker containers to use multiple RBLN NPUs. Multiple NPUs can be used for large language models that leverage tensor parallelism.

A list of models that support RSD can be found in Optimum RBLN Multi-NPU Supported Models.

Container Allocation Strategy

You can create multiple containers, each with a dedicated NPU group. For example, with 8 RBLN NPUs, you can create 2 containers, each using 4 NPUs assigned to its own RSD group.
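
The split is simple arithmetic. As a rough sketch, the hypothetical helper below prints one line per group for a given total and group size. It assumes contiguous device indices starting at 0 and an NPU count that divides evenly, which matches the 8-NPU example but may not match every server.

```shell
# Print a group plan for <total> NPUs split into groups of <per_group>.
# Hypothetical helper; assumes contiguous device indices starting at 0.
rsd_plan_groups() (
    total=$1
    per_group=$2
    if [ "$per_group" -le 0 ] || [ "$total" -le 0 ] ||
       [ $((total % per_group)) -ne 0 ]; then
        echo "total must be a positive multiple of per_group" >&2
        exit 1
    fi
    group=0
    dev=0
    while [ "$dev" -lt "$total" ]; do
        line="group $group:"
        end=$((dev + per_group))
        # Append the next per_group devices to this group's line.
        while [ "$dev" -lt "$end" ]; do
            line="$line rbln$dev"
            dev=$((dev + 1))
        done
        printf '%s\n' "$line"
        group=$((group + 1))
    done
)

rsd_plan_groups 8 4
```

With 8 NPUs and 4 per group, this prints a plan with rbln0 to rbln3 in group 0 and rbln4 to rbln7 in group 1.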

Important

Before proceeding, read the Container Creation Guidelines above to understand the common requirements and options referenced in this step-by-step guide.

Note

A single RBLN-CA25 NPU card comprises 4 devices. Check the device list with rbln-smi and allocate all 4 devices to the same container.

Step 1: Create an RSD Group

Ensure all NPUs that will work together in a single container are in the same RSD group.

For example, if you have 8 devices and want to create a container with 4 NPUs:

$ sudo rbln-smi group -c 1 -a 4,5,6,7

Step 2: Verify Group Configuration

Verify that the NPUs are properly grouped:

$ rbln-smi -g

Your output should show the NPUs organized in separate groups:

+-------------------------------------------------------------------------------------------------------+
|                                  RSD Management Group KMD ver: 2.0.1                                  |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| Grp | NPU |    Name   | Device  |   PCI BUS ID  | Temp |  Power  | Perf |  Memory(used/total) |  Util |
+=====+=====+===========+=========+===============+======+=========+======+=====================+=======+
| 0   | 0   | RBLN-CA22 | rbln0   |  0000:1b:00.0 |  38C |  19.5W  | P14  |    0.0B / 15.7GiB   |   0.0 |
|     | 1   | RBLN-CA22 | rbln1   |  0000:1c:00.0 |  37C |  19.3W  | P14  |    0.0B / 15.7GiB   |   0.0 |
|     | 2   | RBLN-CA22 | rbln2   |  0000:1f:00.0 |  36C |  18.5W  | P14  |    0.0B / 15.7GiB   |   0.0 |
|     | 3   | RBLN-CA22 | rbln3   |  0000:22:00.0 |  39C |  19.6W  | P14  |    0.0B / 15.7GiB   |   0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
| 1   | 0   | RBLN-CA22 | rbln4   |  0000:41:00.0 |  36C |  17.9W  | P14  |    0.0B / 15.7GiB   |   0.0 |
|     | 1   | RBLN-CA22 | rbln5   |  0000:42:00.0 |  36C |  21.5W  | P14  |    0.0B / 15.7GiB   |   0.0 |
|     | 2   | RBLN-CA22 | rbln6   |  0000:45:00.0 |  40C |  20.6W  | P14  |    0.0B / 15.7GiB   |   0.0 |
|     | 3   | RBLN-CA22 | rbln7   |  0000:48:00.0 |  36C |  17.9W  | P14  |    0.0B / 15.7GiB   |   0.0 |
+-----+-----+-----------+---------+---------------+------+---------+------+---------------------+-------+
+-------------------------------------------------------------------------------------------------------+
|                                        RSD Context Information                                        |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+
| Grp | NPU | Process             |     PID      |    CTX    | Priority | PTID |      Memalloc | Status |
+=====+=====+=====================+==============+===========+==========+======+===============+========+
| N/A | N/A | N/A                 |     N/A      |    N/A    |   N/A    | N/A  |           N/A |  N/A   |
+-----+-----+---------------------+--------------+-----------+----------+------+---------------+--------+

Notice that rbln4~rbln7 are now in group 1, separate from rbln0~rbln3 in group 0.
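
This check can also be automated before launching a container. The sketch below reads `rbln-smi -g` output on stdin and succeeds only if every listed device is in the expected group. rsd_devices_in_group is a hypothetical helper, and the parsing assumes the table layout shown above.

```shell
# Succeed only if every listed device index is in the given group, according
# to `rbln-smi -g` output on stdin. Hypothetical helper for illustration.
rsd_devices_in_group() (
    group_id=$1
    shift
    table=$(cat)
    for dev in "$@"; do
        # Look up the group of rbln<dev>; empty Grp cells inherit the group
        # of the row above them.
        found=$(printf '%s\n' "$table" | awk -F'|' -v want="rbln$dev" '
            $5 ~ /^ *rbln[0-9]+ *$/ {
                grp = $2; name = $5
                gsub(/ /, "", grp); gsub(/ /, "", name)
                if (grp != "") current = grp
                if (name == want) print current
            }')
        if [ "$found" != "$group_id" ]; then
            echo "rbln$dev is not in group $group_id" >&2
            exit 1
        fi
    done
)

# Example use on a host with the driver installed:
#   rbln-smi -g | rsd_devices_in_group 1 4 5 6 7 && echo "group 1 is ready"
```

Failing early here avoids starting a container whose NPUs are spread across different groups.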

Step 3: Run Docker Containers with Multiple NPUs

Container 1: Assign rbln0~rbln3 (Group 0)

$ docker run \
    --device /dev/rsd0 \
    --device /dev/rbln0 --device /dev/rbln1 \
    --device /dev/rbln2 --device /dev/rbln3 \
    --device rebellions.ai/npu=runtime \
    -ti IMAGE_NAME:TAG

Container 2: Assign rbln4~rbln7 (Group 1)

$ docker run \
    --device /dev/rsd1:/dev/rsd0 \
    --device /dev/rbln4:/dev/rbln0 --device /dev/rbln5:/dev/rbln1 \
    --device /dev/rbln6:/dev/rbln2 --device /dev/rbln7:/dev/rbln3 \
    --device rebellions.ai/npu=runtime \
    -ti IMAGE_NAME:TAG

In this setup, each Docker container can utilize 4 RBLN NPUs:

  • Container 1: Uses rbln0~rbln3 through RSD group 0 (/dev/rsd0)
  • Container 2: Uses rbln4~rbln7 through RSD group 1 (/dev/rsd1)