Docker Support¶
The RBLN NPUs can be utilized within a Docker container for streamlined deployment and management. If a server has multiple RBLN NPUs, each container can be configured to use a specific NPU, enabling efficient resource allocation in a multi-NPU environment.
Allocate a single RBLN NPU¶
When creating a Docker container, you can mount the device file of an RBLN NPU to make that NPU available inside the container.
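For example, the following is a minimal sketch of passing a single NPU's device file to a container with Docker's `--device` option. The image name `rbln-runtime:latest` and the interactive shell command are placeholder assumptions; substitute your own image and workload.

```bash
# Expose a single RBLN NPU (/dev/rbln0) to the container.
# "rbln-runtime:latest" is a placeholder image name; replace it with your own image.
docker run -it \
  --device /dev/rbln0 \
  rbln-runtime:latest \
  /bin/bash
```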
Allocate multiple RBLN NPUs¶
When using multiple RBLN NPUs, you can mount as many as needed when creating the Docker container. For example, if you have 8 RBLN NPUs, you can create 2 containers, each mounting 4 RBLN NPUs, as shown below.
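The sketch below shows one way to launch the two containers with Docker's `--device` option; the image name `rbln-runtime:latest` and the container names are placeholder assumptions.

```bash
# First container: mounts /dev/rbln0 through /dev/rbln3.
# Image and container names are placeholders; replace them with your own.
docker run -it --name rbln-container-0 \
  --device /dev/rbln0 --device /dev/rbln1 \
  --device /dev/rbln2 --device /dev/rbln3 \
  rbln-runtime:latest /bin/bash

# Second container: mounts /dev/rbln4 through /dev/rbln7.
docker run -it --name rbln-container-1 \
  --device /dev/rbln4 --device /dev/rbln5 \
  --device /dev/rbln6 --device /dev/rbln7 \
  rbln-runtime:latest /bin/bash
```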
In this setup, each Docker container can utilize 4 RBLN NPUs (/dev/rbln0, /dev/rbln1, /dev/rbln2, and /dev/rbln3 for the first container, and /dev/rbln4 through /dev/rbln7 for the second).
Allocate multiple RBLN NPUs for RSD support¶
Through Rebellions Scalable Design (RSD), large language models can be effectively tensor-parallelized and run across multiple NPUs. A list of models that support RSD can be found in Optimum RBLN Multi-NPU Supported Models.
When using RSD, the container execution command must additionally mount the /dev/rsd0 device file, as shown below.
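A minimal sketch of such a command, again with a placeholder image name, might look like this:

```bash
# Mount four RBLN NPUs plus the RSD device file (/dev/rsd0).
# "rbln-runtime:latest" is a placeholder image name.
docker run -it \
  --device /dev/rbln0 --device /dev/rbln1 \
  --device /dev/rbln2 --device /dev/rbln3 \
  --device /dev/rsd0 \
  rbln-runtime:latest /bin/bash
```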
Multi-Container Configuration Utilizing RSD¶
The /dev/rsd0 device file must be mounted in all Docker containers utilizing RSD.
For instance, on a server with 8 RBLN NPUs, if you create two Docker containers that each utilize RSD with 4 RBLN NPUs, both the first and the second containers should be created with /dev/rsd0 mounted, as demonstrated below.
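The following sketch illustrates this setup; as before, the image name and container names are placeholder assumptions.

```bash
# First RSD container: /dev/rbln0 through /dev/rbln3 plus /dev/rsd0.
# Image and container names are placeholders; replace them with your own.
docker run -it --name rsd-container-0 \
  --device /dev/rbln0 --device /dev/rbln1 \
  --device /dev/rbln2 --device /dev/rbln3 \
  --device /dev/rsd0 \
  rbln-runtime:latest /bin/bash

# Second RSD container: /dev/rbln4 through /dev/rbln7 plus the same /dev/rsd0.
docker run -it --name rsd-container-1 \
  --device /dev/rbln4 --device /dev/rbln5 \
  --device /dev/rbln6 --device /dev/rbln7 \
  --device /dev/rsd0 \
  rbln-runtime:latest /bin/bash
```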