Model APIs Overview¶
The Optimum RBLN library provides APIs for running HuggingFace models on RBLN NPUs. This overview focuses on the API design and usage patterns.
API Design¶
Naming Convention¶
Optimum RBLN follows a simple naming pattern:
- Model Classes: Add "RBLN" prefix to original HuggingFace class names
- Configuration Classes: Add "Config" suffix to the RBLN model class
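The naming rule can be expressed as two small string helpers. This is only an illustrative sketch: the helper functions are hypothetical and not part of Optimum RBLN, and a derived name is only useful if the library actually ships a class for that model.

```python
def rbln_model_class_name(hf_class_name: str) -> str:
    """Derive the RBLN model class name by adding the "RBLN" prefix."""
    return f"RBLN{hf_class_name}"


def rbln_config_class_name(hf_class_name: str) -> str:
    """Derive the configuration class name by adding the "Config" suffix
    to the RBLN model class name."""
    return f"{rbln_model_class_name(hf_class_name)}Config"


# Example with a HuggingFace class name used purely for illustration.
print(rbln_model_class_name("LlamaForCausalLM"))
print(rbln_config_class_name("LlamaForCausalLM"))
```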
Static Graph Compilation¶
The RBLN SDK works with static computational graphs, so certain parameters must be specified at compile time. These include:
- Batch size
- Sequence length (for language models)
- Tensor parallel size (for distributed execution)
- Other model-specific parameters
The configuration classes (RBLNModelNameConfig) allow you to specify these static values.
Supported Methods¶
Optimum RBLN preserves the same API interface as the original HuggingFace models:
- Language models support the .generate() method
- Vision and other models support both direct call syntax (model(inputs)) and the .forward() method
- Diffusers pipelines support the __call__() method (used with direct call syntax)
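As a rough sketch of the language-model case, the function below loads an already compiled model and calls .generate() the same way one would with the original HuggingFace class. It assumes that optimum-rbln and transformers are installed, that an RBLN NPU is available, and that the hypothetical directory passed as compiled_model_dir holds both the compiled model and its tokenizer files. The Llama class is used only as an example.

```python
def generate_text(compiled_model_dir: str, prompt: str, max_new_tokens: int = 32) -> str:
    """Generate a continuation of `prompt` with a precompiled RBLN model."""
    # Imported lazily so this module can be imported on machines
    # without optimum-rbln or RBLN hardware.
    from transformers import AutoTokenizer
    from optimum.rbln import RBLNLlamaForCausalLM

    # Assumption: the tokenizer was saved alongside the compiled model.
    tokenizer = AutoTokenizer.from_pretrained(compiled_model_dir)
    # export=False loads a previously compiled model instead of recompiling.
    model = RBLNLlamaForCausalLM.from_pretrained(compiled_model_dir, export=False)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Vision models and Diffusers pipelines would be used analogously through model(inputs) or the pipeline's direct call syntax.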
Usage Patterns¶
There are multiple ways to configure and use RBLN models:
Method 1: Using rbln_* parameters¶
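A minimal sketch of this pattern, assuming a Llama model and illustrative values: each static parameter is passed to from_pretrained() as a keyword argument with the rbln_ prefix. The model ID is supplied by the caller, the numbers are placeholders, and the exact set of accepted parameters should be checked against the configuration class for your model.

```python
# Static values fixed at compile time. The numbers are illustrative only.
STATIC_PARAMS = {
    "rbln_batch_size": 1,
    "rbln_max_seq_len": 4096,
    "rbln_tensor_parallel_size": 4,
}


def compile_llama(model_id: str):
    """Compile a HuggingFace Llama checkpoint for RBLN NPU using rbln_* kwargs."""
    # Imported lazily: requires optimum-rbln and the RBLN SDK.
    from optimum.rbln import RBLNLlamaForCausalLM

    return RBLNLlamaForCausalLM.from_pretrained(
        model_id,
        export=True,  # compile the model rather than load a compiled one
        **STATIC_PARAMS,
    )
```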
Method 2: Using RBLNConfig object (recommended)¶
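A minimal sketch of this pattern, again assuming a Llama model and illustrative values: the static parameters are collected in a configuration object (here RBLNLlamaForCausalLMConfig, following the naming convention) and passed through the rbln_config argument. Field names are written without the rbln_ prefix. Treat the values as placeholders and confirm the available fields in the configuration class reference.

```python
# Static values fixed at compile time. The numbers are illustrative only.
CONFIG_FIELDS = {
    "batch_size": 1,
    "max_seq_len": 4096,
    "tensor_parallel_size": 4,
}


def compile_llama_with_config(model_id: str):
    """Compile a HuggingFace Llama checkpoint for RBLN NPU using a config object."""
    # Imported lazily: requires optimum-rbln and the RBLN SDK.
    from optimum.rbln import RBLNLlamaForCausalLM, RBLNLlamaForCausalLMConfig

    config = RBLNLlamaForCausalLMConfig(**CONFIG_FIELDS)
    return RBLNLlamaForCausalLM.from_pretrained(
        model_id,
        export=True,  # compile the model rather than load a compiled one
        rbln_config=config,
    )
```

Keeping the static values in one object makes them easier to reuse and inspect than a set of loose keyword arguments.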
Benefits¶
- Drop-in Replacement: Replace original HuggingFace imports with RBLN equivalents
- Same Familiar API: Use the same methods you're already familiar with
- Fine-grained Control: Configure static parameters for maximum performance
For detailed model support information and hardware compatibility, please refer to the Optimum RBLN Overview.