
Optimum RBLN

Optimum RBLN serves as a bridge connecting the HuggingFace transformers/diffusers libraries to RBLN NPUs, i.e., ATOM (RBLN-CA02) and ATOM+ (RBLN-CA12). It offers a set of tools that enable easy model compilation and inference for both single-NPU and multi-NPU (Rebellions Scalable Design) configurations across a range of downstream tasks. The following tables present the comprehensive lineup of models currently supported by Optimum RBLN.
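As a rough sketch of the compilation workflow (the `RBLN`-prefixed class name follows optimum-rbln's naming convention; the checkpoint, the `rbln_max_seq_len` value, and the output directory below are illustrative assumptions, not requirements):

```python
from optimum.rbln import RBLNGPT2LMHeadModel

# Compile the Hugging Face checkpoint for a single RBLN NPU.
# export=True triggers compilation; rbln_max_seq_len sets the static
# sequence length used by the compiled graph (value here is illustrative).
model = RBLNGPT2LMHeadModel.from_pretrained(
    "gpt2",
    export=True,
    rbln_max_seq_len=1024,
)

# Save the compiled artifacts so later runs can skip recompilation.
model.save_pretrained("gpt2-rbln")
```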

Transformers

Single NPU

| Model | Model Architecture | Task |
| :---: | :-----------------------------: | :-----: |
| Phi-2 | PhiForCausalLM | |
| Gemma-2b | GemmaForCausalLM | |
| GPT2 | GPT2LMHeadModel | |
| GPT2-medium | GPT2LMHeadModel | |
| GPT2-large | GPT2LMHeadModel | |
| GPT2-xl | GPT2LMHeadModel | |
| T5-small | T5ForConditionalGeneration | |
| T5-base | T5ForConditionalGeneration | |
| T5-large | T5ForConditionalGeneration | |
| T5-3b | T5ForConditionalGeneration | |
| BART-base | BartForConditionalGeneration | |
| BART-large | BartForConditionalGeneration | |
| KoBART-base | BartForConditionalGeneration | |
| E5-base-4K | BertModel | |
| LaBSE | BertModel | |
| KR-SBERT-V40K-klueNLI-augSTS | BertModel | |
| BERT-base | BertForMaskedLM<br>BertForQuestionAnswering | |
| BERT-large | BertForMaskedLM<br>BertForQuestionAnswering | |
| DistilBERT-base | DistilBertForQuestionAnswering | |
| SecureBERT | RobertaForMaskedLM | |
| RoBERTa | RobertaForSequenceClassification | |
| BGE-M3 | XLMRobertaModel | |
| BGE-Reranker-V2-M3 | XLMRobertaForSequenceClassification | |
| BGE-Reranker-Base | XLMRobertaForSequenceClassification | |
| BGE-Reranker-Large | XLMRobertaForSequenceClassification | |
| Ko-Reranker | XLMRobertaForSequenceClassification | |
| Whisper-tiny | WhisperForConditionalGeneration | |
| Whisper-base | WhisperForConditionalGeneration | |
| Whisper-small | WhisperForConditionalGeneration | |
| Whisper-medium | WhisperForConditionalGeneration | |
| Whisper-large-v3 | WhisperForConditionalGeneration | |
| Whisper-large-v3-turbo | WhisperForConditionalGeneration | |
| Wav2Vec2 | Wav2Vec2ForCTC | |
| Audio-Spectrogram-Transformer | ASTForAudioClassification | |
| DPT-large | DPTForDepthEstimation | |
| ViT-large | ViTForImageClassification | |
| ResNet50 | ResNetForImageClassification | |
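Once a single-NPU model has been compiled, inference goes through the familiar transformers API. A minimal sketch that reuses the `gpt2-rbln` directory saved in the example above (the prompt and generation arguments are illustrative):

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNGPT2LMHeadModel

# Reload the compiled artifacts; export=False skips recompilation
# when loading an already-compiled directory.
model = RBLNGPT2LMHeadModel.from_pretrained("gpt2-rbln", export=False)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

inputs = tokenizer("Rebellions ATOM is", return_tensors="pt")
# generate() runs on the RBLN NPU through the standard transformers interface.
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```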

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the `rbln-stat` command.

| Model | Model Architecture | Recommended # of NPUs | Task |
| :---: | :-----------------------------: | :-----: | :-----: |
| DeepSeek-R1-Distill-Llama-8b | LlamaForCausalLM | 8 | |
| DeepSeek-R1-Distill-Llama-70b | LlamaForCausalLM | 16 | |
| DeepSeek-R1-Distill-Qwen-1.5b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-7b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-14b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-32b | Qwen2ForCausalLM | 16 | |
| Llama3.3-70b | LlamaForCausalLM | 16 | |
| Llama3.2-3b | LlamaForCausalLM | 8 | |
| Llama3.1-70b | LlamaForCausalLM | 16 | |
| Llama3.1-8b | LlamaForCausalLM | 8 | |
| Llama3-8b | LlamaForCausalLM | 4 | |
| Llama3-8b + LoRA | LlamaForCausalLM | 4 | |
| Llama2-7b | LlamaForCausalLM | 4 | |
| Llama2-13b | LlamaForCausalLM | 8 | |
| Gemma-7b | GemmaForCausalLM | 4 | |
| Mistral-7b | MistralForCausalLM | 4 | |
| Qwen2-7b | Qwen2ForCausalLM | 4 | |
| Qwen2.5-7b | Qwen2ForCausalLM | 4 | |
| Qwen2.5-14b | Qwen2ForCausalLM | 8 | |
| Salamandra-7b | LlamaForCausalLM | 4 | |
| KONI-Llama3.1-8b | LlamaForCausalLM | 8 | |
| EXAONE-3.0-7.8b | ExaoneForCausalLM | 4 | |
| EXAONE-3.5-2.4b | ExaoneForCausalLM | 4 | |
| EXAONE-3.5-7.8b | ExaoneForCausalLM | 8 | |
| Mi:dm-7b | MidmLMHeadModel | 4 | |
| SOLAR-10.7b | LlamaForCausalLM | 8 | |
| EEVE-Korean-10.8b | LlamaForCausalLM | 8 | |
| Llava-v1.6-mistral-7b | LlavaNextForConditionalGeneration | 4 | |
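For RSD models, the tensor parallel degree passed at compile time should match the recommended number of NPUs in the table above. A minimal sketch for Llama3.1-8b (the Hugging Face model ID, batch size, sequence length, and output directory are illustrative assumptions):

```python
from optimum.rbln import RBLNLlamaForCausalLM

# Compile Llama3.1-8b for Rebellions Scalable Design (RSD).
# rbln_tensor_parallel_size matches the 8 NPUs recommended in the table.
model = RBLNLlamaForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",  # illustrative model ID
    export=True,
    rbln_batch_size=1,                   # illustrative compile-time settings
    rbln_max_seq_len=8192,
    rbln_tensor_parallel_size=8,
)
model.save_pretrained("llama3.1-8b-rbln")
```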

Diffusers

Note

Models marked with a superscript require more than one ATOM because their weights exceed the memory capacity of a single ATOM, so the model's modules must be divided across multiple ATOMs. For detailed information regarding the specific module distribution, please refer to the model code.

| Model | Model Architecture | Task |
| :---: | :-----------------------------: | :-----: |
| Stable Diffusion | StableDiffusionPipeline<br>StableDiffusionImg2ImgPipeline<br>StableDiffusionInpaintPipeline | |
| Stable Diffusion + LoRA | StableDiffusionPipeline | |
| Stable Diffusion V3 | StableDiffusion3Pipeline<br>StableDiffusion3Img2ImgPipeline<br>StableDiffusion3InpaintPipeline | |
| Stable Diffusion XL | StableDiffusionXLPipeline<br>StableDiffusionXLImg2ImgPipeline<br>StableDiffusionXLInpaintPipeline | |
| Stable Diffusion XL + multi-LoRA | StableDiffusionXLPipeline | |
| SDXL-turbo | StableDiffusionXLPipeline<br>StableDiffusionXLImg2ImgPipeline | |
| Stable Diffusion + ControlNet | StableDiffusionControlNetPipeline<br>StableDiffusionControlNetImg2ImgPipeline | |
| Stable Diffusion XL + ControlNet | StableDiffusionXLControlNetPipeline<br>StableDiffusionXLControlNetImg2ImgPipeline | |
| Kandinsky V2.2 | KandinskyV22InpaintCombinedPipeline<br>KandinskyV22PriorPipeline<br>KandinskyV22InpaintPipeline | |
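Compiled diffusion pipelines keep the usual diffusers calling convention. A minimal sketch for the basic Stable Diffusion text-to-image pipeline (the model ID, prompt, and file names are illustrative assumptions):

```python
from optimum.rbln import RBLNStableDiffusionPipeline

# Compile the Stable Diffusion pipeline (text encoder, UNet, VAE) for RBLN NPUs.
pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative model ID
    export=True,
)
pipe.save_pretrained("sd-v1-5-rbln")

# Reload the compiled pipeline and generate an image with the usual diffusers API.
pipe = RBLNStableDiffusionPipeline.from_pretrained("sd-v1-5-rbln", export=False)
image = pipe("A photo of a robot reading a book").images[0]
image.save("result.png")
```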