
Optimum RBLN

Optimum RBLN is a bridge between the Hugging Face transformers/diffusers libraries and RBLN NPUs, namely ATOM (RBLN-CA02) and ATOM+ (RBLN-CA12). It provides a set of tools for easy model compilation and inference on both single-NPU and multi-NPU (Rebellions Scalable Design) configurations across a range of downstream tasks. The following tables list the models currently supported by Optimum RBLN.
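
Compiling and running a supported model follows the standard Hugging Face workflow. The sketch below is a minimal single-NPU example, assuming the RBLN<Architecture> class naming used in the tables below (e.g. RBLNGPT2LMHeadModel) and the export=True compilation flag; consult the Optimum RBLN API reference for the exact signatures.

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNGPT2LMHeadModel  # RBLN-prefixed wrapper around GPT2LMHeadModel

# export=True compiles the Hugging Face checkpoint for an RBLN NPU;
# the compiled artifacts can be saved and reloaded without recompiling.
model = RBLNGPT2LMHeadModel.from_pretrained("gpt2", export=True)
model.save_pretrained("gpt2-rbln")

# Inference uses the familiar transformers generate() API.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Rebellions NPUs are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```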

Transformers

Single NPU

| Model | Model Architecture | Task |
|-------|--------------------|------|
| Phi-2 | PhiForCausalLM | text-generation |
| Gemma-2b | GemmaForCausalLM | text-generation |
| GPT2 | GPT2LMHeadModel | text-generation |
| GPT2-medium | GPT2LMHeadModel | text-generation |
| GPT2-large | GPT2LMHeadModel | text-generation |
| GPT2-xl | GPT2LMHeadModel | text-generation |
| T5-small | T5ForConditionalGeneration | text2text-generation |
| T5-base | T5ForConditionalGeneration | text2text-generation |
| T5-large | T5ForConditionalGeneration | text2text-generation |
| T5-3b | T5ForConditionalGeneration | text2text-generation |
| BART-base | BartForConditionalGeneration | text2text-generation |
| BART-large | BartForConditionalGeneration | text2text-generation |
| KoBART-base | BartForConditionalGeneration | text2text-generation |
| E5-base-4K | BertModel | feature-extraction |
| LaBSE | BertModel | feature-extraction |
| KR-SBERT-V40K-klueNLI-augSTS | BertModel | feature-extraction |
| BERT-base | BertForMaskedLM, BertForQuestionAnswering | fill-mask, question-answering |
| BERT-large | BertForMaskedLM, BertForQuestionAnswering | fill-mask, question-answering |
| DistilBERT-base | DistilBertForQuestionAnswering | question-answering |
| SecureBERT | RobertaForMaskedLM | fill-mask |
| RoBERTa | RobertaForSequenceClassification | text-classification |
| BGE-Small-EN-v1.5 | RBLNBertModel | feature-extraction |
| BGE-Base-EN-v1.5 | RBLNBertModel | feature-extraction |
| BGE-Large-EN-v1.5 | RBLNBertModel | feature-extraction |
| BGE-M3 | XLMRobertaModel | feature-extraction |
| BGE-Reranker-V2-M3 | XLMRobertaForSequenceClassification | text-classification |
| BGE-Reranker-Base | XLMRobertaForSequenceClassification | text-classification |
| BGE-Reranker-Large | XLMRobertaForSequenceClassification | text-classification |
| Ko-Reranker | XLMRobertaForSequenceClassification | text-classification |
| Whisper-tiny | WhisperForConditionalGeneration | automatic-speech-recognition |
| Whisper-base | WhisperForConditionalGeneration | automatic-speech-recognition |
| Whisper-small | WhisperForConditionalGeneration | automatic-speech-recognition |
| Whisper-medium | WhisperForConditionalGeneration | automatic-speech-recognition |
| Whisper-large-v3 | WhisperForConditionalGeneration | automatic-speech-recognition |
| Whisper-large-v3-turbo | WhisperForConditionalGeneration | automatic-speech-recognition |
| Wav2Vec2 | Wav2Vec2ForCTC | automatic-speech-recognition |
| Audio Spectrogram Transformer | ASTForAudioClassification | audio-classification |
| DPT-large | DPTForDepthEstimation | depth-estimation |
| ViT-large | ViTForImageClassification | image-classification |
| ResNet50 | ResNetForImageClassification | image-classification |
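
Encoder models from the table above follow the same compile-then-infer pattern. Below is a minimal sketch for one of the BGE embedding checkpoints; it assumes the RBLNBertModel class listed above returns transformers-style outputs and that rbln_max_seq_len is the compile-time option that fixes the static sequence length (an assumption; check the API reference).

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNBertModel  # architecture class listed for the BGE models above

model_id = "BAAI/bge-base-en-v1.5"

# Compile the encoder for a single NPU. rbln_max_seq_len is assumed to set the
# static sequence length baked into the compiled graph.
model = RBLNBertModel.from_pretrained(model_id, export=True, rbln_max_seq_len=512)

tokenizer = AutoTokenizer.from_pretrained(model_id)
batch = tokenizer(
    ["What is an RBLN NPU?"],
    padding="max_length",
    max_length=512,
    return_tensors="pt",
)
embeddings = model(**batch).last_hidden_state[:, 0]  # CLS-token embedding, as with BertModel
print(embeddings.shape)
```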

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.

| Model | Model Architecture | Recommended # of NPUs | Task |
|-------|--------------------|-----------------------|------|
| DeepSeek-R1-Distill-Llama-8b | LlamaForCausalLM | 8 | text-generation |
| DeepSeek-R1-Distill-Llama-70b | LlamaForCausalLM | 16 | text-generation |
| DeepSeek-R1-Distill-Qwen-1.5b | Qwen2ForCausalLM | 8 | text-generation |
| DeepSeek-R1-Distill-Qwen-7b | Qwen2ForCausalLM | 8 | text-generation |
| DeepSeek-R1-Distill-Qwen-14b | Qwen2ForCausalLM | 8 | text-generation |
| DeepSeek-R1-Distill-Qwen-32b | Qwen2ForCausalLM | 16 | text-generation |
| Llama3.3-70b | LlamaForCausalLM | 16 | text-generation |
| Llama3.2-3b | LlamaForCausalLM | 8 | text-generation |
| Llama3.1-70b | LlamaForCausalLM | 16 | text-generation |
| Llama3.1-8b | LlamaForCausalLM | 8 | text-generation |
| Llama3-8b | LlamaForCausalLM | 4 or 8 | text-generation |
| Llama3-8b + LoRA | LlamaForCausalLM | 4 or 8 | text-generation |
| Llama2-7b | LlamaForCausalLM | 4 or 8 | text-generation |
| Llama2-13b | LlamaForCausalLM | 4 or 8 | text-generation |
| Gemma-7b | GemmaForCausalLM | 4 or 8 | text-generation |
| Mistral-7b | MistralForCausalLM | 4 or 8 | text-generation |
| Qwen2-7b | Qwen2ForCausalLM | 4 or 8 | text-generation |
| Qwen2.5-7b | Qwen2ForCausalLM | 4 or 8 | text-generation |
| Qwen2.5-14b | Qwen2ForCausalLM | 4 or 8 | text-generation |
| Salamandra-7b | LlamaForCausalLM | 4 or 8 | text-generation |
| KONI-Llama3.1-8b | LlamaForCausalLM | 8 | text-generation |
| EXAONE-3.0-7.8b | ExaoneForCausalLM | 4 or 8 | text-generation |
| EXAONE-3.5-2.4b | ExaoneForCausalLM | 4 | text-generation |
| EXAONE-3.5-7.8b | ExaoneForCausalLM | 4 or 8 | text-generation |
| EXAONE-3.5-32b | ExaoneForCausalLM | 8 or 16 | text-generation |
| Mi:dm-7b | MidmLMHeadModel | 4 or 8 | text-generation |
| SOLAR-10.7b | LlamaForCausalLM | 4 or 8 | text-generation |
| EEVE-Korean-10.8b | LlamaForCausalLM | 4 or 8 | text-generation |
| Llava-v1.6-mistral-7b | LlavaNextForConditionalGeneration | 4 or 8 | image-to-text |
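
For the models in this table, the number of NPUs is chosen at compile time. The sketch below compiles Llama3-8b across four devices; it assumes rbln_tensor_parallel_size is the option that distributes the model for RSD, and that rbln_max_seq_len / rbln_batch_size fix the static shapes (option names are assumptions; verify them against the API reference).

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNLlamaForCausalLM

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Compile across 4 ATOM+ devices (the table above recommends 4 or 8 for Llama3-8b).
# rbln_tensor_parallel_size is assumed to enable RSD tensor parallelism.
model = RBLNLlamaForCausalLM.from_pretrained(
    model_id,
    export=True,
    rbln_batch_size=1,
    rbln_max_seq_len=8192,
    rbln_tensor_parallel_size=4,
)
model.save_pretrained("llama3-8b-rbln")  # reload later without recompiling

tokenizer = AutoTokenizer.from_pretrained(model_id)
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Explain Rebellions Scalable Design in one sentence."}],
    add_generation_prompt=True,
    return_tensors="pt",
)
outputs = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```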

Diffusers

Note

Models marked with a superscript require more than one ATOM because their weights exceed the memory capacity of a single ATOM; the model's modules are therefore divided across multiple ATOMs. For details on the specific module distribution, please refer to the model code.

| Model | Model Architecture | Task |
|-------|--------------------|------|
| Stable Diffusion | StableDiffusionPipeline, StableDiffusionImg2ImgPipeline, StableDiffusionInpaintPipeline | text-to-image, image-to-image, inpainting |
| Stable Diffusion + LoRA | StableDiffusionPipeline | text-to-image |
| Stable Diffusion V3 | StableDiffusion3Pipeline, StableDiffusion3Img2ImgPipeline, StableDiffusion3InpaintPipeline | text-to-image, image-to-image, inpainting |
| Stable Diffusion XL | StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, StableDiffusionXLInpaintPipeline | text-to-image, image-to-image, inpainting |
| Stable Diffusion XL + multi-LoRA | StableDiffusionXLPipeline | text-to-image |
| SDXL-turbo | StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline | text-to-image, image-to-image |
| Stable Diffusion + ControlNet | StableDiffusionControlNetPipeline, StableDiffusionControlNetImg2ImgPipeline | text-to-image, image-to-image |
| Stable Diffusion XL + ControlNet | StableDiffusionXLControlNetPipeline, StableDiffusionXLControlNetImg2ImgPipeline | text-to-image, image-to-image |
| Kandinsky V2.2 | KandinskyV22PriorPipeline, KandinskyV22Pipeline, KandinskyV22Img2ImgPipeline, KandinskyV22InpaintPipeline, KandinskyV22CombinedPipeline, KandinskyV22Img2ImgCombinedPipeline, KandinskyV22InpaintCombinedPipeline | text-to-image, image-to-image, inpainting |
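
The diffusion pipelines are compiled the same way, through RBLN-prefixed wrappers around the diffusers classes listed above. Below is a minimal text-to-image sketch, assuming an RBLNStableDiffusionXLPipeline wrapper and the export=True flag; check the API reference for the exact class names and options.

```python
from optimum.rbln import RBLNStableDiffusionXLPipeline

# Compile the pipeline components (text encoders, UNet, VAE) for RBLN NPUs.
pipe = RBLNStableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    export=True,
)
pipe.save_pretrained("sdxl-rbln")

# Inference uses the familiar diffusers call signature.
image = pipe(prompt="A watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```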