Optimum RBLN

Optimum RBLN serves as a bridge connecting the Hugging Face transformers and diffusers libraries to RBLN NPUs: ATOM™ (RBLN-CA02), ATOM™+ (RBLN-CA12 and RBLN-CA22), and ATOM™-Max (RBLN-CA25). It offers a set of tools that enable easy model compilation and inference for both single- and multi-NPU (Rebellions Scalable Design) configurations across a range of downstream tasks. The following tables present the comprehensive lineup of models currently supported by Optimum RBLN.
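As a minimal sketch of the typical compile-then-run flow (assuming the optimum-rbln package and RBLN SDK are installed on a host with an RBLN NPU; the model ID and output directory below are illustrative):

```python
from optimum.rbln import RBLNGPT2LMHeadModel

# Compile the Hugging Face checkpoint for the RBLN NPU
# (export=True triggers compilation on first load).
model = RBLNGPT2LMHeadModel.from_pretrained("gpt2", export=True)

# Save the compiled artifacts so later runs can skip recompilation.
model.save_pretrained("gpt2-rbln")

# Reload the precompiled model for inference.
model = RBLNGPT2LMHeadModel.from_pretrained("gpt2-rbln", export=False)
```

The same pattern applies to the other model classes listed below, each exposed as an `RBLN`-prefixed counterpart of its Hugging Face architecture.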

Transformers

Single NPU

| Model | Model Architecture | Task |
| --- | --- | --- |
| Phi-2 | PhiForCausalLM | |
| Gemma-2b | GemmaForCausalLM | |
| OPT-2.7b | OPTForCausalLM | |
| GPT2 | GPT2LMHeadModel | |
| GPT2-medium | GPT2LMHeadModel | |
| GPT2-large | GPT2LMHeadModel | |
| GPT2-xl | GPT2LMHeadModel | |
| T5-small | T5ForConditionalGeneration | |
| T5-base | T5ForConditionalGeneration | |
| T5-large | T5ForConditionalGeneration | |
| T5-3b | T5ForConditionalGeneration | |
| BART-base | BartForConditionalGeneration | |
| BART-large | BartForConditionalGeneration | |
| KoBART-base | BartForConditionalGeneration | |
| E5-base-4K | BertModel | |
| LaBSE | BertModel | |
| KR-SBERT-V40K-klueNLI-augSTS | BertModel | |
| BERT-base | BertForMaskedLM<br>BertForQuestionAnswering | |
| BERT-large | BertForMaskedLM<br>BertForQuestionAnswering | |
| DistilBERT-base | DistilBertForQuestionAnswering | |
| SecureBERT | RobertaForMaskedLM | |
| RoBERTa | RobertaForSequenceClassification | |
| BGE-Small-EN-v1.5 | RBLNBertModel | |
| BGE-Base-EN-v1.5 | RBLNBertModel | |
| BGE-Large-EN-v1.5 | RBLNBertModel | |
| BGE-M3 | XLMRobertaModel | |
| BGE-Reranker-V2-M3 | XLMRobertaForSequenceClassification | |
| BGE-Reranker-Base | XLMRobertaForSequenceClassification | |
| BGE-Reranker-Large | XLMRobertaForSequenceClassification | |
| Ko-Reranker | XLMRobertaForSequenceClassification | |
| Time-Series-Transformer | TimeSeriesTransformerForPrediction | |
| BLIP2-2.7b | RBLNBlip2ForConditionalGeneration | |
| Whisper-tiny | WhisperForConditionalGeneration | |
| Whisper-base | WhisperForConditionalGeneration | |
| Whisper-small | WhisperForConditionalGeneration | |
| Whisper-medium | WhisperForConditionalGeneration | |
| Whisper-large-v3 | WhisperForConditionalGeneration | |
| Whisper-large-v3-turbo | WhisperForConditionalGeneration | |
| Wav2Vec2 | Wav2Vec2ForCTC | |
| Audio-Spectrogram-Transformer | ASTForAudioClassification | |
| DPT-large | DPTForDepthEstimation | |
| ViT-large | ViTForImageClassification | |
| ResNet50 | ResNetForImageClassification | |

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA12 and RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check the type of your current RBLN NPU using the rbln-stat command.

 
| Model | Model Architecture | Recommended # of NPUs | Task |
| --- | --- | --- | --- |
| DeepSeek-R1-Distill-Llama-8b | LlamaForCausalLM | 8 | |
| DeepSeek-R1-Distill-Llama-70b | LlamaForCausalLM | 16 | |
| DeepSeek-R1-Distill-Qwen-1.5b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-7b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-14b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-32b | Qwen2ForCausalLM | 16 | |
| Llama3.3-70b | LlamaForCausalLM | 16 | |
| Llama3.2-3b | LlamaForCausalLM | 8 | |
| Llama3.1-70b | LlamaForCausalLM | 16 | |
| Llama3.1-8b | LlamaForCausalLM | 8 | |
| Llama3-8b | LlamaForCausalLM | 4 or 8 | |
| Llama3-8b + LoRA | LlamaForCausalLM | 4 or 8 | |
| Llama2-7b | LlamaForCausalLM | 4 or 8 | |
| Llama2-13b | LlamaForCausalLM | 4 or 8 | |
| Gemma-7b | GemmaForCausalLM | 4 or 8 | |
| Mistral-7b | MistralForCausalLM | 4 or 8 | |
| Qwen2-7b | Qwen2ForCausalLM | 4 or 8 | |
| Qwen2.5-7b | Qwen2ForCausalLM | 4 or 8 | |
| Qwen2.5-14b | Qwen2ForCausalLM | 4 or 8 | |
| OPT-6.7b | OPTForCausalLM | 4 | |
| Salamandra-7b | LlamaForCausalLM | 4 or 8 | |
| KONI-Llama3.1-8b | LlamaForCausalLM | 8 | |
| EXAONE-3.0-7.8b | ExaoneForCausalLM | 4 or 8 | |
| EXAONE-3.5-2.4b | ExaoneForCausalLM | 4 | |
| EXAONE-3.5-7.8b | ExaoneForCausalLM | 4 or 8 | |
| EXAONE-3.5-32b | ExaoneForCausalLM | 8 or 16 | |
| Mi:dm-7b | MidmLMHeadModel | 4 or 8 | |
| SOLAR-10.7b | LlamaForCausalLM | 4 or 8 | |
| EEVE-Korean-10.8b | LlamaForCausalLM | 4 or 8 | |
| T5-11b | T5ForConditionalGeneration | 2 or 4 | |
| T5-Enc-11b | T5EncoderModel | 2 or 4 | |
| Gemma3-27b | Gemma3ForConditionalGeneration | 16 | |
| Qwen2.5-VL-7b | Qwen2_5_VLForConditionalGeneration | 8 | |
| Idefics3-8B-Llama3 | Idefics3ForConditionalGeneration | 8 | |
| Llava-v1.6-mistral-7b | LlavaNextForConditionalGeneration | 4 or 8 | |
| BLIP2-6.7b | RBLNBlip2ForConditionalGeneration | 4 | |
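As a sketch, the RSD models above are compiled by passing a tensor-parallel degree at export time. The parameter name `rbln_tensor_parallel_size` follows the optimum-rbln convention; the value 8 mirrors the recommendation for Llama3.1-8b in the table, and the model ID and output path are illustrative:

```python
from optimum.rbln import RBLNLlamaForCausalLM

# Compile Llama3.1-8b sharded across 8 NPUs via Rebellions Scalable Design (RSD).
model = RBLNLlamaForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    export=True,
    rbln_tensor_parallel_size=8,  # recommended # of NPUs from the table above
)

# Save the compiled artifacts for later inference runs.
model.save_pretrained("llama3.1-8b-rbln")
```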

Diffusers

Note

Models marked with a superscript require more than one ATOM™ because their weights exceed the memory capacity of a single ATOM™; the model's modules are therefore divided across multiple ATOM™s. For detailed information on the specific module distribution, please refer to the model code.

| Model | Model Architecture | Task |
| --- | --- | --- |
| Stable Diffusion | StableDiffusionPipeline<br>StableDiffusionImg2ImgPipeline<br>StableDiffusionInpaintPipeline | |
| Stable Diffusion + LoRA | StableDiffusionPipeline | |
| Stable Diffusion V3 | StableDiffusion3Pipeline<br>StableDiffusion3Img2ImgPipeline<br>StableDiffusion3InpaintPipeline | |
| Stable Diffusion XL | StableDiffusionXLPipeline<br>StableDiffusionXLImg2ImgPipeline<br>StableDiffusionXLInpaintPipeline | |
| Stable Diffusion XL + multi-LoRA | StableDiffusionXLPipeline | |
| SDXL-turbo | StableDiffusionXLPipeline<br>StableDiffusionXLImg2ImgPipeline | |
| Stable Diffusion + ControlNet | StableDiffusionControlNetPipeline<br>StableDiffusionControlNetImg2ImgPipeline | |
| Stable Diffusion XL + ControlNet | StableDiffusionXLControlNetPipeline<br>StableDiffusionXLControlNetImg2ImgPipeline | |
| Kandinsky V2.2 | KandinskyV22PriorPipeline<br>KandinskyV22Pipeline<br>KandinskyV22Img2ImgPipeline<br>KandinskyV22InpaintPipeline<br>KandinskyV22CombinedPipeline<br>KandinskyV22Img2ImgCombinedPipeline<br>KandinskyV22InpaintCombinedPipeline | |
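The diffusers pipelines follow the same compile-then-run pattern as the transformers models, via `RBLN`-prefixed pipeline classes. A minimal text-to-image sketch (the checkpoint ID, prompt, and file name are illustrative, and an RBLN NPU with the RBLN SDK is assumed):

```python
from optimum.rbln import RBLNStableDiffusionPipeline

# Compile a Stable Diffusion text-to-image pipeline for the RBLN NPU.
pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    export=True,
)

# Run inference on the compiled pipeline.
image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```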