
Optimum RBLN

Optimum RBLN serves as a bridge connecting the Hugging Face transformers and diffusers libraries to RBLN NPUs: ATOM™ (RBLN-CA02), ATOM™+ (RBLN-CA12 and RBLN-CA22), and ATOM™-Max (RBLN-CA25). It provides tools for easy model compilation and inference on both single-NPU and multi-NPU (Rebellions Scalable Design) configurations across a range of downstream tasks. The following tables present the full lineup of models currently supported by Optimum RBLN.
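As a quick illustration of the workflow, the sketch below compiles one of the single-NPU models listed further down and runs it with the usual transformers API. It assumes the standard Optimum export pattern (an RBLN-prefixed model class whose from_pretrained() accepts export=True); check the Optimum RBLN API reference for the exact class names and compile options.

```python
# Minimal sketch: compile GPT2 for a single ATOM and generate text.
# Assumes optimum-rbln exposes RBLNGPT2LMHeadModel following the
# RBLN<Architecture> naming convention used in the tables below.
from transformers import AutoTokenizer
from optimum.rbln import RBLNGPT2LMHeadModel

model_id = "gpt2"

# Compile the Hugging Face checkpoint for the RBLN NPU and save the artifacts
# so later runs can skip recompilation.
model = RBLNGPT2LMHeadModel.from_pretrained(model_id, export=True)
model.save_pretrained("gpt2-rbln")

# Inference uses the familiar transformers generate() API.
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Rebellions ATOM is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```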

Transformers

Single NPU

| Model | Model Architecture | Task |
|---|---|---|
| Phi-2 | PhiForCausalLM | Text Generation |
| Gemma-2b | GemmaForCausalLM | Text Generation |
| OPT-2.7b | OPTForCausalLM | Text Generation |
| GPT2 | GPT2LMHeadModel | Text Generation |
| GPT2-medium | GPT2LMHeadModel | Text Generation |
| GPT2-large | GPT2LMHeadModel | Text Generation |
| GPT2-xl | GPT2LMHeadModel | Text Generation |
| T5-small | T5ForConditionalGeneration | Text Generation |
| T5-base | T5ForConditionalGeneration | Text Generation |
| T5-large | T5ForConditionalGeneration | Text Generation |
| T5-3b | T5ForConditionalGeneration | Text Generation |
| BART-base | BartForConditionalGeneration | Text Generation |
| BART-large | BartForConditionalGeneration | Text Generation |
| KoBART-base | BartForConditionalGeneration | Text Generation |
| E5-base-4K | BertModel | Sentence Similarity |
| LaBSE | BertModel | Sentence Similarity |
| KR-SBERT-V40K-klueNLI-augSTS | BertModel | Sentence Similarity |
| BERT-base | BertForMaskedLM, BertForQuestionAnswering | Masked Language Modeling, Question Answering |
| BERT-large | BertForMaskedLM, BertForQuestionAnswering | Masked Language Modeling, Question Answering |
| DistilBERT-base | DistilBertForQuestionAnswering | Question Answering |
| SecureBERT | RobertaForMaskedLM | Masked Language Modeling |
| RoBERTa | RobertaForSequenceClassification | Text Classification |
| BGE-Small-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-Base-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-Large-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-M3 | XLMRobertaModel | Sentence Similarity |
| BGE-Reranker-V2-M3 | XLMRobertaForSequenceClassification | Sentence Similarity |
| BGE-Reranker-Base | XLMRobertaForSequenceClassification | Sentence Similarity |
| BGE-Reranker-Large | XLMRobertaForSequenceClassification | Sentence Similarity |
| Ko-Reranker | XLMRobertaForSequenceClassification | Sentence Similarity |
| Time-Series-Transformer | TimeSeriesTransformerForPrediction | Time-series Forecasting |
| BLIP2-2.7b | RBLNBlip2ForConditionalGeneration | Image Captioning |
| Whisper-tiny | WhisperForConditionalGeneration | Speech to Text |
| Whisper-base | WhisperForConditionalGeneration | Speech to Text |
| Whisper-small | WhisperForConditionalGeneration | Speech to Text |
| Whisper-medium | WhisperForConditionalGeneration | Speech to Text |
| Whisper-large-v3 | WhisperForConditionalGeneration | Speech to Text |
| Whisper-large-v3-turbo | WhisperForConditionalGeneration | Speech to Text |
| Wav2Vec2 | Wav2Vec2ForCTC | Speech to Text |
| Audio-Spectrogram-Transformer | ASTForAudioClassification | Audio Classification |
| DPT-large | DPTForDepthEstimation | Monocular Depth Estimation |
| ViT-large | ViTForImageClassification | Image Classification |
| ResNet50 | ResNetForImageClassification | Image Classification |
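
Encoder models in the table follow the same compile-and-run pattern. A minimal sentence-similarity sketch with RBLNBertModel (the class named above for the BGE models) might look like the following; the fixed-length padding and output indexing are assumptions about the compiled model's static shapes, so consult the Optimum RBLN documentation for the exact options.

```python
# Minimal sketch: single-NPU sentence embeddings with RBLNBertModel.
import torch
from transformers import AutoTokenizer
from optimum.rbln import RBLNBertModel

model_id = "BAAI/bge-small-en-v1.5"

# Compile once for the NPU; save_pretrained() could be used to cache the result.
model = RBLNBertModel.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

sentences = [
    "What is an NPU?",
    "An NPU is a processor specialized for neural network inference.",
]

embeddings = []
for sentence in sentences:
    # Pad to a fixed length because the compiled graph expects static input
    # shapes (max_length=512 is an assumption; match the compiled sequence length).
    inputs = tokenizer(
        sentence,
        padding="max_length",
        max_length=512,
        truncation=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        last_hidden_state = model(**inputs)[0]
    # CLS pooling followed by L2 normalization, as recommended for BGE models.
    embeddings.append(torch.nn.functional.normalize(last_hidden_state[:, 0], dim=-1))

# Cosine similarity between the two sentence embeddings.
print((embeddings[0] @ embeddings[1].T).item())
```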

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA12 and RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check the type of your current RBLN NPU using the rbln-stat command.

| Model | Model Architecture | Recommended # of NPUs | Task |
|---|---|---|---|
| DeepSeek-R1-Distill-Llama-8b | LlamaForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Llama-70b | LlamaForCausalLM | 16 | Text Generation |
| DeepSeek-R1-Distill-Qwen-1.5b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-7b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-14b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-32b | Qwen2ForCausalLM | 16 | Text Generation |
| Llama3.3-70b | LlamaForCausalLM | 16 | Text Generation |
| Llama3.2-3b | LlamaForCausalLM | 8 | Text Generation |
| Llama3.1-70b | LlamaForCausalLM | 16 | Text Generation |
| Llama3.1-8b | LlamaForCausalLM | 8 | Text Generation |
| Llama3-8b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama3-8b + LoRA | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama2-7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama2-13b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Gemma-7b | GemmaForCausalLM | 4 or 8 | Text Generation |
| Mistral-7b | MistralForCausalLM | 4 or 8 | Text Generation |
| A.X-4.0-Light | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2-7b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2.5-7b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2.5-14b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Midm-2.0-Mini | LlamaForCausalLM | 2 or 4 | Text Generation |
| Midm-2.0-Base | LlamaForCausalLM | 4 or 8 | Text Generation |
| Salamandra-7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| KONI-Llama3.1-8b | LlamaForCausalLM | 8 | Text Generation |
| EXAONE-3.0-7.8b | ExaoneForCausalLM | 4 or 8 | Text Generation |
| EXAONE-3.5-2.4b | ExaoneForCausalLM | 4 | Text Generation |
| EXAONE-3.5-7.8b | ExaoneForCausalLM | 4 or 8 | Text Generation |
| EXAONE-3.5-32b | ExaoneForCausalLM | 8 or 16 | Text Generation |
| OPT-6.7b | OPTForCausalLM | 4 | Text Generation |
| SOLAR-10.7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| EEVE-Korean-10.8b | LlamaForCausalLM | 4 or 8 | Text Generation |
| T5-11b | T5ForConditionalGeneration | 2 or 4 | Text Generation |
| T5-Enc-11b | T5EncoderModel | 2 or 4 | Sentence Similarity |
| Gemma3-4b | Gemma3ForConditionalGeneration | 8 | Image Captioning |
| Gemma3-12b | Gemma3ForConditionalGeneration | 8 or 16 | Image Captioning |
| Gemma3-27b | Gemma3ForConditionalGeneration | 16 | Image Captioning |
| Qwen2.5-VL-7b | Qwen2_5_VLForConditionalGeneration | 8 | Image Captioning |
| Idefics3-8B-Llama3 | Idefics3ForConditionalGeneration | 8 | Image Captioning |
| Llava-v1.6-mistral-7b | LlavaNextForConditionalGeneration | 4 or 8 | Image Captioning |
| BLIP2-6.7b | RBLNBlip2ForConditionalGeneration | 4 | Image Captioning |
| ColPali-v1.3 | RBLNColPaliForRetrieval | 4 | Visual Document Retrieval |
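
For the multi-NPU models above, the number of devices is typically specified at compile time. The sketch below is a rough illustration for Llama3-8b; the rbln_tensor_parallel_size argument is an assumption mirroring the "Recommended # of NPUs" column, so verify the exact option name in the Optimum RBLN documentation.

```python
# Minimal multi-NPU (RSD) sketch: compile Llama3-8b across 8 ATOM devices.
# RBLNLlamaForCausalLM follows the RBLN<Architecture> naming convention and
# rbln_tensor_parallel_size is an assumed compile option.
from optimum.rbln import RBLNLlamaForCausalLM

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # example checkpoint

model = RBLNLlamaForCausalLM.from_pretrained(
    model_id,
    export=True,
    rbln_tensor_parallel_size=8,  # assumed kwarg: number of NPUs to shard across
)

# Save the compiled artifacts for later inference without recompilation.
model.save_pretrained("llama3-8b-rbln")
```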

Diffusers

Single NPU

Note

Models marked with a superscript require more than one ATOM™ because their weights exceed the memory capacity of a single ATOM™; the model's modules are therefore divided across multiple ATOM™s. For detailed information on the specific module distribution, please refer to the model code.

| Model | Model Architecture | Task |
|---|---|---|
| Stable Diffusion | StableDiffusionPipeline, StableDiffusionImg2ImgPipeline, StableDiffusionInpaintPipeline | Text to Image, Image to Image, Inpainting |
| Stable Diffusion + LoRA | StableDiffusionPipeline | Text to Image |
| Stable Diffusion V3 | StableDiffusion3Pipeline, StableDiffusion3Img2ImgPipeline, StableDiffusion3InpaintPipeline | Text to Image, Image to Image, Inpainting |
| Stable Diffusion XL | StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, StableDiffusionXLInpaintPipeline | Text to Image, Image to Image, Inpainting |
| Stable Diffusion XL + multi-LoRA | StableDiffusionXLPipeline | Text to Image |
| SDXL-turbo | StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline | Text to Image, Image to Image |
| Stable Diffusion + ControlNet | StableDiffusionControlNetPipeline, StableDiffusionControlNetImg2ImgPipeline | Text to Image, Image to Image |
| Stable Diffusion XL + ControlNet | StableDiffusionXLControlNetPipeline, StableDiffusionXLControlNetImg2ImgPipeline | Text to Image, Image to Image |
| Kandinsky V2.2 | KandinskyV22PriorPipeline, KandinskyV22Pipeline, KandinskyV22Img2ImgPipeline, KandinskyV22InpaintPipeline, KandinskyV22CombinedPipeline, KandinskyV22Img2ImgCombinedPipeline, KandinskyV22InpaintCombinedPipeline | Text to Image, Image to Image, Inpainting |
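
The diffusers pipelines above are used in much the same way as the transformers models. The sketch below assumes an RBLN-prefixed wrapper class (RBLNStableDiffusionXLPipeline) mirroring the underlying StableDiffusionXLPipeline; confirm the exact class name and any image-size compile options in the Optimum RBLN documentation.

```python
# Minimal sketch for a diffusers pipeline on the NPU. The class name
# RBLNStableDiffusionXLPipeline is an assumption based on the RBLN-prefix
# naming convention; the wrapped StableDiffusionXLPipeline is listed above.
from optimum.rbln import RBLNStableDiffusionXLPipeline

model_id = "stabilityai/stable-diffusion-xl-base-1.0"

# Compile the text encoders, UNet, and VAE for the NPU, then generate an image.
pipe = RBLNStableDiffusionXLPipeline.from_pretrained(model_id, export=True)
image = pipe("A watercolor painting of a chip wafer").images[0]
image.save("sdxl_rbln_sample.png")
```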

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA12 and RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check the type of your current RBLN NPU using the rbln-stat command.

Note

Models marked with a superscript have large submodules that must be distributed across multiple ATOM™ devices. For distribution details, refer to the model's example code.

| Model | Model Architecture | Recommended # of NPUs | Task |
|---|---|---|---|
| Cosmos-Predict1-7B-Text2World | CosmosTextToWorldPipeline | 4 | Text to Video |
| Cosmos-Predict1-14B-Text2World | CosmosTextToWorldPipeline | 4 | Text to Video |
| Cosmos-Predict1-7B-Video2World | CosmosVideoToWorldPipeline | 4 | Video to Video |
| Cosmos-Predict1-14B-Video2World | CosmosVideoToWorldPipeline | 4 | Video to Video |