# Optimum RBLN

Optimum RBLN serves as a bridge connecting the Hugging Face `transformers` and `diffusers` libraries to RBLN NPUs: ATOM™ (RBLN-CA02), ATOM™+ (RBLN-CA12 and RBLN-CA22), and ATOM™-Max (RBLN-CA25). It provides a set of tools that enable easy model compilation and inference for both single-NPU and multi-NPU (Rebellions Scalable Design, RSD) configurations across a range of downstream tasks. The following tables present the complete lineup of models currently supported by Optimum RBLN.
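To give a sense of the workflow, here is a minimal sketch that compiles one of the single-NPU models listed below (GPT2) and runs inference with it. The `RBLNGPT2LMHeadModel` class follows the library's RBLN-prefixed naming, and the `export=True` flag and default compilation settings are assumptions in this sketch; consult the Optimum RBLN API reference for exact signatures and options.

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNGPT2LMHeadModel  # RBLN-prefixed counterpart of GPT2LMHeadModel (assumed name)

model_id = "gpt2"

# Compile the Hugging Face checkpoint for the RBLN NPU and save the compiled model.
model = RBLNGPT2LMHeadModel.from_pretrained(model_id, export=True)
model.save_pretrained("gpt2-rbln")

# Inference uses the familiar transformers generate() interface.
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Rebellions ATOM is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```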
## Transformers

### Single NPU
Model | Model Architecture | Task |
---|---|---|
Phi-2 | PhiForCausalLM | Text Generation |
Gemma-2b | GemmaForCausalLM | Text Generation |
OPT-2.7b | OPTForCausalLM | Text Generation |
GPT2 | GPT2LMHeadModel | Text Generation |
GPT2-medium | GPT2LMHeadModel | Text Generation |
GPT2-large | GPT2LMHeadModel | Text Generation |
GPT2-xl | GPT2LMHeadModel | Text Generation |
T5-small | T5ForConditionalGeneration | Text Generation |
T5-base | T5ForConditionalGeneration | Text Generation |
T5-large | T5ForConditionalGeneration | Text Generation |
T5-3b | T5ForConditionalGeneration | Text Generation |
BART-base | BartForConditionalGeneration | Text Generation |
BART-large | BartForConditionalGeneration | Text Generation |
KoBART-base | BartForConditionalGeneration | Text Generation |
E5-base-4K | BertModel | Sentence Similarity |
LaBSE | BertModel | Sentence Similarity |
KR-SBERT-V40K-klueNLI-augSTS | BertModel | Sentence Similarity |
BERT-base | BertForMaskedLM, BertForQuestionAnswering | Masked Language Modeling, Question Answering |
BERT-large | BertForMaskedLM, BertForQuestionAnswering | Masked Language Modeling, Question Answering |
DistilBERT-base | DistilBertForQuestionAnswering | Question Answering |
SecureBERT | RobertaForMaskedLM | Masked Language Modeling |
RoBERTa | RobertaForSequenceClassification | Text Classification |
BGE-Small-EN-v1.5 | RBLNBertModel | Sentence Similarity |
BGE-Base-EN-v1.5 | RBLNBertModel | Sentence Similarity |
BGE-Large-EN-v1.5 | RBLNBertModel | Sentence Similarity |
BGE-M3 | XLMRobertaModel | Sentence Similarity |
BGE-Reranker-V2-M3 | XLMRobertaForSequenceClassification | Sentence Similarity |
BGE-Reranker-Base | XLMRobertaForSequenceClassification | Sentence Similarity |
BGE-Reranker-Large | XLMRobertaForSequenceClassification | Sentence Similarity |
Ko-Reranker | XLMRobertaForSequenceClassification | Sentence Similarity |
Time-Series-Transformer | TimeSeriesTransformerForPrediction | Time-series Forecasting |
BLIP2-2.7b | RBLNBlip2ForConditionalGeneration | Image Captioning |
Whisper-tiny | WhisperForConditionalGeneration | Speech to Text |
Whisper-base | WhisperForConditionalGeneration | Speech to Text |
Whisper-small | WhisperForConditionalGeneration | Speech to Text |
Whisper-medium | WhisperForConditionalGeneration | Speech to Text |
Whisper-large-v3 | WhisperForConditionalGeneration | Speech to Text |
Whisper-large-v3-turbo | WhisperForConditionalGeneration | Speech to Text |
Wav2Vec2 | Wav2Vec2ForCTC | Speech to Text |
Audio-Spectrogram-Transformer | ASTForAudioClassification | Audio Classification |
DPT-large | DPTForDepthEstimation | Monocular Depth Estimation |
ViT-large | ViTForImageClassification | Image Classification |
ResNet50 | ResNetForImageClassification | Image Classification |
### Multi-NPU (RSD)

**Note**
Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA12 and RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check the type of your current RBLN NPU using the `rbln-stat` command.
Model | Model Architecture | Recommended # of NPUs | Task |
---|---|---|---|
DeepSeek-R1-Distill-Llama-8b | LlamaForCausalLM | 8 | Text Generation |
DeepSeek-R1-Distill-Llama-70b | LlamaForCausalLM | 16 | Text Generation |
DeepSeek-R1-Distill-Qwen-1.5b | Qwen2ForCausalLM | 8 | Text Generation |
DeepSeek-R1-Distill-Qwen-7b | Qwen2ForCausalLM | 8 | Text Generation |
DeepSeek-R1-Distill-Qwen-14b | Qwen2ForCausalLM | 8 | Text Generation |
DeepSeek-R1-Distill-Qwen-32b | Qwen2ForCausalLM | 16 | Text Generation |
Llama3.3-70b | LlamaForCausalLM | 16 | Text Generation |
Llama3.2-3b | LlamaForCausalLM | 8 | Text Generation |
Llama3.1-70b | LlamaForCausalLM | 16 | Text Generation |
Llama3.1-8b | LlamaForCausalLM | 8 | Text Generation |
Llama3-8b | LlamaForCausalLM | 4 or 8 | Text Generation |
Llama3-8b + LoRA | LlamaForCausalLM | 4 or 8 | Text Generation |
Llama2-7b | LlamaForCausalLM | 4 or 8 | Text Generation |
Llama2-13b | LlamaForCausalLM | 4 or 8 | Text Generation |
Gemma-7b | GemmaForCausalLM | 4 or 8 | Text Generation |
Mistral-7b | MistralForCausalLM | 4 or 8 | Text Generation |
A.X-4.0-Light | Qwen2ForCausalLM | 4 or 8 | Text Generation |
Qwen2-7b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
Qwen2.5-7b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
Qwen2.5-14b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
Midm-2.0-Mini | LlamaForCausalLM | 2 or 4 | Text Generation |
Midm-2.0-Base | LlamaForCausalLM | 4 or 8 | Text Generation |
Salamandra-7b | LlamaForCausalLM | 4 or 8 | Text Generation |
KONI-Llama3.1-8b | LlamaForCausalLM | 8 | Text Generation |
EXAONE-3.0-7.8b | ExaoneForCausalLM | 4 or 8 | Text Generation |
EXAONE-3.5-2.4b | ExaoneForCausalLM | 4 | Text Generation |
EXAONE-3.5-7.8b | ExaoneForCausalLM | 4 or 8 | Text Generation |
EXAONE-3.5-32b | ExaoneForCausalLM | 8 or 16 | Text Generation |
OPT-6.7b | OPTForCausalLM | 4 | Text Generation |
SOLAR-10.7b | LlamaForCausalLM | 4 or 8 | Text Generation |
EEVE-Korean-10.8b | LlamaForCausalLM | 4 or 8 | Text Generation |
T5-11b | T5ForConditionalGeneration | 2 or 4 | Text Generation |
T5-Enc-11b | T5EncoderModel | 2 or 4 | Sentence Similarity |
Gemma3-4b | Gemma3ForConditionalGeneration | 8 | Image Captioning |
Gemma3-12b | Gemma3ForConditionalGeneration | 8 or 16 | Image Captioning |
Gemma3-27b | Gemma3ForConditionalGeneration | 16 | Image Captioning |
Qwen2.5-VL-7b | Qwen2_5_VLForConditionalGeneration | 8 | Image Captioning |
Idefics3-8B-Llama3 | Idefics3ForConditionalGeneration | 8 | Image Captioning |
Llava-v1.6-mistral-7b | LlavaNextForConditionalGeneration | 4 or 8 | Image Captioning |
BLIP2-6.7b | RBLNBlip2ForConditionalGeneration | 4 | Image Captioning |
ColPali-v1.3 | RBLNColPaliForRetrieval | 4 | Visual Document Retrieval |
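For the models in this table, compilation additionally needs to know how many NPUs to shard the model across. The sketch below targets Llama3-8b on 8 NPUs; the `RBLNLlamaForCausalLM` class name and the `rbln_tensor_parallel_size` keyword are assumptions based on the library's naming conventions, so verify them against the API reference.

```python
from optimum.rbln import RBLNLlamaForCausalLM  # RBLN-prefixed counterpart of LlamaForCausalLM (assumed name)

# Compile Llama3-8b sharded across 8 NPUs via Rebellions Scalable Design (RSD),
# matching the recommended NPU count from the table above.
model = RBLNLlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    export=True,
    rbln_tensor_parallel_size=8,  # assumed keyword: number of NPUs to shard across
)
model.save_pretrained("llama3-8b-rbln")
```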
## Diffusers

### Single NPU

**Note**
Models marked with a superscript † require more than one ATOM™ because their weights exceed the memory capacity of a single ATOM™; the model's modules must therefore be divided across multiple ATOM™ devices. For details on the specific module distribution, please refer to the model code.
Model | Model Architecture | Task |
---|---|---|
Stable Diffusion | | |
Stable Diffusion + LoRA | | |
Stable Diffusion V3† | | |
Stable Diffusion XL | | |
Stable Diffusion XL + multi-LoRA | | |
SDXL-turbo | | |
Stable Diffusion + ControlNet | | |
Stable Diffusion XL + ControlNet | | |
Kandinsky V2.2 | | |
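The diffusion pipelines above follow the same compile-then-run flow through their RBLN wrappers. Below is a minimal sketch for Stable Diffusion; the `RBLNStableDiffusionPipeline` class name and the checkpoint id are illustrative assumptions, so check the API reference and the model documentation before use.

```python
from optimum.rbln import RBLNStableDiffusionPipeline  # assumed RBLN wrapper for the Stable Diffusion pipeline

# Compile the pipeline for a single ATOM and save the compiled artifacts.
pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative checkpoint id
    export=True,
)
pipe.save_pretrained("sd-v1-5-rbln")

# Image generation uses the familiar diffusers call signature.
image = pipe(prompt="a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```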
### Multi-NPU (RSD)

**Note**
Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA12 and RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check the type of your current RBLN NPU using the `rbln-stat` command.

**Note**
Models marked with † have large submodules that require distribution across multiple ATOM™ devices. For distribution details, refer to the model's example code.
Model | Model Architecture | Recommended # of NPUs | Task |
---|---|---|---|
Cosmos-Predict1-7B-Text2World† | CosmosTextToWorldPipeline | 4 | Text to Video |
Cosmos-Predict1-14B-Text2World† | CosmosTextToWorldPipeline | 4 | Text to Video |
Cosmos-Predict1-7B-Video2World† | CosmosVideoToWorldPipeline | 4 | Video to Video |
Cosmos-Predict1-14B-Video2World† | CosmosVideoToWorldPipeline | 4 | Video to Video |