Optimum RBLN

Optimum RBLN connects the Hugging Face transformers and diffusers libraries to RBLN NPUs: ATOM™ (RBLN-CA02), ATOM™+ (RBLN-CA12 and RBLN-CA22), and ATOM™-Max (RBLN-CA25). It provides tools for compiling models and running inference in both single-NPU and multi-NPU (Rebellions Scalable Design, RSD) configurations across a range of downstream tasks. The tables below list the models currently supported by Optimum RBLN.
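As a rough illustration of the compile-then-save flow mentioned above, the following sketch wraps a single export call. It rests on several assumptions: the class name `RBLNLlamaForCausalLM` and the `export`, `rbln_batch_size`, `rbln_max_seq_len`, and `rbln_tensor_parallel_size` arguments follow the naming used in Optimum RBLN's published examples and may differ in your installed version; `build_rbln_kwargs` and `compile_and_save` are hypothetical helper names; and the model id in the usage comment is only a placeholder.

```python
from typing import Any, Dict


def build_rbln_kwargs(n_npus: int = 1, max_seq_len: int = 4096,
                      batch_size: int = 1) -> Dict[str, Any]:
    """Collect compile-time options (argument names are assumptions)."""
    if n_npus < 1:
        raise ValueError("n_npus must be at least 1")
    kwargs: Dict[str, Any] = {
        "export": True,                   # compile instead of loading a compiled model
        "rbln_batch_size": batch_size,
        "rbln_max_seq_len": max_seq_len,
    }
    if n_npus > 1:
        # Multi-NPU (RSD) case: split the model across several NPUs.
        kwargs["rbln_tensor_parallel_size"] = n_npus
    return kwargs


def compile_and_save(model_id: str, out_dir: str, n_npus: int = 1) -> None:
    """Compile a Llama-family checkpoint and store the result in out_dir."""
    # Imported lazily so the helper above can be used without the RBLN SDK.
    from optimum.rbln import RBLNLlamaForCausalLM  # assumed class name

    model = RBLNLlamaForCausalLM.from_pretrained(
        model_id=model_id, **build_rbln_kwargs(n_npus)
    )
    model.save_pretrained(out_dir)


# Example call (requires RBLN hardware and SDK; model id is a placeholder):
# compile_and_save("meta-llama/Llama-2-7b-chat-hf", "rbln-llama2-7b", n_npus=4)
```

Keeping the option-building step separate from the export call makes it possible to inspect the settings before starting a compilation, which can take a long time for large models.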

Transformers

Single NPU

Model	Model Architecture	Task
Phi-2	PhiForCausalLM	Text Generation
Gemma-2b	GemmaForCausalLM	Text Generation
OPT-2.7b	OPTForCausalLM	Text Generation
GPT2	GPT2LMHeadModel	Text Generation
GPT2-medium	GPT2LMHeadModel	Text Generation
GPT2-large	GPT2LMHeadModel	Text Generation
GPT2-xl	GPT2LMHeadModel	Text Generation
T5-small	T5ForConditionalGeneration	Text Generation
T5-base	T5ForConditionalGeneration	Text Generation
T5-large	T5ForConditionalGeneration	Text Generation
T5-3b	T5ForConditionalGeneration	Text Generation
BART-base	BartForConditionalGeneration	Text Generation
BART-large	BartForConditionalGeneration	Text Generation
KoBART-base	BartForConditionalGeneration	Text Generation
E5-base-4K	BertModel	Sentence Similarity
LaBSE	BertModel	Sentence Similarity
KR-SBERT-V40K-klueNLI-augSTS	BertModel	Sentence Similarity
BERT-base	BertForMaskedLM, BertForQuestionAnswering	Masked Language Modeling, Question Answering
BERT-large	BertForMaskedLM, BertForQuestionAnswering	Masked Language Modeling, Question Answering
DistilBERT-base	DistilBertForQuestionAnswering	Question Answering
SecureBERT	RobertaForMaskedLM	Masked Language Modeling
RoBERTa	RobertaForSequenceClassification	Text Classification
BGE-Small-EN-v1.5	RBLNBertModel	Sentence Similarity
BGE-Base-EN-v1.5	RBLNBertModel	Sentence Similarity
BGE-Large-EN-v1.5	RBLNBertModel	Sentence Similarity
BGE-M3	XLMRobertaModel	Sentence Similarity
BGE-Reranker-V2-M3	XLMRobertaForSequenceClassification	Sentence Similarity
BGE-Reranker-Base	XLMRobertaForSequenceClassification	Sentence Similarity
BGE-Reranker-Large	XLMRobertaForSequenceClassification	Sentence Similarity
Ko-Reranker	XLMRobertaForSequenceClassification	Sentence Similarity
Time-Series-Transformer	TimeSeriesTransformerForPrediction	Time-series Forecasting
BLIP2-2.7b	RBLNBlip2ForConditionalGeneration	Image Captioning
Whisper-tiny	WhisperForConditionalGeneration	Speech to Text
Whisper-base	WhisperForConditionalGeneration	Speech to Text
Whisper-small	WhisperForConditionalGeneration	Speech to Text
Whisper-medium	WhisperForConditionalGeneration	Speech to Text
Whisper-large-v3	WhisperForConditionalGeneration	Speech to Text
Whisper-large-v3-turbo	WhisperForConditionalGeneration	Speech to Text
Wav2Vec2	Wav2Vec2ForCTC	Speech to Text
Audio-Spectrogram-Transformer	ASTForAudioClassification	Audio Classification
DPT-large	DPTForDepthEstimation	Monocular Depth Estimation
ViT-large	ViTForImageClassification	Image Classification
ResNet50	ResNetForImageClassification	Image Classification

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA12 and RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check the type of your current RBLN NPU using the rbln-stat command.
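If you want to run this check from a script, a minimal sketch is shown here. It only assumes that the `rbln-stat` executable is on the `PATH`. The output format is not described in this document, so the sketch returns the raw text without parsing it. `read_rbln_stat` is a hypothetical helper name.

```python
import shutil
import subprocess
from typing import Optional


def read_rbln_stat(timeout_s: float = 10.0) -> Optional[str]:
    """Return the raw output of rbln-stat, or None if the tool is not installed."""
    exe = shutil.which("rbln-stat")
    if exe is None:
        return None
    # check=False: a non-zero exit code is not treated as an exception here.
    result = subprocess.run(
        [exe], capture_output=True, text=True, timeout=timeout_s, check=False
    )
    return result.stdout


if __name__ == "__main__":
    report = read_rbln_stat()
    print(report if report is not None else "rbln-stat not found on PATH")
```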

Model	Model Architecture	Recommended # of NPUs	Task
DeepSeek-R1-Distill-Llama-8b	LlamaForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Llama-70b	LlamaForCausalLM	16	Text Generation
DeepSeek-R1-Distill-Qwen-1.5b	Qwen2ForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Qwen-7b	Qwen2ForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Qwen-14b	Qwen2ForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Qwen-32b	Qwen2ForCausalLM	16	Text Generation
Llama3.3-70b	LlamaForCausalLM	16	Text Generation
Llama3.2-3b	LlamaForCausalLM	8	Text Generation
Llama3.1-70b	LlamaForCausalLM	16	Text Generation
Llama3.1-8b	LlamaForCausalLM	8	Text Generation
Llama3-8b	LlamaForCausalLM	4 or 8	Text Generation
Llama3-8b + LoRA	LlamaForCausalLM	4 or 8	Text Generation
Llama2-7b	LlamaForCausalLM	4 or 8	Text Generation
Llama2-13b	LlamaForCausalLM	4 or 8	Text Generation
Gemma-7b	GemmaForCausalLM	4 or 8	Text Generation
Mistral-7b	MistralForCausalLM	4 or 8	Text Generation
A.X-4.0-Light	Qwen2ForCausalLM	4 or 8	Text Generation
Qwen2-7b	Qwen2ForCausalLM	4 or 8	Text Generation
Qwen2.5-7b	Qwen2ForCausalLM	4 or 8	Text Generation
Qwen2.5-14b	Qwen2ForCausalLM	4 or 8	Text Generation
Midm-2.0-Mini	LlamaForCausalLM	2 or 4	Text Generation
Midm-2.0-Base	LlamaForCausalLM	4 or 8	Text Generation
Salamandra-7b	LlamaForCausalLM	4 or 8	Text Generation
KONI-Llama3.1-8b	LlamaForCausalLM	8	Text Generation
EXAONE-3.0-7.8b	ExaoneForCausalLM	4 or 8	Text Generation
EXAONE-3.5-2.4b	ExaoneForCausalLM	4	Text Generation
EXAONE-3.5-7.8b	ExaoneForCausalLM	4 or 8	Text Generation
EXAONE-3.5-32b	ExaoneForCausalLM	8 or 16	Text Generation
OPT-6.7b	OPTForCausalLM	4	Text Generation
SOLAR-10.7b	LlamaForCausalLM	4 or 8	Text Generation
EEVE-Korean-10.8b	LlamaForCausalLM	4 or 8	Text Generation
T5-11b	T5ForConditionalGeneration	2 or 4	Text Generation
T5-Enc-11b	T5EncoderModel	2 or 4	Sentence Similarity
Gemma3-4b	Gemma3ForConditionalGeneration	8	Image Captioning
Gemma3-12b	Gemma3ForConditionalGeneration	8 or 16	Image Captioning
Gemma3-27b	Gemma3ForConditionalGeneration	16	Image Captioning
Qwen2.5-VL-7b	Qwen2_5_VLForConditionalGeneration	8	Image Captioning
Idefics3-8B-Llama3	Idefics3ForConditionalGeneration	8	Image Captioning
Llava-v1.6-mistral-7b	LlavaNextForConditionalGeneration	4 or 8	Image Captioning
BLIP2-6.7b	RBLNBlip2ForConditionalGeneration	4	Image Captioning
ColPali-v1.3	RBLNColPaliForRetrieval	4	Visual Document Retrieval
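The "Recommended # of NPUs" column can also be consulted programmatically. The sketch below copies a few rows from the table above and picks the largest recommended count that fits the number of NPUs available. The helper names are hypothetical, and treating the recommended count as the value to use for tensor parallelism is an assumption, not something this page states.

```python
from typing import Dict, List, Optional

# A few rows copied from the Multi-NPU (RSD) table above.
RECOMMENDED_NPUS: Dict[str, str] = {
    "Llama3.3-70b": "16",
    "Llama3.1-8b": "8",
    "Llama3-8b": "4 or 8",
    "Midm-2.0-Mini": "2 or 4",
    "EXAONE-3.5-32b": "8 or 16",
}


def parse_npu_options(cell: str) -> List[int]:
    """Turn a table cell such as '4 or 8' into a sorted list of integers."""
    return sorted(int(part) for part in cell.split(" or "))


def pick_npu_count(model: str, available_npus: int) -> Optional[int]:
    """Largest recommended count not exceeding available_npus, else None."""
    options = parse_npu_options(RECOMMENDED_NPUS[model])
    fitting = [n for n in options if n <= available_npus]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    for name in RECOMMENDED_NPUS:
        print(name, pick_npu_count(name, available_npus=8))
```

A result of `None` means that none of the recommended configurations fits the available hardware.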

Diffusers

Single NPU

Note

Models marked with † have weights too large to fit on a single ATOM™, so their modules must be divided across multiple ATOM™ devices. For the specific module distribution, refer to the model code.

Model	Model Architecture	Task
Stable Diffusion	StableDiffusionPipeline, StableDiffusionImg2ImgPipeline, StableDiffusionInpaintPipeline	Text to Image, Image to Image, Inpainting
Stable Diffusion + LoRA	StableDiffusionPipeline	Text to Image
Stable Diffusion V3†	StableDiffusion3Pipeline, StableDiffusion3Img2ImgPipeline, StableDiffusion3InpaintPipeline	Text to Image, Image to Image, Inpainting
Stable Diffusion XL	StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, StableDiffusionXLInpaintPipeline	Text to Image, Image to Image, Inpainting
Stable Diffusion XL + multi-LoRA	StableDiffusionXLPipeline	Text to Image
SDXL-turbo	StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline	Text to Image, Image to Image
Stable Diffusion + ControlNet	StableDiffusionControlNetPipeline, StableDiffusionControlNetImg2ImgPipeline	Text to Image, Image to Image
Stable Diffusion XL + ControlNet	StableDiffusionXLControlNetPipeline, StableDiffusionXLControlNetImg2ImgPipeline	Text to Image, Image to Image
Kandinsky V2.2	KandinskyV22PriorPipeline, KandinskyV22Pipeline, KandinskyV22Img2ImgPipeline, KandinskyV22InpaintPipeline, KandinskyV22CombinedPipeline, KandinskyV22Img2ImgCombinedPipeline, KandinskyV22InpaintCombinedPipeline	Text to Image, Image to Image, Inpainting, Prior Generation

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA12 and RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check the type of your current RBLN NPU using the rbln-stat command.

Note

Models marked with † have large submodules that require distribution across multiple ATOM™ instances. For distribution details, refer to the model's example code.

Model	Model Architecture	Recommended # of NPUs	Task
Cosmos-Predict1-7B-Text2World†	CosmosTextToWorldPipeline	4	Text to Video
Cosmos-Predict1-14B-Text2World†	CosmosTextToWorldPipeline	4	Text to Video
Cosmos-Predict1-7B-Video2World†	CosmosVideoToWorldPipeline	4	Video to Video
Cosmos-Predict1-14B-Video2World†	CosmosVideoToWorldPipeline	4	Video to Video