Optimum RBLN

Optimum RBLN connects the Hugging Face transformers and diffusers libraries to RBLN NPUs: ATOM™ (RBLN-CA02), ATOM™+ (RBLN-CA12 and RBLN-CA22), and ATOM™-Max (RBLN-CA25). It provides tools for compiling models and running inference in both single-NPU and multi-NPU (Rebellions Scalable Design, RSD) configurations across a range of downstream tasks. The tables below list the models currently supported by Optimum RBLN.

Transformers

Single NPU

Model	Model Architecture	Task
Phi-2	PhiForCausalLM	Text Generation
Gemma-2b	GemmaForCausalLM	Text Generation
OPT-2.7b	OPTForCausalLM	Text Generation
Qwen2.5-0.5b	Qwen2ForCausalLM	Text Generation
Qwen3-0.6b	Qwen3ForCausalLM	Text Generation
GPT2	GPT2LMHeadModel	Text Generation
GPT2-medium	GPT2LMHeadModel	Text Generation
GPT2-large	GPT2LMHeadModel	Text Generation
GPT2-xl	GPT2LMHeadModel	Text Generation
T5-small	T5ForConditionalGeneration	Text Generation
T5-base	T5ForConditionalGeneration	Text Generation
T5-large	T5ForConditionalGeneration	Text Generation
T5-3b	T5ForConditionalGeneration	Text Generation
BART-base	BartForConditionalGeneration	Text Generation
BART-large	BartForConditionalGeneration	Text Generation
KoBART-base	BartForConditionalGeneration	Text Generation
Pegasus	PegasusForConditionalGeneration	Text Generation
BERT-base	BertForMaskedLM, BertForQuestionAnswering	Masked Language Modeling, Question Answering
BERT-large	BertForMaskedLM, BertForQuestionAnswering	Masked Language Modeling, Question Answering
DistilBERT-base	DistilBertForQuestionAnswering	Question Answering
SecureBERT	RobertaForMaskedLM	Masked Language Modeling
RoBERTa	RobertaForSequenceClassification	Text Classification
Qwen3-Embedding-0.6b	Qwen3Model	Sentence Similarity
Qwen3-Reranker-0.6b	Qwen3ForCausalLM	Sentence Similarity
E5-base-4K	BertModel	Sentence Similarity
LaBSE	BertModel	Sentence Similarity
KR-SBERT-V40K-klueNLI-augSTS	BertModel	Sentence Similarity
BGE-Small-EN-v1.5	BertModel	Sentence Similarity
BGE-Base-EN-v1.5	BertModel	Sentence Similarity
BGE-Large-EN-v1.5	BertModel	Sentence Similarity
BGE-M3/Dense-Embedding	XLMRobertaModel	Sentence Similarity
BGE-M3/Multi-Vector	XLMRobertaModel	Sentence Similarity
BGE-M3/Sparse-Embedding	XLMRobertaModel	Sentence Similarity
BGE-Reranker-V2-M3	XLMRobertaForSequenceClassification	Sentence Similarity
BGE-Reranker-Base	XLMRobertaForSequenceClassification	Sentence Similarity
BGE-Reranker-Large	XLMRobertaForSequenceClassification	Sentence Similarity
Ko-Reranker	XLMRobertaForSequenceClassification	Sentence Similarity
Time-Series-Transformer	TimeSeriesTransformerForPrediction	Time-series Forecasting
BLIP2-2.7b	Blip2ForConditionalGeneration	Image Captioning
Whisper-tiny	WhisperForConditionalGeneration	Speech to Text
Whisper-base	WhisperForConditionalGeneration	Speech to Text
Whisper-small	WhisperForConditionalGeneration	Speech to Text
Whisper-medium	WhisperForConditionalGeneration	Speech to Text
Whisper-large-v3	WhisperForConditionalGeneration	Speech to Text
Whisper-large-v3-turbo	WhisperForConditionalGeneration	Speech to Text
Wav2Vec2	Wav2Vec2ForCTC	Speech to Text
Audio-Spectrogram-Transformer	ASTForAudioClassification	Audio Classification
GroundingDino-Tiny	GroundingDinoForObjectDetection	Multi Modal
GroundingDino-Base	GroundingDinoForObjectDetection	Multi Modal
Depth-Anything-V2-Small	DepthAnythingForDepthEstimation	Monocular Depth Estimation
Depth-Anything-V2-Base	DepthAnythingForDepthEstimation	Monocular Depth Estimation
Depth-Anything-V2-Large	DepthAnythingForDepthEstimation	Monocular Depth Estimation
DepthAnythingV3-Small	-	Monocular Depth Estimation
DepthAnythingV3-Base	-	Monocular Depth Estimation
DepthAnythingV3-Large	-	Monocular Depth Estimation
DepthAnythingV3-Giant	-	Monocular Depth Estimation
DepthAnythingV3-Large-1.1	-	Monocular Depth Estimation
DepthAnythingV3-Giant-1.1	-	Monocular Depth Estimation
DPT-large	DPTForDepthEstimation	Monocular Depth Estimation
ViT-large	ViTForImageClassification	Image Classification
ResNet50	ResNetForImageClassification	Image Classification

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA22) and ATOM™-Max (RBLN-CA25). Each RBLN-CA22 card carries 1 NPU chip, and each RBLN-CA25 card carries 4 NPU chips. You can check which RBLN NPU you have with the rbln-smi command.
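The "Number of NPU chips" column in the table below counts chips, not cards. As a rough sketch of the arithmetic, the helper below converts a chip count into a card count using the chips-per-card figures from the note above. The function name `cards_needed` and the assumption that a model can use only part of a card's chips are illustrative and not taken from the Optimum RBLN documentation.

```python
import math

# Chips per card, as stated in the note above.
CHIPS_PER_CARD = {
    "RBLN-CA22": 1,
    "RBLN-CA25": 4,
}


def cards_needed(npu_chips: int, card: str) -> int:
    """Return the minimum number of cards that provide `npu_chips` NPU chips.

    Hypothetical helper: it assumes chips are interchangeable and that a
    partially used card is acceptable, which the source does not confirm.
    """
    if npu_chips < 1:
        raise ValueError("npu_chips must be a positive integer")
    if card not in CHIPS_PER_CARD:
        raise KeyError(f"unknown card type: {card}")
    # Round up so that a partially used card still counts as one card.
    return math.ceil(npu_chips / CHIPS_PER_CARD[card])


if __name__ == "__main__":
    for chips in (2, 4, 8, 16):
        for card in CHIPS_PER_CARD:
            print(f"{chips:>2} chips on {card}: {cards_needed(chips, card)} card(s)")
```

For example, a model listed with 8 NPU chips would need eight RBLN-CA22 cards or two RBLN-CA25 cards under these assumptions.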

Note

Models marked with † have large submodules that require distribution across multiple ATOM™ instances. For distribution details, refer to the model's example code.

Model	Model Architecture	Number of NPU chips	Task
gpt-oss-20b	GptOssForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Llama-8b	LlamaForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Llama-70b	LlamaForCausalLM	16	Text Generation
DeepSeek-R1-Distill-Qwen-1.5b	Qwen2ForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Qwen-7b	Qwen2ForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Qwen-14b	Qwen2ForCausalLM	8	Text Generation
DeepSeek-R1-Distill-Qwen-32b	Qwen2ForCausalLM	16	Text Generation
Llama3.3-70b	LlamaForCausalLM	16	Text Generation
Llama3.2-3b	LlamaForCausalLM	8	Text Generation
Llama3.1-70b	LlamaForCausalLM	16	Text Generation
Llama3.1-8b	LlamaForCausalLM	8	Text Generation
Llama3-8b	LlamaForCausalLM	4 or 8	Text Generation
Llama3-8b + LoRA	LlamaForCausalLM	4 or 8	Text Generation
Llama2-7b	LlamaForCausalLM	4 or 8	Text Generation
Llama2-13b	LlamaForCausalLM	4 or 8	Text Generation
Gemma-7b	GemmaForCausalLM	4 or 8	Text Generation
Gemma2-9b	Gemma2ForCausalLM	4 or 8	Text Generation
Mistral-7b	MistralForCausalLM	4 or 8	Text Generation
A.X-4.0-Light	Qwen2ForCausalLM	4 or 8	Text Generation
Qwen2-7b	Qwen2ForCausalLM	4 or 8	Text Generation
Qwen2.5-1.5b	Qwen2ForCausalLM	2	Text Generation
Qwen2.5-3b	Qwen2ForCausalLM	4	Text Generation
Qwen2.5-7b	Qwen2ForCausalLM	4 or 8	Text Generation
Qwen2.5-14b	Qwen2ForCausalLM	4 or 8	Text Generation
Qwen2.5-32b	Qwen2ForCausalLM	16	Text Generation
Qwen2.5-72b	Qwen2ForCausalLM	16	Text Generation
Qwen3-1.7b	Qwen3ForCausalLM	2	Text Generation
Qwen3-4b	Qwen3ForCausalLM	4	Text Generation
Qwen3-8b	Qwen3ForCausalLM	4 or 8	Text Generation
Qwen3-VL-2b†	Qwen3VLForConditionalGeneration	8	Image Captioning
Qwen3-VL-4b†	Qwen3VLForConditionalGeneration	8	Image Captioning
Qwen3-VL-8b†	Qwen3VLForConditionalGeneration	8	Image Captioning
Midm-2.0-Mini	LlamaForCausalLM	2 or 4	Text Generation
Midm-2.0-Base	LlamaForCausalLM	4 or 8	Text Generation
Salamandra-7b	LlamaForCausalLM	4 or 8	Text Generation
KONI-Llama3.1-8b	LlamaForCausalLM	8	Text Generation
EXAONE-3.0-7.8b	ExaoneForCausalLM	4 or 8	Text Generation
EXAONE-3.5-2.4b	ExaoneForCausalLM	4	Text Generation
EXAONE-3.5-7.8b	ExaoneForCausalLM	4 or 8	Text Generation
EXAONE-3.5-32b	ExaoneForCausalLM	8 or 16	Text Generation
OPT-6.7b	OPTForCausalLM	4	Text Generation
SOLAR-10.7b	LlamaForCausalLM	4 or 8	Text Generation
EEVE-Korean-10.8b	LlamaForCausalLM	4 or 8	Text Generation
T5-11b	T5ForConditionalGeneration	2 or 4	Text Generation
T5-Enc-11b	T5EncoderModel	2 or 4	Sentence Similarity
Qwen3-Embedding-4b	Qwen3Model	4	Sentence Similarity
Qwen3-Reranker-4b	Qwen3ForCausalLM	4	Sentence Similarity
Cosmos-Reason1-7B	Qwen2_5_VLForConditionalGeneration	8	Image Captioning
PaliGemma-3b	PaliGemmaForConditionalGeneration	4	Image Captioning
PaliGemma2-3b	PaliGemmaForConditionalGeneration	4	Image Captioning
Gemma3-4b	Gemma3ForConditionalGeneration	8	Image Captioning
Gemma3-12b	Gemma3ForConditionalGeneration	8 or 16	Image Captioning
Gemma3-27b	Gemma3ForConditionalGeneration	16	Image Captioning
Qwen2-VL-7b	Qwen2VLForConditionalGeneration	8	Image Captioning
Qwen2.5-VL-7b	Qwen2_5_VLForConditionalGeneration	8	Image Captioning
Idefics3-8B-Llama3	Idefics3ForConditionalGeneration	8	Image Captioning
Llava-v1.5-7b	LlavaForConditionalGeneration	4 or 8	Image Captioning
Llava-v1.6-mistral-7b	LlavaNextForConditionalGeneration	4 or 8	Image Captioning
Pixtral-12b	LlavaForConditionalGeneration	8	Image Captioning
BLIP2-6.7b	Blip2ForConditionalGeneration	4	Image Captioning
ColPali-v1.3	ColPaliForRetrieval	4	Visual Document Retrieval
ColQwen2	ColQwen2ForRetrieval	4	Visual Document Retrieval
ColQwen2.5	ColQwen2ForRetrieval	4	Visual Document Retrieval
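As a minimal sketch of how one of the multi-NPU text-generation entries above might be compiled, the snippet below builds the keyword arguments for a compile call and passes them to `RBLNLlamaForCausalLM.from_pretrained`. It assumes that the `rbln_tensor_parallel_size` argument corresponds to the "Number of NPU chips" column and that the installed optimum-rbln version accepts the `rbln_`-prefixed keyword arguments. The helper names `build_compile_kwargs` and `compile_llama` are hypothetical, and actually running `compile_llama` requires the RBLN SDK.

```python
from typing import Optional


def build_compile_kwargs(
    npu_chips: int,
    batch_size: int = 1,
    max_seq_len: Optional[int] = None,
) -> dict:
    """Assemble keyword arguments for a multi-NPU compile call.

    Assumption: tensor parallel size equals the number of NPU chips
    listed for the model in the table above.
    """
    if npu_chips < 1:
        raise ValueError("npu_chips must be a positive integer")
    kwargs = {
        "export": True,  # compile from the original checkpoint
        "rbln_batch_size": batch_size,
        "rbln_tensor_parallel_size": npu_chips,
    }
    if max_seq_len is not None:
        kwargs["rbln_max_seq_len"] = max_seq_len
    return kwargs


def compile_llama(model_id: str, save_dir: str, npu_chips: int):
    """Compile a Llama-architecture model and save the compiled artifacts."""
    # Imported lazily so build_compile_kwargs stays usable without the RBLN SDK.
    from optimum.rbln import RBLNLlamaForCausalLM

    model = RBLNLlamaForCausalLM.from_pretrained(
        model_id, **build_compile_kwargs(npu_chips)
    )
    model.save_pretrained(save_dir)
    return model


if __name__ == "__main__":
    # Only prints the arguments; no NPU is needed for this part.
    print(build_compile_kwargs(npu_chips=4, max_seq_len=8192))
```

Keeping the argument construction separate from the compile call makes it possible to check the settings on a machine without an NPU before starting a compile.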

Diffusers

Single NPU

Note

Models marked with † have weights too large for a single ATOM™, so their modules must be divided across multiple ATOM™ devices. For the specific module distribution, refer to the model code.

Model	Model Architecture	Task
Stable Diffusion	StableDiffusionPipeline, StableDiffusionImg2ImgPipeline, StableDiffusionInpaintPipeline	Text to Image, Image to Image, Inpainting
Stable Diffusion + LoRA	StableDiffusionPipeline	Text to Image
Stable Diffusion V3†	StableDiffusion3Pipeline, StableDiffusion3Img2ImgPipeline, StableDiffusion3InpaintPipeline	Text to Image, Image to Image, Inpainting
Stable Diffusion XL	StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, StableDiffusionXLInpaintPipeline	Text to Image, Image to Image, Inpainting
Stable Diffusion XL + multi-LoRA	StableDiffusionXLPipeline	Text to Image
SDXL-turbo	StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline	Text to Image, Image to Image
Stable Diffusion + ControlNet	StableDiffusionControlNetPipeline, StableDiffusionControlNetImg2ImgPipeline	Text to Image, Image to Image
Stable Diffusion XL + ControlNet	StableDiffusionXLControlNetPipeline, StableDiffusionXLControlNetImg2ImgPipeline	Text to Image, Image to Image
Kandinsky V2.2	KandinskyV22PriorPipeline, KandinskyV22Pipeline, KandinskyV22Img2ImgPipeline, KandinskyV22InpaintPipeline, KandinskyV22CombinedPipeline, KandinskyV22Img2ImgCombinedPipeline, KandinskyV22InpaintCombinedPipeline	Text to Image, Image to Image, Inpainting, Prior Generation
Stable Video Diffusion	StableVideoDiffusionPipeline	Image to Video

Multi-NPU (RSD)

Note

Rebellions Scalable Design (RSD) is available on ATOM™+ (RBLN-CA22) and ATOM™-Max (RBLN-CA25). Each RBLN-CA22 card carries 1 NPU chip, and each RBLN-CA25 card carries 4 NPU chips. You can check which RBLN NPU you have with the rbln-smi command.

Note

Models marked with † have large submodules that require distribution across multiple ATOM™ instances. For distribution details, refer to the model's example code.

Model	Model Architecture	Number of NPU chips	Task
Cosmos-Predict1-7B-Text2World†	CosmosTextToWorldPipeline	4	Text to Video
Cosmos-Predict1-14B-Text2World†	CosmosTextToWorldPipeline	4	Text to Video
Cosmos-Predict1-7B-Video2World†	CosmosVideoToWorldPipeline	4	Video to Video
Cosmos-Predict1-14B-Video2World†	CosmosVideoToWorldPipeline	4	Video to Video
Cosmos-Transfer1-7B†	-	4	Video to Video
Cosmos-Transfer1-7B-Distilled†	-	4	Video to Video
Cosmos-Transfer1-7B-Sample-AV†	-	4	Video to Video
Cosmos-Transfer1-7B-4KUpscaler†	-	4	Video to Video
Cosmos-Transfer1-7B-Sample-AV-Single2MultiView†	-	4	Video to Video