Optimum RBLN
Optimum RBLN is a library that bridges Hugging Face transformers and diffusers models to RBLN NPUs, namely ATOM™ (RBLN-CA02), ATOM™+ (RBLN-CA12, RBLN-CA22), and ATOM™-Max (RBLN-CA25). With it, the various downstream tasks of Hugging Face models can be easily compiled and run on a single NPU or on multiple NPUs (Rebellions Scalable Design). The tables below list the models currently supported by Optimum RBLN.
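Before diving into the tables, the following is a minimal sketch of the typical workflow. It assumes the usual optimum-rbln conventions (RBLN-prefixed wrapper classes and an export=True flag that triggers compilation); refer to each model's example code for the exact options.

```python
# Minimal sketch, assuming the standard optimum-rbln pattern:
# an RBLN<Architecture> wrapper class and export=True to compile for the NPU.
from optimum.rbln import RBLNGPT2LMHeadModel
from transformers import AutoTokenizer

model_id = "gpt2"

# Compile the Hugging Face checkpoint for the RBLN NPU and save the artifacts.
model = RBLNGPT2LMHeadModel.from_pretrained(model_id, export=True)
model.save_pretrained("gpt2-rbln")

# Inference uses the familiar transformers generate() API.
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Rebellions ATOM is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```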
Transformers
Single NPU
| Model | Model Architecture | Task |
|---|---|---|
| Phi-2 | PhiForCausalLM | Text Generation |
| Gemma-2b | GemmaForCausalLM | Text Generation |
| OPT-2.7b | OPTForCausalLM | Text Generation |
| Qwen2.5-0.5b | Qwen2ForCausalLM | Text Generation |
| Qwen3-0.6b | Qwen3ForCausalLM | Text Generation |
| GPT2 | GPT2LMHeadModel | Text Generation |
| GPT2-medium | GPT2LMHeadModel | Text Generation |
| GPT2-large | GPT2LMHeadModel | Text Generation |
| GPT2-xl | GPT2LMHeadModel | Text Generation |
| T5-small | T5ForConditionalGeneration | Text Generation |
| T5-base | T5ForConditionalGeneration | Text Generation |
| T5-large | T5ForConditionalGeneration | Text Generation |
| T5-3b | T5ForConditionalGeneration | Text Generation |
| BART-base | BartForConditionalGeneration | Text Generation |
| BART-large | BartForConditionalGeneration | Text Generation |
| KoBART-base | BartForConditionalGeneration | Text Generation |
| Pegasus | PegasusForConditionalGeneration | Text Generation |
| BERT-base | BertForMaskedLM, BertForQuestionAnswering | Masked Language Modeling, Question Answering |
| BERT-large | BertForMaskedLM, BertForQuestionAnswering | Masked Language Modeling, Question Answering |
| DistilBERT-base | DistilBertForQuestionAnswering | Question Answering |
| SecureBERT | RobertaForMaskedLM | Masked Language Modeling |
| RoBERTa | RobertaForSequenceClassification | Text Classification |
| Qwen3-Embedding-0.6b | Qwen3Model | Sentence Similarity |
| Qwen3-Reranker-0.6b | Qwen3ForCausalLM | Sentence Similarity |
| E5-base-4K | BertModel | Sentence Similarity |
| LaBSE | BertModel | Sentence Similarity |
| KR-SBERT-V40K-klueNLI-augSTS | BertModel | Sentence Similarity |
| BGE-Small-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-Base-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-Large-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-M3/Dense-Embedding | XLMRobertaModel | Sentence Similarity |
| BGE-M3/Multi-Vector | XLMRobertaModel | Sentence Similarity |
| BGE-M3/Sparse-Embedding | XLMRobertaModel | Sentence Similarity |
| BGE-Reranker-V2-M3 | XLMRobertaForSequenceClassification | Sentence Similarity |
| BGE-Reranker-Base | XLMRobertaForSequenceClassification | Sentence Similarity |
| BGE-Reranker-Large | XLMRobertaForSequenceClassification | Sentence Similarity |
| Ko-Reranker | XLMRobertaForSequenceClassification | Sentence Similarity |
| Time-Series-Transformer | TimeSeriesTransformerForPrediction | Time-series Forecasting |
| BLIP2-2.7b | RBLNBlip2ForConditionalGeneration | Image Captioning |
| Whisper-tiny | WhisperForConditionalGeneration | Speech to Text |
| Whisper-base | WhisperForConditionalGeneration | Speech to Text |
| Whisper-small | WhisperForConditionalGeneration | Speech to Text |
| Whisper-medium | WhisperForConditionalGeneration | Speech to Text |
| Whisper-large-v3 | WhisperForConditionalGeneration | Speech to Text |
| Whisper-large-v3-turbo | WhisperForConditionalGeneration | Speech to Text |
| Wav2Vec2 | Wav2Vec2ForCTC | Speech to Text |
| Audio-Spectrogram-Transformer | ASTForAudioClassification | Audio Classification |
| GroundingDino-Tiny | GroundingDinoForObjectDetection | Multi Modal |
| GroundingDino-Base | GroundingDinoForObjectDetection | Multi Modal |
| Depth-Anything-V2-Small | DepthAnythingForDepthEstimation | Monocular Depth Estimation |
| Depth-Anything-V2-Base | DepthAnythingForDepthEstimation | Monocular Depth Estimation |
| Depth-Anything-V2-Large | DepthAnythingForDepthEstimation | Monocular Depth Estimation |
| DPT-large | DPTForDepthEstimation | Monocular Depth Estimation |
| ViT-large | ViTForImageClassification | Image Classification |
| ResNet50 | ResNetForImageClassification | Image Classification |
Multi-NPU (RSD)
Note
Multi-NPU support is available only on ATOM™+ (RBLN-CA12, RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check which type of NPU you are currently using with the rbln-stat command.
| Model | Model Architecture | Recommended # of NPUs | Task |
|---|---|---|---|
| DeepSeek-R1-Distill-Llama-8b | LlamaForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Llama-70b | LlamaForCausalLM | 16 | Text Generation |
| DeepSeek-R1-Distill-Qwen-1.5b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-7b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-14b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-32b | Qwen2ForCausalLM | 16 | Text Generation |
| Llama3.3-70b | LlamaForCausalLM | 16 | Text Generation |
| Llama3.2-3b | LlamaForCausalLM | 8 | Text Generation |
| Llama3.1-70b | LlamaForCausalLM | 16 | Text Generation |
| Llama3.1-8b | LlamaForCausalLM | 8 | Text Generation |
| Llama3-8b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama3-8b + LoRA | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama2-7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama2-13b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Gemma-7b | GemmaForCausalLM | 4 or 8 | Text Generation |
| Mistral-7b | MistralForCausalLM | 4 or 8 | Text Generation |
| A.X-4.0-Light | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2-7b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2.5-1.5b | Qwen2ForCausalLM | 2 | Text Generation |
| Qwen2.5-3b | Qwen2ForCausalLM | 4 | Text Generation |
| Qwen2.5-7b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2.5-14b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2.5-32b | Qwen2ForCausalLM | 16 | Text Generation |
| Qwen2.5-72b | Qwen2ForCausalLM | 16 | Text Generation |
| Qwen3-1.7b | Qwen3ForCausalLM | 2 | Text Generation |
| Qwen3-4b | Qwen3ForCausalLM | 4 | Text Generation |
| Qwen3-8b | Qwen3ForCausalLM | 4 or 8 | Text Generation |
| Qwen3-32b | Qwen3ForCausalLM | 16 | Text Generation |
| Midm-2.0-Mini | LlamaForCausalLM | 2 or 4 | Text Generation |
| Midm-2.0-Base | LlamaForCausalLM | 4 or 8 | Text Generation |
| Salamandra-7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| KONI-Llama3.1-8b | LlamaForCausalLM | 8 | Text Generation |
| EXAONE-3.0-7.8b | ExaoneForCausalLM | 4 or 8 | Text Generation |
| EXAONE-3.5-2.4b | ExaoneForCausalLM | 4 | Text Generation |
| EXAONE-3.5-7.8b | ExaoneForCausalLM | 4 or 8 | Text Generation |
| EXAONE-3.5-32b | ExaoneForCausalLM | 8 or 16 | Text Generation |
| OPT-6.7b | OPTForCausalLM | 4 | Text Generation |
| SOLAR-10.7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| EEVE-Korean-10.8b | LlamaForCausalLM | 4 or 8 | Text Generation |
| T5-11b | T5ForConditionalGeneration | 2 or 4 | Text Generation |
| T5-Enc-11b | T5EncoderModel | 2 or 4 | Sentence Similarity |
| Qwen3-Embedding-4b | Qwen3Model | 4 | Sentence Similarity |
| Qwen3-Reranker-4b | Qwen3ForCausalLM | 4 | Sentence Similarity |
| Gemma3-4b | Gemma3ForConditionalGeneration | 8 | Image Captioning |
| Gemma3-12b | Gemma3ForConditionalGeneration | 8 or 16 | Image Captioning |
| Gemma3-27b | Gemma3ForConditionalGeneration | 16 | Image Captioning |
| Qwen2-VL-7b | Qwen2VLForConditionalGeneration | 8 | Image Captioning |
| Qwen2.5-VL-7b | Qwen2_5_VLForConditionalGeneration | 8 | Image Captioning |
| Idefics3-8B-Llama3 | Idefics3ForConditionalGeneration | 8 | Image Captioning |
| Llava-v1.5-7b | LlavaForConditionalGeneration | 4 or 8 | Image Captioning |
| Llava-v1.6-mistral-7b | LlavaNextForConditionalGeneration | 4 or 8 | Image Captioning |
| Pixtral-12b | LlavaForConditionalGeneration | 8 | Image Captioning |
| BLIP2-6.7b | RBLNBlip2ForConditionalGeneration | 4 | Image Captioning |
| ColPali-v1.3 | RBLNColPaliForRetrieval | 4 | Visual Document Retrieval |
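As a rough illustration, a model from the table above can be sharded across several NPUs at compile time. The sketch below assumes the rbln_tensor_parallel_size keyword used by optimum-rbln to select the number of NPUs; the value follows the "Recommended # of NPUs" column, and the exact arguments are shown in each model's example code.

```python
# Hedged sketch: compile Llama3-8b across 8 NPUs (RSD) with tensor parallelism.
# rbln_tensor_parallel_size is assumed to control how many NPUs the model is
# sharded over; check the model's example code for the exact arguments.
from optimum.rbln import RBLNLlamaForCausalLM

model = RBLNLlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    export=True,                  # compile for the RBLN NPU
    rbln_tensor_parallel_size=8,  # recommended count for Llama3-8b is 4 or 8
)
model.save_pretrained("llama3-8b-rbln")
```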
Diffusers
Single NPU
Note
Models marked with a superscript † are too large to fit on a single ATOM™, so their modules must be distributed across multiple ATOM™ devices. Refer to each model's example code for the exact module placement.
| Model | Model Architecture | Task |
|---|---|---|
| Stable Diffusion | | |
| Stable Diffusion + LoRA | | |
| Stable Diffusion V3† | | |
| Stable Diffusion XL | | |
| Stable Diffusion XL + multi-LoRA | | |
| SDXL-turbo | | |
| Stable Diffusion + ControlNet | | |
| Stable Diffusion XL + ControlNet | | |
| Kandinsky V2.2 | | |
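For diffusers pipelines the pattern is similar to the transformers case. The sketch below assumes optimum-rbln exposes an RBLNStableDiffusionPipeline wrapper with the same export=True compilation flag; for the † models, the submodule placement described in the example code still applies.

```python
# Minimal sketch, assuming an RBLNStableDiffusionPipeline wrapper that follows
# the library's RBLN<Pipeline> naming and export=True compilation convention.
from optimum.rbln import RBLNStableDiffusionPipeline

pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    export=True,  # compiles the text encoder, UNet, and VAE for the NPU
)
image = pipe("A photo of an astronaut riding a horse on the moon").images[0]
image.save("astronaut.png")
```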
Multi-NPU (RSD)
Note
Multi-NPU support is available only on ATOM™+ (RBLN-CA12, RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check which type of NPU you are currently using with the rbln-stat command.
Note
Models marked with a superscript † have submodules that are too large to fit on a single ATOM™, so they must be distributed across multiple ATOM™ devices. Refer to each model's example code for the exact submodule placement.
| Model | Model Architecture | Recommended # of NPUs | Task |
|---|---|---|---|
| Cosmos-Predict1-7B-Text2World† | CosmosTextToWorldPipeline | 4 | Text to Video |
| Cosmos-Predict1-14B-Text2World† | CosmosTextToWorldPipeline | 4 | Text to Video |
| Cosmos-Predict1-7B-Video2World† | CosmosVideoToWorldPipeline | 4 | Video to Video |
| Cosmos-Predict1-14B-Video2World† | CosmosVideoToWorldPipeline | 4 | Video to Video |
| Cosmos-Transfer1-7B† | - | 4 | Video to Video |
| Cosmos-Transfer1-7B-Distilled† | - | 4 | Video to Video |
| Cosmos-Transfer1-7B-Sample-AV† | - | 4 | Video to Video |
| Cosmos-Transfer1-7B-4KUpscaler† | - | 4 | Video to Video |
| Cosmos-Transfer1-7B-Sample-AV-Single2MultiView† | - | 4 | Video to Video |