Optimum RBLN
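Optimum RBLN is a library that connects Hugging Face `transformers` and `diffusers` models to RBLN NPUs, namely ATOM (`RBLN-CA02`) and ATOM+ (`RBLN-CA12`), so that they can run on this hardware. With it, you can easily compile and run inference for a wide range of downstream tasks of Hugging Face models on a single NPU or on multiple NPUs (Rebellions Scalable Design). The tables below list the models that Optimum RBLN currently supports.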
Single NPU
| Model | Model Architecture | Task |
| :---: | :-----------------------------: | :-----: |
| Phi-2 | PhiForCausalLM | |
| Gemma-2b | GemmaForCausalLM | |
| GPT2 | GPT2LMHeadModel | |
| GPT2-medium | GPT2LMHeadModel | |
| GPT2-large | GPT2LMHeadModel | |
| GPT2-xl | GPT2LMHeadModel | |
| T5-small | T5ForConditionalGeneration | |
| T5-base | T5ForConditionalGeneration | |
| T5-large | T5ForConditionalGeneration | |
| T5-3b | T5ForConditionalGeneration | |
| BART-base | BartForConditionalGeneration | |
| BART-large | BartForConditionalGeneration | |
| KoBART-base | BartForConditionalGeneration | |
| E5-base-4K | BertModel | |
| LaBSE | BertModel | |
| KR-SBERT-V40K-klueNLI-augSTS | BertModel | |
| BERT-base | BertForMaskedLM, BertForQuestionAnswering | |
| BERT-large | BertForMaskedLM, BertForQuestionAnswering | |
| DistilBERT-base | DistilBertForQuestionAnswering | |
| SecureBERT | RobertaForMaskedLM | |
| RoBERTa | RobertaForSequenceClassification | |
| BGE-M3 | XLMRobertaModel | |
| BGE-Reranker-V2-M3 | XLMRobertaForSequenceClassification | |
| BGE-Reranker-Base | XLMRobertaForSequenceClassification | |
| BGE-Reranker-Large | XLMRobertaForSequenceClassification | |
| Ko-Reranker | XLMRobertaForSequenceClassification | |
| Whisper-tiny | WhisperForConditionalGeneration | |
| Whisper-base | WhisperForConditionalGeneration | |
| Whisper-small | WhisperForConditionalGeneration | |
| Whisper-medium | WhisperForConditionalGeneration | |
| Whisper-large-v3 | WhisperForConditionalGeneration | |
| Whisper-large-v3-turbo | WhisperForConditionalGeneration | |
| Wav2Vec2 | Wav2Vec2ForCTC | |
| Audio-Spectrogram-Transformer | ASTForAudioClassification | |
| DPT-large | DPTForDepthEstimation | |
| ViT-large | ViTForImageClassification | |
| ResNet50 | ResNetForImageClassification | |
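
As a rough illustration of how a row in the table above might map to code, the sketch below derives a wrapper class name from a Model Architecture entry and compiles the model. It assumes, without confirmation from this page, that the `optimum-rbln` package exposes each architecture in `optimum.rbln` under the same name with an `RBLN` prefix, and that `from_pretrained(..., export=True)` triggers compilation. The helper names `rbln_class_name` and `compile_and_save` are hypothetical.

```python
import importlib


def rbln_class_name(architecture: str) -> str:
    """Return the assumed Optimum RBLN class name for a Hugging Face architecture.

    Assumption: the wrapper class is the architecture name prefixed with "RBLN".
    """
    return f"RBLN{architecture}"


def compile_and_save(model_id: str, architecture: str, output_dir: str):
    """Compile a Hugging Face checkpoint for an RBLN NPU and save the result.

    optimum.rbln is imported lazily so that this module can still be imported
    on a machine without the RBLN SDK installed.
    """
    module = importlib.import_module("optimum.rbln")
    model_cls = getattr(module, rbln_class_name(architecture))
    # Assumption: export=True asks the library to compile the original weights.
    model = model_cls.from_pretrained(model_id, export=True)
    model.save_pretrained(output_dir)
    return model
```

On a machine with the SDK installed, a call such as `compile_and_save("gpt2", "GPT2LMHeadModel", "gpt2-rbln")` would then be the entry point; the model ID and output directory here are placeholders.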
Multi-NPU
Note
Multi-NPU support is available only on ATOM+ (`RBLN-CA12`). You can check which type of NPU you are currently using with the `rbln-stat` command.
| Model | Model Architecture | Recommended # of NPUs | Task |
| :---: | :-----------------------------: | :-----: | :-----: |
| DeepSeek-R1-Distill-Llama-8b | LlamaForCausalLM | 8 | |
| DeepSeek-R1-Distill-Llama-70b | LlamaForCausalLM | 16 | |
| DeepSeek-R1-Distill-Qwen-1.5b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-7b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-14b | Qwen2ForCausalLM | 8 | |
| DeepSeek-R1-Distill-Qwen-32b | Qwen2ForCausalLM | 16 | |
| Llama3.3-70b | LlamaForCausalLM | 16 | |
| Llama3.2-3b | LlamaForCausalLM | 8 | |
| Llama3.1-70b | LlamaForCausalLM | 16 | |
| Llama3.1-8b | LlamaForCausalLM | 8 | |
| Llama3-8b | LlamaForCausalLM | 4 | |
| Llama3-8b + LoRA | LlamaForCausalLM | 4 | |
| Llama2-7b | LlamaForCausalLM | 4 | |
| Llama2-13b | LlamaForCausalLM | 8 | |
| Gemma-7b | GemmaForCausalLM | 4 | |
| Mistral-7b | MistralForCausalLM | 4 | |
| Qwen2-7b | Qwen2ForCausalLM | 4 | |
| Qwen2.5-7b | Qwen2ForCausalLM | 4 | |
| Qwen2.5-14b | Qwen2ForCausalLM | 8 | |
| Salamandra-7b | LlamaForCausalLM | 4 | |
| KONI-Llama3.1-8b | LlamaForCausalLM | 8 | |
| EXAONE-3.0-7.8b | ExaoneForCausalLM | 4 | |
| EXAONE-3.5-2.4b | ExaoneForCausalLM | 4 | |
| EXAONE-3.5-7.8b | ExaoneForCausalLM | 8 | |
| Mi:dm-7b | MidmLMHeadModel | 4 | |
| SOLAR-10.7b | LlamaForCausalLM | 8 | |
| EEVE-Korean-10.8b | LlamaForCausalLM | 8 | |
| Llava-v1.6-mistral-7b | LlavaNextForConditionalGeneration | 4 | |
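
The recommended NPU counts above can be checked programmatically before starting a long compilation. The sketch below copies a few rows of the table into a dictionary and refuses to proceed when fewer NPUs are available than recommended. Passing the count as `rbln_tensor_parallel_size` is an assumption about the `optimum-rbln` export arguments and is not stated on this page; `RECOMMENDED_NPUS`, `recommended_npus`, and `build_export_kwargs` are hypothetical names.

```python
# Recommended NPU counts copied from a few rows of the table above.
RECOMMENDED_NPUS = {
    "Llama3-8b": 4,
    "Llama3.1-8b": 8,
    "Llama3.1-70b": 16,
    "Qwen2.5-14b": 8,
    "EXAONE-3.5-2.4b": 4,
}


def recommended_npus(model_name: str) -> int:
    """Look up the recommended NPU count for a model listed in the table."""
    try:
        return RECOMMENDED_NPUS[model_name]
    except KeyError:
        raise KeyError(f"{model_name!r} is not in the local copy of the table") from None


def build_export_kwargs(model_name: str, available_npus: int) -> dict:
    """Build keyword arguments for compilation, or fail early if NPUs are short."""
    needed = recommended_npus(model_name)
    if available_npus < needed:
        raise ValueError(
            f"{model_name} recommends {needed} NPUs, but only {available_npus} are available"
        )
    # Assumption: the NPU count is passed to the library as a tensor-parallel size.
    return {"export": True, "rbln_tensor_parallel_size": needed}
```

The returned dictionary is meant to be unpacked into a `from_pretrained` call; treat the keyword name as something to verify against the package's own reference before relying on it.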
Diffusers
Note
Models marked with a superscript † are too large to fit on a single ATOM, so their modules must be distributed across multiple ATOMs. See each model's example code for the specific placement of its modules.
| Model | Model Architecture | Task |
| :---: | :-----------------------------: | :-----: |
| Stable Diffusion | StableDiffusionPipeline, StableDiffusionImg2ImgPipeline, StableDiffusionInpaintPipeline | |
| Stable Diffusion + LoRA | | |
| Stable Diffusion V3† | StableDiffusion3Pipeline, StableDiffusion3Img2ImgPipeline, StableDiffusion3InpaintPipeline | |
| Stable Diffusion XL | StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, StableDiffusionXLInpaintPipeline | |
| Stable Diffusion XL + multi-LoRA | StableDiffusionXLPipeline | |
| SDXL-turbo | StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline | |
| Stable Diffusion + ControlNet | StableDiffusionControlNetPipeline, StableDiffusionControlNetImg2ImgPipeline | |
| Stable Diffusion XL + ControlNet | StableDiffusionXLControlNetPipeline, StableDiffusionXLControlNetImg2ImgPipeline | |
| Kandinsky V2.2 | KandinskyV22InpaintCombinedPipeline, KandinskyV22PriorPipeline, KandinskyV22InpaintPipeline | |