
Optimum RBLN

Optimum RBLN is a library that bridges Hugging Face transformers and diffusers models to RBLN NPUs: ATOM™ (RBLN-CA02), ATOM™+ (RBLN-CA12, RBLN-CA22), and ATOM™-Max (RBLN-CA25). It lets you easily compile and run inference for the various downstream tasks of Hugging Face models on a single NPU or on multiple NPUs (Rebellions Scalable Design). The tables below list the models currently supported by Optimum RBLN.
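As a quick illustration of the compile-then-infer workflow, the sketch below runs GPT2 (listed in the single-NPU table further down) through Optimum RBLN. It assumes the library's usual `RBLN<Architecture>` class naming (here `RBLNGPT2LMHeadModel`) and the `export=True` compilation flag; treat the exact class and argument names as assumptions and refer to the official model examples for the authoritative code.

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNGPT2LMHeadModel  # class name assumed from the RBLN<Architecture> convention

# Compile the Hugging Face checkpoint for the RBLN NPU (export=True triggers compilation)
# and save the compiled artifacts for later reuse.
model = RBLNGPT2LMHeadModel.from_pretrained("gpt2", export=True)
model.save_pretrained("gpt2-rbln")

# Run inference with the familiar transformers generate() API.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Rebellions ATOM is", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```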

Transformers

Single NPU

| Model | Model Architecture | Task |
|---|---|---|
| Phi-2 | PhiForCausalLM | Text Generation |
| Gemma-2b | GemmaForCausalLM | Text Generation |
| OPT-2.7b | OPTForCausalLM | Text Generation |
| GPT2 | GPT2LMHeadModel | Text Generation |
| GPT2-medium | GPT2LMHeadModel | Text Generation |
| GPT2-large | GPT2LMHeadModel | Text Generation |
| GPT2-xl | GPT2LMHeadModel | Text Generation |
| T5-small | T5ForConditionalGeneration | Text Generation |
| T5-base | T5ForConditionalGeneration | Text Generation |
| T5-large | T5ForConditionalGeneration | Text Generation |
| T5-3b | T5ForConditionalGeneration | Text Generation |
| BART-base | BartForConditionalGeneration | Text Generation |
| BART-large | BartForConditionalGeneration | Text Generation |
| KoBART-base | BartForConditionalGeneration | Text Generation |
| E5-base-4K | BertModel | Sentence Similarity |
| LaBSE | BertModel | Sentence Similarity |
| KR-SBERT-V40K-klueNLI-augSTS | BertModel | Sentence Similarity |
| BERT-base | BertForMaskedLM, BertForQuestionAnswering | Masked Language Modeling, Question Answering |
| BERT-large | BertForMaskedLM, BertForQuestionAnswering | Masked Language Modeling, Question Answering |
| DistilBERT-base | DistilBertForQuestionAnswering | Question Answering |
| SecureBERT | RobertaForMaskedLM | Masked Language Modeling |
| RoBERTa | RobertaForSequenceClassification | Text Classification |
| BGE-Small-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-Base-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-Large-EN-v1.5 | RBLNBertModel | Sentence Similarity |
| BGE-M3 | XLMRobertaModel | Sentence Similarity |
| BGE-Reranker-V2-M3 | XLMRobertaForSequenceClassification | Sentence Similarity |
| BGE-Reranker-Base | XLMRobertaForSequenceClassification | Sentence Similarity |
| BGE-Reranker-Large | XLMRobertaForSequenceClassification | Sentence Similarity |
| Ko-Reranker | XLMRobertaForSequenceClassification | Sentence Similarity |
| Time-Series-Transformer | TimeSeriesTransformerForPrediction | Time-series Forecasting |
| BLIP2-2.7b | RBLNBlip2ForConditionalGeneration | Image Captioning |
| Whisper-tiny | WhisperForConditionalGeneration | Speech to Text |
| Whisper-base | WhisperForConditionalGeneration | Speech to Text |
| Whisper-small | WhisperForConditionalGeneration | Speech to Text |
| Whisper-medium | WhisperForConditionalGeneration | Speech to Text |
| Whisper-large-v3 | WhisperForConditionalGeneration | Speech to Text |
| Whisper-large-v3-turbo | WhisperForConditionalGeneration | Speech to Text |
| Wav2Vec2 | Wav2Vec2ForCTC | Speech to Text |
| Audio-Spectrogram-Transformer | ASTForAudioClassification | Audio Classification |
| DPT-large | DPTForDepthEstimation | Monocular Depth Estimation |
| ViT-large | ViTForImageClassification | Image Classification |
| ResNet50 | ResNetForImageClassification | Image Classification |

Multi-NPU (RSD)

Note

Multi-NPU support is available only on ATOM™+ (RBLN-CA12, RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check which type of NPU you are using with the rbln-stat command.

| Model | Model Architecture | Recommended # of NPUs | Task |
|---|---|---|---|
| DeepSeek-R1-Distill-Llama-8b | LlamaForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Llama-70b | LlamaForCausalLM | 16 | Text Generation |
| DeepSeek-R1-Distill-Qwen-1.5b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-7b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-14b | Qwen2ForCausalLM | 8 | Text Generation |
| DeepSeek-R1-Distill-Qwen-32b | Qwen2ForCausalLM | 16 | Text Generation |
| Llama3.3-70b | LlamaForCausalLM | 16 | Text Generation |
| Llama3.2-3b | LlamaForCausalLM | 8 | Text Generation |
| Llama3.1-70b | LlamaForCausalLM | 16 | Text Generation |
| Llama3.1-8b | LlamaForCausalLM | 8 | Text Generation |
| Llama3-8b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama3-8b + LoRA | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama2-7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Llama2-13b | LlamaForCausalLM | 4 or 8 | Text Generation |
| Gemma-7b | GemmaForCausalLM | 4 or 8 | Text Generation |
| Mistral-7b | MistralForCausalLM | 4 or 8 | Text Generation |
| A.X-4.0-Light | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2-7b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2.5-7b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Qwen2.5-14b | Qwen2ForCausalLM | 4 or 8 | Text Generation |
| Midm-2.0-Mini | LlamaForCausalLM | 2 or 4 | Text Generation |
| Midm-2.0-Base | LlamaForCausalLM | 4 or 8 | Text Generation |
| Salamandra-7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| KONI-Llama3.1-8b | LlamaForCausalLM | 8 | Text Generation |
| EXAONE-3.0-7.8b | ExaoneForCausalLM | 4 or 8 | Text Generation |
| EXAONE-3.5-2.4b | ExaoneForCausalLM | 4 | Text Generation |
| EXAONE-3.5-7.8b | ExaoneForCausalLM | 4 or 8 | Text Generation |
| EXAONE-3.5-32b | ExaoneForCausalLM | 8 or 16 | Text Generation |
| OPT-6.7b | OPTForCausalLM | 4 | Text Generation |
| SOLAR-10.7b | LlamaForCausalLM | 4 or 8 | Text Generation |
| EEVE-Korean-10.8b | LlamaForCausalLM | 4 or 8 | Text Generation |
| T5-11b | T5ForConditionalGeneration | 2 or 4 | Text Generation |
| T5-Enc-11b | T5EncoderModel | 2 or 4 | Sentence Similarity |
| Gemma3-4b | Gemma3ForConditionalGeneration | 8 | Image Captioning |
| Gemma3-12b | Gemma3ForConditionalGeneration | 8 or 16 | Image Captioning |
| Gemma3-27b | Gemma3ForConditionalGeneration | 16 | Image Captioning |
| Qwen2.5-VL-7b | Qwen2_5_VLForConditionalGeneration | 8 | Image Captioning |
| Idefics3-8B-Llama3 | Idefics3ForConditionalGeneration | 8 | Image Captioning |
| Llava-v1.6-mistral-7b | LlavaNextForConditionalGeneration | 4 or 8 | Image Captioning |
| BLIP2-6.7b | RBLNBlip2ForConditionalGeneration | 4 | Image Captioning |
| ColPali-v1.3 | RBLNColPaliForRetrieval | 4 | Visual Document Retrieval |
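The recommended NPU counts above correspond to a tensor-parallel compilation step. The sketch below shards Llama3-8b across 8 NPUs; the `rbln_*` keyword arguments follow the usual Optimum RBLN convention but are assumptions here, so check each model's example code for the exact configuration.

```python
from optimum.rbln import RBLNLlamaForCausalLM

# Compile Llama3-8b with tensor parallelism across 8 NPUs (Rebellions Scalable Design).
# The rbln_* keyword names are assumptions based on the usual Optimum RBLN convention.
model = RBLNLlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    export=True,
    rbln_batch_size=1,
    rbln_max_seq_len=8192,
    rbln_tensor_parallel_size=8,  # matches the recommended NPU count in the table above
)
model.save_pretrained("llama3-8b-rbln")
```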

Diffusers

Single NPU

Note

Models marked with a superscript are too large to fit on a single ATOM™, so their modules must be split across multiple ATOM™ devices. Refer to each model's example code for details on how the modules are distributed.

| Model | Model Architecture |
|---|---|
| Stable Diffusion | StableDiffusionPipeline, StableDiffusionImg2ImgPipeline, StableDiffusionInpaintPipeline |
| Stable Diffusion + LoRA | StableDiffusionPipeline |
| Stable Diffusion V3 | StableDiffusion3Pipeline, StableDiffusion3Img2ImgPipeline, StableDiffusion3InpaintPipeline |
| Stable Diffusion XL | StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, StableDiffusionXLInpaintPipeline |
| Stable Diffusion XL + multi-LoRA | StableDiffusionXLPipeline |
| SDXL-turbo | StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline |
| Stable Diffusion + ControlNet | StableDiffusionControlNetPipeline, StableDiffusionControlNetImg2ImgPipeline |
| Stable Diffusion XL + ControlNet | StableDiffusionXLControlNetPipeline, StableDiffusionXLControlNetImg2ImgPipeline |
| Kandinsky V2.2 | KandinskyV22PriorPipeline, KandinskyV22Pipeline, KandinskyV22Img2ImgPipeline, KandinskyV22InpaintPipeline, KandinskyV22CombinedPipeline, KandinskyV22Img2ImgCombinedPipeline, KandinskyV22InpaintCombinedPipeline |
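For Diffusers models, the RBLN-prefixed pipeline classes are assumed to keep the standard diffusers call signature. Below is a minimal single-NPU sketch with Stable Diffusion; the `RBLNStableDiffusionPipeline` class name and checkpoint are illustrative assumptions, so consult the model example code for the exact usage.

```python
from optimum.rbln import RBLNStableDiffusionPipeline  # class name assumed from the RBLN<Pipeline> convention

# Compile the Stable Diffusion pipeline (text encoder, UNet, VAE) for the RBLN NPU
# and save the compiled artifacts for reuse.
pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    export=True,
)
pipe.save_pretrained("sd-v1-5-rbln")

# Generate an image with the standard diffusers call signature.
image = pipe(prompt="A watercolor painting of a lighthouse at sunset").images[0]
image.save("lighthouse.png")
```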

Multi-NPU (RSD)

Note

Multi-NPU support is available only on ATOM™+ (RBLN-CA12, RBLN-CA22) and ATOM™-Max (RBLN-CA25). You can check which type of NPU you are using with the rbln-stat command.

Note

Models marked with a superscript have large submodules that must be split across multiple ATOM™ devices. Refer to each model's example code for details on how the submodules are distributed.

| Model | Model Architecture | Recommended # of NPUs | Task |
|---|---|---|---|
| Cosmos-Predict1-7B-Text2World | CosmosTextToWorldPipeline | 4 | Text to Video |
| Cosmos-Predict1-14B-Text2World | CosmosTextToWorldPipeline | 4 | Text to Video |
| Cosmos-Predict1-7B-Video2World | CosmosVideoToWorldPipeline | 4 | Video to Video |
| Cosmos-Predict1-14B-Video2World | CosmosVideoToWorldPipeline | 4 | Video to Video |