Optimum RBLN
Optimum RBLN serves as a bridge connecting the Hugging Face transformers/diffusers libraries to RBLN NPUs, i.e. ATOM (RBLN-CA02) and ATOM+ (RBLN-CA12). It offers a set of tools that enable easy model compilation and inference for both single and multi-NPU (Rebellions Scalable Design) configurations across a range of downstream tasks. The following tables list the models currently supported by Optimum RBLN.
Transformers
Single NPU
Model | Dataset | Task |
---|---|---|
Phi-2 | 250B tokens, combination of NLP synthetic data created by AOAI GPT-3.5 | Text Generation |
Gemma-2b | 6 trillion tokens of web, code, and mathematics text | Text Generation |
GPT2 | WebText | Text Generation |
GPT2-medium | WebText | Text Generation |
GPT2-large | WebText | Text Generation |
GPT2-xl | WebText | Text Generation |
T5-small | Colossal Clean Crawled Corpus | Text Generation |
T5-base | Colossal Clean Crawled Corpus | Text Generation |
T5-large | Colossal Clean Crawled Corpus | Text Generation |
T5-3b | Colossal Clean Crawled Corpus | Text Generation |
BART-base | BookCorpus + etc. | Text Generation |
BART-large | BookCorpus + etc. | Text Generation |
KoBART-base | Korean Wiki | Text Generation |
E5-base-4K | Colossal Clean text Pairs | Embedding Retrieval |
BERT-base | BookCorpus & English Wikipedia, SQuAD v2 | Masked Language Modeling |
BERT-large | BookCorpus & English Wikipedia, SQuAD v2 | Masked Language Modeling |
DistilBERT-base | BookCorpus & English Wikipedia, SQuAD v2 | Question Answering |
SecureBERT | a manually crafted dataset from the human readable descriptions of MITRE ATT&CK techniques and tactics | Masked Language Modeling |
RoBERTa | a manually crafted dataset from the human readable descriptions of MITRE ATT&CK techniques and tactics | Text Classification |
BGE-M3 | MLDR and bge-m3-data | Embedding Retrieval |
BGE-Reranker-V2-M3 | MLDR and bge-m3-data | Embedding Retrieval |
BGE-Reranker-Base | MLDR and bge-m3-data | Embedding Retrieval |
BGE-Reranker-Large | MLDR and bge-m3-data | Embedding Retrieval |
Whisper-tiny | 680k hours of labeled data from the web | Speech to Text |
Whisper-base | 680k hours of labeled data from the web | Speech to Text |
Whisper-small | 680k hours of labeled data from the web | Speech to Text |
Whisper-medium | 680k hours of labeled data from the web | Speech to Text |
Whisper-large-v3 | 680k hours of labeled data from the web | Speech to Text |
Whisper-large-v3-turbo | 680k hours of labeled data from the web | Speech to Text |
Wav2Vec2 | Librispeech | Speech to Text |
Audio-Spectrogram-Transformer | AudioSet | Audio Classification |
DPT-large | MIX 6 | Monocular Depth Estimation |
ViT-large | ImageNet-21k & ImageNet | Image Classification |
ResNet50 | ILSVRC2012 | Image Classification |
Multi-NPU (RSD)
Note
Rebellions Scalable Design (RSD) is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.
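As a rough sketch of how that check might be scripted, the snippet below runs rbln-stat and looks for the RBLN-CA12 product name in whatever it prints. The helper names (`read_rbln_stat`, `supports_rsd`) are hypothetical and not part of any Rebellions tool, and the assumption that the product name appears verbatim in the command's output has not been verified; adjust the matching to what your installation prints.

```python
import shutil
import subprocess
from typing import Optional


def read_rbln_stat() -> Optional[str]:
    """Return the raw text printed by `rbln-stat`, or None if it cannot be run."""
    if shutil.which("rbln-stat") is None:
        return None  # the command is not installed on this machine
    try:
        result = subprocess.run(
            ["rbln-stat"], capture_output=True, text=True, check=False, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout


def supports_rsd(stat_output: str) -> bool:
    """Assumption: the product name (e.g. RBLN-CA12) appears verbatim in the output."""
    return "RBLN-CA12" in stat_output


if __name__ == "__main__":
    output = read_rbln_stat()
    if output is None:
        print("rbln-stat could not be run; cannot determine the NPU type.")
    elif supports_rsd(output):
        print("ATOM+ (RBLN-CA12) detected: RSD is available.")
    else:
        print("No RBLN-CA12 device found in the rbln-stat output.")
```

Keeping the string check separate from the subprocess call makes it possible to adapt the matching logic without touching the code that invokes the command.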
Model | Dataset | Recommended # of NPUs | Task |
---|---|---|---|
Llama3-8b | A new mix of publicly available online data | 4 | Text Generation |
Llama3-8b + LoRA | fingpt-forecaster-dow30-202305-202405 | 4 | Text Generation |
Llama2-7b | A new mix of publicly available online data | 4 | Text Generation |
Llama2-13b | A new mix of publicly available online data | 8 | Text Generation |
Gemma-7b | 6 trillion tokens of web, code, and mathematics text | 4 | Text Generation |
Mistral-7b | Publicly available online data | 4 | Text Generation |
Qwen2-7b | 7T tokens of internal data | 4 | Text Generation |
Qwen2.5-7b | 18T tokens of internal data | 4 | Text Generation |
Qwen2.5-14b | 18T tokens of internal data | 8 | Text Generation |
Salamandra-7b | 2.4T tokens of 35 European languages and 92 programming languages | 4 | Text Generation |
EXAONE-3.0-7.8b | 8T tokens of curated English and Korean data | 4 | Text Generation |
EXAONE-3.5-2.4b | 6.5T tokens of curated English and Korean data | 4 | Text Generation |
EXAONE-3.5-7.8b | 6.5T tokens of curated English and Korean data | 8 | Text Generation |
Mi:dm-7b | AI-HUB/the National Institute of Korean Language | 4 | Text Generation |
SOLAR-10.7b | alpaca-gpt4-data + etc. | 8 | Text Generation |
EEVE-Korean-10.8b | Korean-translated ver. of Open-Orca/SlimOrca-Dedup and argilla/ultrafeedback-binarized-preferences-cleaned | 8 | Text Generation |
Llava-v1.6-mistral-7b | - | 4 | Image Captioning |
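For capacity planning, the recommended NPU counts in the table above can be encoded as a simple lookup. This is only an illustrative sketch: the dictionary copies the values from the table, while the names `RECOMMENDED_NPUS`, `recommended_npus`, and `models_that_fit` are hypothetical helpers and not part of Optimum RBLN.

```python
from typing import Dict, List

# Recommended number of NPUs per model, copied from the Multi-NPU (RSD) table.
RECOMMENDED_NPUS: Dict[str, int] = {
    "Llama3-8b": 4,
    "Llama3-8b + LoRA": 4,
    "Llama2-7b": 4,
    "Llama2-13b": 8,
    "Gemma-7b": 4,
    "Mistral-7b": 4,
    "Qwen2-7b": 4,
    "Qwen2.5-7b": 4,
    "Qwen2.5-14b": 8,
    "Salamandra-7b": 4,
    "EXAONE-3.0-7.8b": 4,
    "EXAONE-3.5-2.4b": 4,
    "EXAONE-3.5-7.8b": 8,
    "Mi:dm-7b": 4,
    "SOLAR-10.7b": 8,
    "EEVE-Korean-10.8b": 8,
    "Llava-v1.6-mistral-7b": 4,
}


def recommended_npus(model_name: str) -> int:
    """Look up the recommended NPU count; raises KeyError for unlisted models."""
    return RECOMMENDED_NPUS[model_name]


def models_that_fit(available_npus: int) -> List[str]:
    """List models whose recommended NPU count does not exceed what is available."""
    return [
        name for name, count in RECOMMENDED_NPUS.items() if count <= available_npus
    ]


if __name__ == "__main__":
    print(recommended_npus("Llama2-13b"))
    print(models_that_fit(4))
```

The counts are recommendations from the table rather than hard limits, so a filter like this is a starting point for sizing a deployment, not a guarantee that a given configuration will compile.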
Diffusers
Note
Models marked with a dagger (†) require more than one ATOM because their weights exceed the capacity of a single ATOM, so the model's modules must be divided across multiple ATOMs. For details on the specific module distribution, please refer to the model code.
Model | Dataset | Task |
---|---|---|
Stable Diffusion | - | |
Stable Diffusion + LoRA | - | Text to Image |
Stable Diffusion V3† | - | |
Stable Diffusion XL | - | |
Stable Diffusion XL + multi-LoRA | - | Text to Image |
SDXL-turbo | - | |
Stable Diffusion + ControlNet | - | |
Stable Diffusion XL + ControlNet | - | |