Model Zoo - PyTorch

The RBLN PyTorch model zoo offers a wide variety of neural network models designed to run on the RBLN NPU. The number of models it covers will continue to expand as the RBLN SDK is updated. You can find the full list of models in the RBLN Model Zoo GitHub repository.

Supported models

The models currently covered by the RBLN PyTorch model zoo are listed below.

Model | Dataset | Task
--- | --- | ---
Stable Diffusion v1.5 | LAION-2B |
Stable Diffusion XL | - |
SDXL-turbo | - |
ControlNet | - |
Mi:dm-7b | AI-HUB/the National Institute of Korean Language | Text Generation
Llama3-8b | A new mix of publicly available online data | Text Generation
Llama2-7b | A new mix of publicly available online data | Text Generation
Llama2-13b | A new mix of publicly available online data | Text Generation
GPT2 | WebText | Text Generation
GPT2-medium | WebText | Text Generation
GPT2-large | WebText | Text Generation
GPT2-xl | WebText | Text Generation
SOLAR-10.7b | alpaca-gpt4-data + etc. | Text Generation
EEVE-Korean-10.8b | Korean-translated ver. of Open-Orca/SlimOrca-Dedup and argilla/ultrafeedback-binarized-preferences-cleaned | Text Generation
Mistral-7b | Publicly available online data | Text Generation
Phi-2 | Publicly available online data | Text Generation
T5-small | Colossal Clean Crawled Corpus | Text Generation
T5-base | Colossal Clean Crawled Corpus | Text Generation
T5-large | Colossal Clean Crawled Corpus | Text Generation
T5-3b | Colossal Clean Crawled Corpus | Text Generation
BART-base | BookCorpus + etc. | Text Generation
BART-large | BookCorpus + etc. | Text Generation
BERT-base | BookCorpus & English Wikipedia; SQuAD v2 | Masked Language Modeling; Question Answering
BERT-large | BookCorpus & English Wikipedia; SQuAD v2 | Masked Language Modeling; Question Answering
BLIP2-2.7b | LAION | Image Captioning
Whisper-tiny | 680k hours of labeled data from the web | Speech to Text
Whisper-base | 680k hours of labeled data from the web | Speech to Text
Whisper-small | 680k hours of labeled data from the web | Speech to Text
Whisper-medium | 680k hours of labeled data from the web | Speech to Text
Whisper-large | 680k hours of labeled data from the web | Speech to Text
Wav2Vec2 | Librispeech | Speech to Text
ConvTasNet | WSJ | Speech Separation
Audio-Spectrogram-Transformer | AudioSet | Audio Classification
Time-Series-Transformer | tourism-monthly dataset | Time-series Forecasting
DeepLabV3_ResNet50 | ILSVRC2012 | Semantic Segmentation
DeepLabV3_ResNet101 | ILSVRC2012 | Semantic Segmentation
DeepLabV3_MobileNetV3_Large | ILSVRC2012 | Semantic Segmentation
FCN_ResNet50 | ILSVRC2012 | Semantic Segmentation
FCN_ResNet101 | ILSVRC2012 | Semantic Segmentation
UNet | Carvana | Semantic Segmentation
DeiT-tiny | ILSVRC2012 | Image Classification
DeiT-tiny distilled | ILSVRC2012 | Image Classification
DeiT-small | ILSVRC2012 | Image Classification
DeiT-small distilled | ILSVRC2012 | Image Classification
DeiT-base | ILSVRC2012 | Image Classification
DeiT-base distilled | ILSVRC2012 | Image Classification
DeiT-base 384 | ILSVRC2012 | Image Classification
DeiT-base distilled 384 | ILSVRC2012 | Image Classification
R3D_18 | KINETICS400_V1 | Video Classification
MC3_18 | KINETICS400_V1 | Video Classification
R(2+1)D_18 | KINETICS400_V1 | Video Classification
S3D | KINETICS400_V1 | Video Classification
YOLOv3-tiny | COCO | Object Detection
YOLOv3 | COCO | Object Detection
YOLOv3-spp | COCO | Object Detection
YOLOv4 | COCO | Object Detection
YOLOv4-csp-s-mish | COCO | Object Detection
YOLOv4-csp-x-mish | COCO | Object Detection
YOLOv5n | COCO | Object Detection
YOLOv5s | COCO | Object Detection
YOLOv5m | COCO | Object Detection
YOLOv5l | COCO | Object Detection
YOLOv5x | COCO | Object Detection
YOLOv6s | COCO | Object Detection
YOLOv6n | COCO | Object Detection
YOLOv6m | COCO | Object Detection
YOLOv6l | COCO | Object Detection
YOLOv7-tiny | COCO | Object Detection
YOLOv7 | COCO | Object Detection
YOLOv7x | COCO | Object Detection
YOLOv8n | COCO | Object Detection
YOLOv8s | COCO | Object Detection
YOLOv8m | COCO | Object Detection
YOLOv8l | COCO | Object Detection
YOLOv8x | COCO | Object Detection
YOLOX-nano | COCO | Object Detection
YOLOX-tiny | COCO | Object Detection
YOLOX-s | COCO | Object Detection
YOLOX-m | COCO | Object Detection
YOLOX-l | COCO | Object Detection
YOLOX-x | COCO | Object Detection
YOLOX-darknet53 | COCO | Object Detection
3DDFA_V2 | 300W-LP | Face Alignment
ConvNeXtTiny | ILSVRC2012 | Image Classification
ConvNeXtSmall | ILSVRC2012 | Image Classification
ConvNeXtBase | ILSVRC2012 | Image Classification
ConvNeXtLarge | ILSVRC2012 | Image Classification
EfficientNetB0 | ILSVRC2012 | Image Classification
EfficientNetB1 | ILSVRC2012 | Image Classification
EfficientNetB2 | ILSVRC2012 | Image Classification
EfficientNetB3 | ILSVRC2012 | Image Classification
EfficientNetB4 | ILSVRC2012 | Image Classification
EfficientNetB5 | ILSVRC2012 | Image Classification
EfficientNetB6 | ILSVRC2012 | Image Classification
EfficientNetB7 | ILSVRC2012 | Image Classification
EfficientNet_V2_S | ILSVRC2012 | Image Classification
EfficientNet_V2_M | ILSVRC2012 | Image Classification
EfficientNet_V2_L | ILSVRC2012 | Image Classification
Wide_ResNet50_2 | ILSVRC2012 | Image Classification
Wide_ResNet101_2 | ILSVRC2012 | Image Classification
MNASNet0_5 | ILSVRC2012 | Image Classification
MNASNet0_75 | ILSVRC2012 | Image Classification
MNASNet1_0 | ILSVRC2012 | Image Classification
MNASNet1_3 | ILSVRC2012 | Image Classification
MobileNet_V2 | ILSVRC2012 | Image Classification
MobileNet_V3_Small | ILSVRC2012 | Image Classification
MobileNet_V3_Large | ILSVRC2012 | Image Classification
ResNet18 | ILSVRC2012 | Image Classification
ResNet34 | ILSVRC2012 | Image Classification
ResNet50 | ILSVRC2012 | Image Classification
ResNet101 | ILSVRC2012 | Image Classification
ResNet152 | ILSVRC2012 | Image Classification
ResNet101V2 | ILSVRC2012 | Image Classification
ResNet152V2 | ILSVRC2012 | Image Classification
VGG11 | ILSVRC2012 | Image Classification
VGG11_BN | ILSVRC2012 | Image Classification
VGG13 | ILSVRC2012 | Image Classification
VGG13_BN | ILSVRC2012 | Image Classification
VGG16 | ILSVRC2012 | Image Classification
VGG16_BN | ILSVRC2012 | Image Classification
VGG19 | ILSVRC2012 | Image Classification
VGG19_BN | ILSVRC2012 | Image Classification
SqueezeNet1_0 | ILSVRC2012 | Image Classification
SqueezeNet1_1 | ILSVRC2012 | Image Classification
ShuffleNet_V2_X0_5 | ILSVRC2012 | Image Classification
ShuffleNet_V2_X1_0 | ILSVRC2012 | Image Classification
ShuffleNet_V2_X1_5 | ILSVRC2012 | Image Classification
ShuffleNet_V2_X2_0 | ILSVRC2012 | Image Classification
DenseNet121 | ILSVRC2012 | Image Classification
DenseNet161 | ILSVRC2012 | Image Classification
DenseNet169 | ILSVRC2012 | Image Classification
DenseNet201 | ILSVRC2012 | Image Classification
RegNet_X_400MF | ILSVRC2012 | Image Classification
RegNet_X_800MF | ILSVRC2012 | Image Classification
RegNet_X_1_6GF | ILSVRC2012 | Image Classification
RegNet_X_3_2GF | ILSVRC2012 | Image Classification
RegNet_X_8GF | ILSVRC2012 | Image Classification
RegNet_X_16GF | ILSVRC2012 | Image Classification
RegNet_X_32GF | ILSVRC2012 | Image Classification
RegNet_Y_400MF | ILSVRC2012 | Image Classification
RegNet_Y_800MF | ILSVRC2012 | Image Classification
RegNet_Y_1_6GF | ILSVRC2012 | Image Classification
RegNet_Y_3_2GF | ILSVRC2012 | Image Classification
RegNet_Y_8GF | ILSVRC2012 | Image Classification
RegNet_Y_16GF | ILSVRC2012 | Image Classification
RegNet_Y_32GF | ILSVRC2012 | Image Classification
RegNet_Y_128GF | ILSVRC2012 | Image Classification
ResNeXt50_32x4D | ILSVRC2012 | Image Classification
ResNeXt101_32x8D | ILSVRC2012 | Image Classification
ResNeXt101_64x4D | ILSVRC2012 | Image Classification
AlexNet | ILSVRC2012 | Image Classification
GoogLeNet | ILSVRC2012 | Image Classification
Inception_V3 | ILSVRC2012 | Image Classification

How to run

The commands for running the models in the RBLN PyTorch model zoo are summarized below.

Image Classification (TorchVision)

Here is the run command for TorchVision ResNet18:

$ cd rbln_model_zoo/pytorch/vision/image_classification/torchvision
$ python3 compile.py --model_name resnet18    # Model compile
$ python3 inference.py --model_name resnet18  # Inference

You can try any of the models listed below by replacing the --model_name value in the above command. The models currently supported by the model zoo are:

efficientnet_b0, efficientnet_b1, efficientnet_b2, efficientnet_b3,
efficientnet_b4, efficientnet_b5, efficientnet_b6, efficientnet_b7,
efficientnet_v2_s, efficientnet_v2_m, efficientnet_v2_l,
wide_resnet50_2, wide_resnet101_2,
mnasnet0_5, mnasnet0_75, mnasnet1_0, mnasnet1_3,
mobilenet_v2, mobilenet_v3_large, mobilenet_v3_small,
resnet18, resnet34, resnet50, resnet101, resnet152,
vgg11, vgg13, vgg16, vgg19, vgg11_bn, vgg13_bn, vgg16_bn, vgg19_bn,
squeezenet1_0, squeezenet1_1,
shufflenet_v2_x0_5, shufflenet_v2_x1_0,
shufflenet_v2_x1_5, shufflenet_v2_x2_0,
densenet121, densenet169, densenet161, densenet201,
convnext_tiny, convnext_small, convnext_base, convnext_large,
regnet_y_400mf, regnet_y_800mf, regnet_y_1_6gf, regnet_y_3_2gf,
regnet_y_8gf, regnet_y_16gf, regnet_y_32gf, regnet_y_128gf,
regnet_x_400mf, regnet_x_800mf, regnet_x_1_6gf, regnet_x_3_2gf,
regnet_x_8gf, regnet_x_16gf, regnet_x_32gf,
resnext50_32x4d, resnext101_32x8d, resnext101_64x4d,
alexnet, googlenet, inception_v3
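
To compile and run several of the models above in one go, the two commands can be wrapped in a loop. The following is a minimal sketch, not part of the model zoo itself: the function name run_models and the RUN variable are hypothetical helpers introduced here. RUN defaults to echo, so the script only prints the commands it would run; setting RUN to an empty string inside the torchvision example directory would execute them instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the model zoo): run the compile and
# inference steps for each model name given as an argument.
# RUN defaults to "echo" (dry run: commands are printed, not executed).
# Set RUN="" to actually execute them from the example directory.
RUN="${RUN-echo}"

run_models() {
    local name
    for name in "$@"; do
        $RUN python3 compile.py --model_name "$name" || return 1    # Model compile
        $RUN python3 inference.py --model_name "$name" || return 1  # Inference
    done
}

# Dry run for three of the model names listed above.
run_models resnet18 mobilenet_v2 efficientnet_b0
```

Stopping at the first failure (the `|| return 1` parts) keeps a failed compile from being followed by an inference run on a missing artifact.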

Video Classification (TorchVision)

Here is the run command for TorchVision Video ResNet model r3d_18:

$ cd rbln_model_zoo/pytorch/vision/video_classification/torchvision
$ python3 compile.py --model_name r3d_18    # Model compile
$ python3 inference.py --model_name r3d_18  # Inference

You can test the other video classification models, VideoResNet and VideoS3D, using the names below:

r3d_18, mc3_18, r2plus1d_18, s3d

Object Detection

Here is the run command for the object detection model YOLOX, a variant of the YOLO series:

$ cd rbln_model_zoo/pytorch/vision/detection/yolox
$ python3 compile.py --model_name yolox_l    # Model compile
$ python3 inference.py --model_name yolox_l  # Inference

The available YOLOX variants are listed below (yolov3 here appears to correspond to the YOLOX-darknet53 entry in the table above):

yolox_nano, yolox_tiny, yolox_s, yolox_m, yolox_l, yolox_x, yolov3

Other models in the YOLO series are also supported:

YOLOv3:

$ cd rbln_model_zoo/pytorch/vision/detection/yolov3
$ python3 compile.py --model_name yolov3-spp    # Model compile
$ python3 inference.py --model_name yolov3-spp  # Inference

The available YOLOv3 variants are:

yolov3-tiny, yolov3, yolov3-spp

YOLOv4:

$ cd rbln_model_zoo/pytorch/vision/detection/yolov4
$ python3 compile.py --model_name yolov4    # Model compile
$ python3 inference.py --model_name yolov4  # Inference

The available YOLOv4 variants are:

yolov4, yolov4-csp-s-mish, yolov4-csp-x-mish

YOLOv5:

$ cd rbln_model_zoo/pytorch/vision/detection/yolov5
$ python3 compile.py --model_name yolov5x    # Model compile
$ python3 inference.py --model_name yolov5x  # Inference

The available YOLOv5 variants are:

yolov5n, yolov5s, yolov5m, yolov5l, yolov5x

YOLOv6:

$ cd rbln_model_zoo/pytorch/vision/detection/yolov6
$ python3 compile.py --model_name yolov6l    # Model compile
$ python3 inference.py --model_name yolov6l  # Inference

The available YOLOv6 variants are:

yolov6s, yolov6n, yolov6m, yolov6l

YOLOv7:

$ cd rbln_model_zoo/pytorch/vision/detection/yolov7
$ python3 compile.py --model_name yolov7    # Model compile
$ python3 inference.py --model_name yolov7  # Inference

The available YOLOv7 variants are:

yolov7-tiny, yolov7, yolov7x

YOLOv8:

$ cd rbln_model_zoo/pytorch/vision/detection/yolov8
$ python3 compile.py --model_name yolov8x    # Model compile
$ python3 inference.py --model_name yolov8x  # Inference

The available YOLOv8 variants are:

yolov8s, yolov8n, yolov8m, yolov8l, yolov8x
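
Each YOLO family lives in its own directory under rbln_model_zoo/pytorch/vision/detection, so the directory has to match the model name. The sketch below shows one way to derive the directory from the name. The function detection_dir is a hypothetical helper, not something the model zoo provides, and it assumes the directory layout shown in the commands above. Note that the bare name yolov3 is mapped to the yolov3 directory here, even though the YOLOX example also lists a yolov3 variant.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the model zoo): print the example
# directory for a detection model name, based on its family prefix.
detection_dir() {
    local family
    case "$1" in
        yolox*)  family=yolox ;;
        yolov3*) family=yolov3 ;;   # bare "yolov3" resolves here, not to yolox
        yolov4*) family=yolov4 ;;
        yolov5*) family=yolov5 ;;
        yolov6*) family=yolov6 ;;
        yolov7*) family=yolov7 ;;
        yolov8*) family=yolov8 ;;
        *) echo "unknown detection model: $1" >&2; return 1 ;;
    esac
    echo "rbln_model_zoo/pytorch/vision/detection/$family"
}

detection_dir yolov7-tiny
detection_dir yolov8x
```

A possible use is `cd "$(detection_dir yolov5m)"` before running compile.py and inference.py with the same model name.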

Segmentation

Segmentation is a widely used computer vision task. Here is the run command for the segmentation model DeepLabV3:

$ cd rbln_model_zoo/pytorch/vision/segmentation/torchvision
$ python3 compile.py --model_name deeplabv3_mobilenetv3_large     # Model compile
$ python3 inference.py --model_name deeplabv3_mobilenetv3_large   # Inference

The TorchVision segmentation models are listed below. You can try any of them with the above command by replacing the --model_name value:

deeplabv3_mobilenetv3_large, deeplabv3_resnet101, deeplabv3_resnet50,
fcn_resnet101, fcn_resnet50

Here is another example, semantic segmentation based on UNet:

$ cd rbln_model_zoo/pytorch/vision/segmentation/unet
$ python3 compile.py     # Model compile
$ python3 inference.py   # Inference

Language (Transformer)

You can run a BERT-large based question-answering task trained on the SQuAD v2.0 dataset with the following command:

$ cd rbln_model_zoo/pytorch/nlp/bert/qa
$ python3 compile.py --model_name large    # Model compile
$ python3 inference.py --model_name large  # Inference

Another downstream task is masked language modeling, trained on BookCorpus and English Wikipedia:

$ cd rbln_model_zoo/pytorch/nlp/bert/mlm
$ python3 compile.py --model_name large    # Model compile
$ python3 inference.py --model_name large  # Inference

Here are the variations of the BERT model currently provided by the RBLN model zoo:

base, large
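
The two BERT tasks (qa and mlm) and the two sizes (base and large) give four combinations. The following sketch walks through all of them. It is only an illustration: run_bert_examples and the RUN variable are hypothetical names introduced here, and the directories are taken from the commands above. As written it is a dry run that prints each command; with RUN set to an empty string it would change into each directory and execute the scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the model zoo): iterate over every
# BERT task/size combination described above.
# RUN defaults to "echo" (dry run); set RUN="" to execute the commands.
RUN="${RUN-echo}"

run_bert_examples() {
    local task size
    for task in qa mlm; do
        for size in base large; do
            # The subshell keeps each "cd" from affecting the next iteration.
            (
                $RUN cd "rbln_model_zoo/pytorch/nlp/bert/$task" &&
                $RUN python3 compile.py --model_name "$size" &&    # Model compile
                $RUN python3 inference.py --model_name "$size"     # Inference
            ) || return 1
        done
    done
}

run_bert_examples
```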

Vision Transformer

Here is a vision transformer-based classification example, the Data-efficient Image Transformer (DeiT):

$ cd rbln_model_zoo/pytorch/vision/classification/deit
$ python3 compile.py --model_name base_distilled    # Model compile
$ python3 inference.py --model_name base_distilled  # Inference

The size variations of the DeiT model are:

tiny, tiny_distilled, small, small_distilled,
base, base_distilled, base_384, base_distilled_384

Face Alignment

The RBLN PyTorch model zoo also offers a face alignment task based on 3DDFA:

$ cd rbln_model_zoo/pytorch/vision/face_alignment/3ddfa
$ python3 compile.py    # Model compile
$ python3 inference.py  # Inference