TorchServe

TorchServe is an open-source model-serving framework optimized for deploying PyTorch models in production environments. You can use the RBLN SDK through a custom TorchServe handler, which enables AI model serving on Rebellions' NPUs.
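To make the idea of a custom handler concrete, here is a minimal sketch of the shape such a handler usually takes. The class name `NpuHandlerSketch` and the `load_runtime` method are hypothetical, and the loader is a stand-in that sums its inputs; in a real handler it would be replaced by the RBLN SDK call that loads a compiled model. The `ImportError` fallback is only there so the sketch can run without TorchServe installed.

```python
import json
import os

try:
    # Real base class when TorchServe is installed.
    from ts.torch_handler.base_handler import BaseHandler
except ImportError:
    # Fallback so the sketch can be read and run without TorchServe.
    BaseHandler = object


class NpuHandlerSketch(BaseHandler):
    """Hypothetical handler; the runtime loader below is a placeholder."""

    def load_runtime(self, model_path):
        # Placeholder (assumption): replace with the RBLN SDK call that loads
        # a compiled model. Here it returns a callable that sums each input.
        return lambda batch: [sum(row) for row in batch]

    def initialize(self, context):
        # TorchServe passes the model directory and manifest via the context.
        model_dir = context.system_properties.get("model_dir")
        serialized_file = context.manifest["model"]["serializedFile"]
        self.runtime = self.load_runtime(os.path.join(model_dir, serialized_file))
        self.initialized = True

    def preprocess(self, data):
        # Each request carries its payload under "data" or "body".
        batch = []
        for row in data:
            payload = row.get("data") or row.get("body")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            if isinstance(payload, str):
                payload = json.loads(payload)
            batch.append(payload)
        return batch

    def inference(self, data):
        # Run the whole batch through the loaded runtime.
        return self.runtime(data)

    def postprocess(self, data):
        # Return exactly one response per request in the batch.
        return [{"result": value} for value in data]

    def handle(self, data, context):
        return self.postprocess(self.inference(self.preprocess(data)))
```

The tutorials listed at the end of this page show the actual handlers used with the RBLN SDK; the sketch above only illustrates the initialize, preprocess, inference, and postprocess stages that a handler implements.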

Getting Started

TorchServe Installation

The TorchServe GitHub repository provides an installation script that resolves the dependencies needed to set up the environment. After running the script, install the TorchServe packages with pip.

$ git clone https://github.com/pytorch/serve.git
$ cd serve
$ python3 ./ts_scripts/install_dependencies.py
$ pip3 install torchserve torch-model-archiver torch-workflow-archiver

For more details, please refer to the official TorchServe documentation.
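As a quick sanity check after installation, the sketch below reports which of the three command-line tools installed by the pip packages are not on `PATH`. The helper name `missing_commands` is hypothetical, and the check only confirms that the executables can be found, not that they run correctly.

```python
import shutil

# Command-line entry points installed by the three pip packages.
TORCHSERVE_COMMANDS = (
    "torchserve",
    "torch-model-archiver",
    "torch-workflow-archiver",
)


def missing_commands(commands=TORCHSERVE_COMMANDS):
    """Return the commands that cannot be found on PATH."""
    return [name for name in commands if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All TorchServe commands found.")
```

If a command is reported missing, the usual cause is that pip installed it into a different Python environment than the one currently active.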

Tutorial

The following tutorials demonstrate how to use TorchServe with the RBLN SDK:

Resnet50 explains how to serve an image classification model with TorchServe
YOLOv8 demonstrates how to serve an object detection model with TorchServe
Llama3-8B explains how to serve an LLM using vLLM-enabled TorchServe
Llama3.1-8B with Flash Attention explains how to enable Flash Attention with TorchServe