RBLN SDK USER GUIDE

RBLN SDK is a software stack that simplifies the process of running deep learning workloads on Rebellions' neural processing units, RBLN NPUs. RBLN SDK is designed to help you save time on deployment and maximize the performance of your serving models. Users can take advantage of the power and performance efficiency of RBLN NPU with minimal manual optimization and tuning.

RBLN SDK includes the following proprietary components to enable seamless model deployment in production environments:

  • Driver
  • Compiler
  • Runtime
  • Profiler
  • Serving Frameworks

RBLN SDK supports pre-trained models from TensorFlow, PyTorch, and HuggingFace, making it easy to transition to an RBLN NPU-based serving environment.

Our goal is to simplify the deployment process as much as possible, enabling users to focus on developing and refining their models and fully leverage the benefits of RBLN NPU. Serving a deep learning model on RBLN NPUs with the RBLN SDK consists of the following steps:

  1. Prepare a Pre-Trained Model
  2. Compile the Model using RBLN Compiler
  3. Perform Inference using RBLN Runtime APIs (Python & C/C++)
  4. Analyze & Optimize with RBLN Profiler
  5. Deploy for Production with vLLM, Triton Inference Server, or TorchServe
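
The five steps above can be sketched as a small, hardware-free Python example. This is only an illustration of the shape of the workflow (compile once to an artifact, load the artifact in a runtime, measure, then serve). Every name in it (prepare_model, compile_model, Runtime, profile, serve) is a hypothetical stand-in written for this sketch and is not an RBLN SDK API; the "model" is a plain weight vector and the "compiled artifact" is a JSON file.

```python
import json
import tempfile
import time
from pathlib import Path

# NOTE: all names below are hypothetical stand-ins, not RBLN SDK APIs.


def prepare_model():
    # Step 1: stand-in for a pre-trained model (a fixed weight vector).
    return {"weights": [0.5, -1.0, 2.0]}


def compile_model(model, artifact_path):
    # Step 2: stand-in for compilation. The model is written to a file
    # that the runtime loads later, so compilation happens only once.
    artifact_path.write_text(json.dumps(model))
    return artifact_path


class Runtime:
    # Step 3: load the compiled artifact once, then run many inferences.
    def __init__(self, artifact_path):
        self.weights = json.loads(artifact_path.read_text())["weights"]

    def run(self, inputs):
        # Toy "inference": dot product of weights and inputs.
        return sum(w * x for w, x in zip(self.weights, inputs))


def profile(runtime, inputs, repeats=100):
    # Step 4: stand-in for profiling. Returns mean seconds per inference.
    start = time.perf_counter()
    for _ in range(repeats):
        runtime.run(inputs)
    return (time.perf_counter() - start) / repeats


def serve(runtime, requests):
    # Step 5: stand-in for a serving framework handling a batch of requests.
    return [runtime.run(request) for request in requests]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        artifact = compile_model(prepare_model(), Path(tmp) / "model.json")
        runtime = Runtime(artifact)
        print("inference:", runtime.run([1.0, 2.0, 3.0]))
        print("mean latency (s):", profile(runtime, [1.0, 2.0, 3.0]))
        print("served:", serve(runtime, [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]))
```

The point of the sketch is the separation of stages: the compiled artifact is produced once and reused, so profiling and serving both operate on the same loaded runtime rather than recompiling the model.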

This structured workflow simplifies deployment, enabling users to focus on refining their models while fully leveraging RBLN NPU benefits.

Get Started

For quick access to the RBLN SDK user guide, refer to the following links. If you have any questions or feedback, please contact us. We are always here to help!