Release Notes
2024.12.27
SDK Version | Driver | Compiler | Optimum RBLN | vLLM RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|---|
2024.12.27.0 | v1.1.67 | v0.6.2 | v0.1.15 | v0.1.3 | v0.5.5 | v0.4.0 |
RBLN Compiler:
- Added the `get_total_device_alloc()` method to the `RBLNCompiledModel` class, enabling efficient retrieval of the total device memory allocation (in bytes) used by the compiled graph across all NPUs
- Fixed a bug in the `split` operation handling logic
- Added exception handling for previously uncovered edge cases in the error handling logic
Optimum RBLN:
- Added functions:
    - RBLNStableDiffusionInpaintPipeline()
    - RBLNStableDiffusionXLInpaintPipeline()
    - RBLNStableDiffusionXLControlNetPipeline()
    - RBLNStableDiffusionXLControlNetImg2ImgPipeline()
    - RBLNStableDiffusion3Pipeline()
    - RBLNStableDiffusion3Img2ImgPipeline()
    - RBLNStableDiffusion3InpaintPipeline()
- Removed the dependency on optimum
    - This change eliminated the automatic installation of GPU-related dependencies, resulting in a significantly faster installation process
vLLM RBLN:
- Updated to sync with vllm v0.6.5
- Updated to support EXAONE v3.5 models
RBLN Model Zoo:
- Added new models
    - HuggingFace
        - EXAONE-3.5-2.4b
        - EXAONE-3.5-7.8b
        - Stable Diffusion v3
            - Text to image
            - Image to image
            - Inpainting
        - Stable Diffusion
            - Inpainting
        - Stable Diffusion XL
            - Inpainting
            - Text to image + ControlNet
            - Image to image + ControlNet
    - PyTorch
        - YOLOv5-Face
- Improved the formatting of all model code for better readability and maintainability
- Added new examples demonstrating how to use vLLM's native APIs with a wider range of model architectures:
    - Decoder-only (Llama3)
    - Encoder-decoder (BART)
    - Multi-modal (Llava-next)
- Added supplementary guides for model serving:
    - Tutorial > Advanced > LLM Serving > LLM Serving with Continuous Batching: RBLN Model Zoo Link
    - Software > Model Serving > Nvidia Triton Inference Server: RBLN Model Zoo Link
2024.11.27
SDK Version | Driver | Compiler | Optimum RBLN | vLLM RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|---|
2024.11.27.0 | v1.1.67 | v0.6.1 | v0.1.13 | v0.1.2 | v0.5.4 | v0.4.0 |
Note
Deprecation Notice: Python 3.8 Support
As part of our commitment to maintaining compatibility with supported and secure versions of Python, we are officially deprecating support for Python 3.8. This version will no longer be included in future releases, and users are encouraged to upgrade to a more recent Python version to ensure continued support and compatibility.
New Additions: Python 3.11 and Python 3.12
We are pleased to announce that Python 3.11 and Python 3.12 are now fully supported and included in our release package.
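As a quick illustration of the notice above, the following sketch classifies a Python version against the changes announced in this release. The helper name `python_support_status` and its return labels are hypothetical and are not part of the RBLN SDK; versions that this note does not mention are reported as such rather than guessed.

```python
import sys

# Versions named in this release note (taken from the notice above).
DEPRECATED = {(3, 8)}
NEWLY_SUPPORTED = {(3, 11), (3, 12)}


def python_support_status(major: int, minor: int) -> str:
    """Classify a Python version against this release note (hypothetical helper)."""
    version = (major, minor)
    if version in DEPRECATED:
        return "deprecated"
    if version in NEWLY_SUPPORTED:
        return "newly supported"
    return "not mentioned in this note"


if __name__ == "__main__":
    info = sys.version_info
    print(f"Python {info.major}.{info.minor}: {python_support_status(info.major, info.minor)}")
```

Running the script reports the status of the interpreter currently in use, which can serve as a simple pre-installation check.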
RBLN Compiler:
- Improved operation efficiency:
    - New fusion logic for `masked softmax`
    - Accelerated `take` operation for tensor indexing
- Enhanced communication logic for RSD models
- Refactored `.rbln` file format
Optimum RBLN:
- Added functions:
    - RBLNT5EncoderModel()
- Refactored the `RBLNStableDiffusion` pipelines to enhance functionality and flexibility:
    - LoRA support:
        - The pipelines now include support for Low-Rank Adaptation (LoRA).
    - `rbln_config` input argument:
        - A new `rbln_config` argument has been introduced. This configuration is designed specifically for RBLN compilation and includes:
            - Global settings: Parameters such as `npu`, `device`, and `create_runtimes`
            - Image generation settings: Parameters such as `batch_size`, `img_height`, `img_width`, and `guidance_scale`
    - For more detailed information, please refer to the Model API documentation and the RBLN Model Zoo example
- Added a new `from_model()` method to the `RBLN<ModelName>ForCausalLM` classes. This method enables LoRA support by accepting a HuggingFace `PreTrainedModel` as input, allowing the base model and LoRA adapter to be merged using the `merge_and_unload()` approach. For more details, please refer to the RBLN Model Zoo example
- Enabled support for various RoPE (Rotary Position Embedding) methods:
    - default
    - linear
    - dynamic
    - yarn
    - longrope
    - llama3
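The merge step mentioned for LoRA support can be pictured with a small, framework-free sketch. Conceptually, merging a LoRA adapter folds the low-rank update into the base weight, W' = W + (alpha / r) * B A, so that no separate adapter is needed at inference time. The code below only illustrates that arithmetic on plain Python lists; `matmul`, `merge_lora`, and the toy matrices are hypothetical and are not the Optimum RBLN or PEFT implementation.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w: Matrix, lora_b: Matrix, lora_a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    rank = len(lora_a)  # A has shape (r, in_features)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape (out_features, in_features)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Toy 2x2 base weight with a rank-1 adapter (illustrative values only).
base = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]   # shape (2, 1)
a = [[0.5, 0.5]]     # shape (1, 2)
merged = merge_lora(base, b, a, alpha=2.0)
print(merged)
```

After merging, the result is an ordinary weight matrix of the same shape as the base weight, which is why a merged model can be handled like any other model of that architecture.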
vLLM RBLN:
- Updated to sync with vllm v0.6.4
- The `--compiled_model_dir` configuration has been deprecated and will be removed in a future release. Users are encouraged to use the `--model` argument instead. Please refer to the LLM Serving with Continuous Batching tutorial for an actual use case
RBLN Model Zoo:
- Added new models
    - HuggingFace
        - Qwen2.5-7b/14b
        - Llama3-8b + LoRA
        - StableDiffusion + LoRA
        - StableDiffusionXL + LoRA
- Dependency Updates
    - PyTorch: updated to version 2.5.1
    - TensorFlow: updated to version 2.18.0
- Model deprecation
    - The 3DDFA model has been deprecated due to maintenance discontinuation
RBLNServe:
- Pinned rebel-compiler version to <0.7, >=0.6
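The pin above means RBLNServe expects a `rebel-compiler` release in the 0.6 series. The following minimal sketch checks a version string against that range using plain tuple comparison. `parse_version` and `satisfies_pin` are hypothetical helpers, and the sketch assumes purely numeric dotted versions (no pre-release suffixes).

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.6.1' or 'v0.6.1' into (0, 6, 1). Assumes numeric components only."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def satisfies_pin(version: str, lower: str = "0.6", upper: str = "0.7") -> bool:
    """Return True when lower <= version < upper."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


print(satisfies_pin("v0.6.1"))   # compiler listed for SDK 2024.11.27.0
print(satisfies_pin("v0.5.12"))  # compiler listed for SDK 2024.10.30.0
```

For production use, a dedicated version-parsing library is usually a safer choice than hand-rolled tuple comparison, since it also handles pre-release and post-release tags.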
2024.10.30
SDK Version | Driver | Compiler | Optimum RBLN | vLLM RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|---|
2024.10.30.0 | v1.1.67 | v0.5.12 | v0.1.12 | v0.1.0 | v0.5.3 | v0.3.0 |
RBLN Compiler:
- Updated to support the `cosine_similarity` operation
- Enabled runtime initialization with `RBLNCompiledModel` and deprecated the `path` argument
- Added a double buffering on/off option on `AsyncRuntime` creation with the `parallel` argument
- Added the `example_info` argument to `compile_from_torch()` to support compilation without `InputInfo` creation
- Added the `device` argument to `torch.compile` to specify the NPU device ID for execution
Optimum RBLN:
- Added functions:
    - RBLNQwen2ForCausalLM()
    - RBLNExaoneForCausalLM()
    - RBLNPhiForCausalLM()
    - RBLNViTImageClassification()
- Updated to support latest transformers (v4.45.2)
vLLM RBLN:
- Updated to support Qwen2, EXAONE, and Phi-2 architectures
RBLN Model Zoo:
- Added new models
    - HuggingFace
        - Qwen2-7b
        - EXAONE-3.0-7.8b
        - Salamandra-7b
        - Phi-2
        - ViT-large
        - Whisper-large-v3-turbo
    - PyTorch Dynamo
        - SAM2.1_hiera_large
- Separated the framework-specific requirements.txt files into individual requirements.txt files for each model
2024.09.27
SDK Version | Driver | Compiler | Optimum RBLN | vLLM RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|---|
2024.09.27.0 | v1.1.67 | v0.5.10 | v0.1.11 | v0.0.7 | v0.5.2 | v0.3.0 |
RBLN Driver:
- Added runtime power management with dynamic PCIe link speed change and PCIe ASPM (Active State Power Management)
- Improved P2P throughput
- Enhanced stability for Rebellions Scalable Design (RSD)
RBLN Compiler:
- Improved internal memory management algorithm
- Updated the runtime description to show NPU version
- Refactored .rbln file format
Optimum RBLN:
- Added functions:
    - RBLNBertModel()
    - RBLNBartModel()
    - RBLNLlavaNextForConditionalGeneration()
- Updated `RBLNWhisperForConditionalGeneration` to support generating token timestamps and long-form transcription
vLLM RBLN:
- Updated to support Llava-Next, BART, and T5 models
RBLN Model Zoo:
- Added new models
    - HuggingFace
        - Llava-Next
        - E5-Base-4k
        - KoBART
        - BGE-Reranker-Base/Large
    - PyTorch
        - MotionBERT Action Recognition
- Updated Whisper models to support generating token timestamps and long-form transcription
Others:
- Split the `LLM Serving` tutorial into two pages: with Triton Inference Server and with Continuous Batching
- Added vLLM API standalone example in LLM serving tutorial
2024.08.30
SDK Version | Driver | Compiler | Optimum RBLN | vLLM RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|---|
2024.08.30.0 | v1.0.1 | v0.5.9 | v0.1.9 | - | v0.5.1 | v0.3.0 |
2024.08.30.1 | v1.0.5 | v0.5.9 | v0.1.9 | v0.0.6 | v0.5.1 | v0.3.0 |
RBLN Compiler:
- Added `model_description()` method in `Runtime` class
- Updated to support `where` and `einsum` operations
- Fixed bug for `strided_slice` operation
Optimum RBLN:
- Added functions:
    - RBLNGemmaForCausalLM()
    - RBLNMistralForCausalLM()
    - RBLNDistilBertForQuestionAnswering()
vLLM RBLN:
- Updated to support Gemma and Mistral architectures
RBLN Model Zoo:
- Added new models
    - HuggingFace
        - Gemma-2B
        - Gemma-7B
        - Mistral-7B
        - DistilBERT
    - PyTorch
        - MotionBERT
    - PyTorch Dynamo
        - YOLOv3
        - YOLOv4
        - YOLOv5
        - YOLOv6
        - YOLOvX
2024.08.16
SDK Version | Driver | Compiler | Optimum RBLN | vLLM RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|---|
2024.08.16.0 | v1.0.1 | v0.5.8 | v0.1.8 | - | v0.5.0 | v0.3.0 |
2024.08.16.1 | v1.0.5 | v0.5.8 | v0.1.8 | v0.0.4 | v0.5.0 | v0.3.0 |
RBLN Compiler:
- Improved visualization of the compilation progress bar
- Optimized performance for long sequence LLM models
- Reduced DRAM memory consumption for RSD models
- Fixed bug for PReLU handling logic
- Initial release of C/C++ runtime libraries:
    - Refer to Software > API > Language Binding > C/C++ for installation, API docs, and tutorials.
Optimum RBLN:
- Added functions:
    - RBLNRobertaForMaskedLM()
    - RBLNRobertaForSequenceClassification()
    - RBLNXLMRobertaModel()
    - RBLNXLMRobertaForSequenceClassification()
vLLM RBLN:
- Updated to support GPT2 and Mi:dm architectures
RBLN Model Zoo:
- Initial release to support `torch.compile()` in PyTorch 2.0:
    - Visit Tutorial > Basic > PyTorch (Vision), Tutorial > Basic > PyTorch (NLP), and Software > API > Python API pages for more information
    - Examples can be found in the Model Zoo repository
- Added new models (HuggingFace)
    - Mi:dm-7b
    - BGE-M3
    - BGE-Reranker-v2-M3
    - SecureBERT
    - Roberta
2024.07.25
SDK Version | Driver | Compiler | Optimum RBLN | vLLM RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|---|
2024.07.25.0 | v1.0.1 | v0.5.7 | v0.1.7 | - | v0.4.1 | v0.3.0 |
2024.07.25.1 | v1.0.5 | v0.5.7 | v0.1.7 | v0.0.3 | v0.4.1 | v0.3.0 |
RBLN Compiler:
- Optimized RSD performance for long sequence LLMs
Optimum RBLN:
- Added warning message for dependency version compatibilities
- Added RBLNDPTForDepthEstimation() function
- Fixed bug for memory leak in GPT models
RBLN Model Zoo:
- Added a new model (HuggingFace)
    - DPT-large
Others:
- Updated LLM Serving tutorial page
    - Revised the Serving with Triton Inference Server and Continuous Batching Support with vllm-rbln sections
    - Added OpenAI Compatible API Server section
2024.07.10
SDK Version | Driver | Compiler | Optimum RBLN | vLLM RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|---|
2024.07.10.0 | v1.0.1 | v0.5.2 | v0.1.4 | - | v0.4.0 | v0.3.0 |
2024.07.10.1 | v1.0.5 | v0.5.2 | v0.1.4 | v0.0.3 | v0.4.0 | v0.3.0 |
RBLN Driver:
- Enhanced stability for Rebellions Scalable Design (RSD)
RBLN Compiler:
- Updated to support continuous batching
Optimum RBLN:
- Updated `LlamaForCausalLM()` class to support continuous batching
vLLM RBLN:
- Initial release to support continuous batching
- Updated the LLM Serving page to include information on continuous batching
RBLN Model Zoo:
- Public release of the RBLN Model Zoo
- Added a new model (PyTorch)
    - ConvTasNet
- Miscellaneous:
    - Removed `pipeline()` from bert mlm inference.py
    - Removed `pipeline()` from bert qa inference.py
    - Added `trust_remote_code=True` to `load_dataset()` methods in ast & wav2vec examples.
2024.06.11: Breaking Changes
SDK Version | Driver | Compiler | Optimum RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|
2024.05.23.0 | v0.10.42 | v0.3.11 | v0.1.0 | v0.3.6 | v0.1.5 |
2024.06.11.0 | v1.0.1 | v0.5.0 | v0.1.1 | v0.3.10 | v0.3.0 |
Note
BREAKING CHANGES: Please update the RBLN Compiler to the appropriate version as below for compatibility with the updated RBLN Driver. You can check your RBLN Driver version with the `rbln-stat -j | grep KMD_version` command.
- 0.10.42: `pip install -i https://pypi.rbln.ai/simple rebel-compiler==0.3.11`
- 1.0.1: `pip install -i https://pypi.rbln.ai/simple rebel-compiler==0.5.0`
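The driver-to-compiler mapping in this note can be expressed as a small lookup. The sketch below builds the matching `pip` command for a given driver version. `compiler_install_command` is a hypothetical helper, and only the two driver versions listed in this note are covered; how the driver version string is obtained (for example, from the `rbln-stat -j` output) is left to the caller.

```python
# Driver (KMD) version -> compatible rebel-compiler version, as listed in this note.
COMPILER_FOR_DRIVER = {
    "0.10.42": "0.3.11",
    "1.0.1": "0.5.0",
}

INDEX_URL = "https://pypi.rbln.ai/simple"


def compiler_install_command(driver_version: str) -> str:
    """Return the pip command for the compiler matching a driver version."""
    normalized = driver_version.strip().lstrip("v")
    try:
        compiler = COMPILER_FOR_DRIVER[normalized]
    except KeyError:
        raise ValueError(
            f"driver version {driver_version!r} is not covered by this note"
        ) from None
    return f"pip install -i {INDEX_URL} rebel-compiler=={compiler}"


print(compiler_install_command("1.0.1"))
```

Raising an error for unknown driver versions, rather than falling back to a default, avoids silently installing an incompatible compiler.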
RBLN Driver:
- Stable release for Rebellions Scalable Design (RSD)
RBLN Compiler:
- Updated RBLN Compiler to be compatible with RBLN Driver
- Added utility APIs:
    - `npu_is_available()`
    - `get_npu_name()`
Optimum RBLN:
- Updated model APIs
    - Please refer to RBLN Model Zoo below
RBLN Model Zoo:
- Added new models (HuggingFace)
    - With Rebellions Scalable Design (RSD)
        - Llama3-8b
        - SOLAR-10.7b
        - EEVE-Korean-10.8b
    - SDXL-base-1.0
    - ControlNet
RBLNServe:
- Pinned rebel-compiler version to <0.6, >=0.5
2024.05.23: Breaking Changes
SDK Version | Driver | Compiler | Optimum RBLN | Model Zoo | RBLNServe |
---|---|---|---|---|---|
2024.05.23.0 | v0.10.42 | v0.3.11 | v0.1.0 | v0.3.6 | v0.1.5 |
2024.05.23.1 | v0.12.37 | v0.4.0 | v0.1.0 | v0.3.6 | v0.2.0 |
Note
BREAKING CHANGES: Please update the RBLN Compiler to the appropriate version as below for compatibility with the updated RBLN Driver. You can check your RBLN Driver version with the `rbln-stat -j | grep KMD_version` command.
- 0.10.42: `pip install -i https://pypi.rbln.ai/simple rebel-compiler==0.3.11`
- 0.12.37: `pip install -i https://pypi.rbln.ai/simple rebel-compiler==0.4.0`
RBLN Driver:
- Added support for Rebellions Scalable Design (RSD)
- rbln-stat (CLI tool) update: Added new columns `Name` and `Power` for NPU version and power consumption, respectively
RBLN Compiler:
- Updated RBLN Compiler to be compatible with the RBLN Driver
- Updated input arguments of Python user APIs
- Added new user APIs for concurrent processing
- Enabled LLM compilation & inference for Rebellions Scalable Design (RSD)
- Added a new page - Nvidia Triton Inference Server
Optimum RBLN:
- Initial release
- Added new pages - HuggingFace Model Support
RBLN Model Zoo:
- Added new models (HuggingFace)
    - With Rebellions Scalable Design (RSD)
        - Llama2-7b
        - Llama2-13b
    - GPT2, GPT2-medium/large/xl
    - T5-small/base/large/3B
    - BART-base/large
    - BERT-base/large
    - Stable Diffusion v1.5
    - SDXL-turbo
    - Whisper-tiny/base/small/medium/large
    - Wav2Vec2
    - Audio Spectrogram Transformer
RBLNServe:
- Pinned rebel-compiler version to <0.5, >=0.4
2024.01.31: Breaking Changes
SDK Version | Driver | Compiler | Model Zoo | RBLNServe |
---|---|---|---|---|
2024.01.31.0 | v0.10.42 | v0.3.5 | v0.2.0 | v0.1.5 |
Note
BREAKING CHANGES: Please update the RBLN Compiler to the latest version (v0.3.5 or higher) for compatibility with the updated RBLN Driver.
RBLN Driver:
- Refactored device internal command processing logic for stability & scalability
RBLN Compiler:
- Updated RBLN Compiler to be compatible with the RBLN Driver
- Updated device memory scheduling logic
- Enhanced functionality for operation fusion logic
- Updated supported OP list for both TensorFlow and PyTorch
RBLN Model Zoo:
- Added new models (PyTorch):
    - YOLOv4: v4/v4-csp-s-mish/v4-csp-x-mish
    - Video ResNet: r3d_18/mc3_18/r2plus1D_18
    - Video S3D: s3d
- Changed default input size:
    - YOLOv3/4/5/6/7/8
    - deeplabv3_resnet50/resnet101/mobilenetv3_large, fcn_resnet50/101, unet
- Restructured directories:
    - PyTorch image classification examples are moved from `rbln_model_zoo/pytorch/vision/classification` to `rbln_model_zoo/pytorch/vision/image_classification`
RBLNServe:
- Pinned rebel-compiler version to <0.4, >=0.3
2023.10.06
SDK Version | Driver | Compiler | Model Zoo | RBLNServe |
---|---|---|---|---|
2023.10.06.0 | v0.9.34 | v0.2.13 | v0.1.9 | v0.1.4 |
RBLN Compiler:
- Updated version parsing module of runtime APIs
- Updated runtime input size calculation logic
- Enhanced functionality for tensor slicing operations
RBLNServe:
- Updated configuration for gRPC/REST protocol
2023.09.12
SDK Version | Driver | Compiler | Model Zoo | RBLNServe |
---|---|---|---|---|
2023.09.12.0 | v0.9.34 | v0.2.10 | v0.1.9 | v0.1.1 |
RBLN Compiler:
- Enabled `print()` for the `rebel.Runtime` module - `print(module)` will show basic information of the loaded model
- Refactored compiler internal large op handling passes for scalability
- Updated error message handling logic
- Fixed bug in a type cast pass
RBLN Model Zoo:
- Updated submodule - YOLOv3
RBLNServe:
- Added `--version` command
2023.08.18
SDK Version | Driver | Compiler | Model Zoo | RBLNServe |
---|---|---|---|---|
2023.08.18.0 | v0.9.34 | v0.2.1 | v0.1.8 | v0.1.0 |
RBLN Compiler:
- Fixed bug for the destruction issue in `rebel.Runtime`
RBLNServe:
- Initial release
- Added a new page - RBLNServe (Model Server)
2023.08.12: Breaking Changes
SDK Version | Driver | Compiler | Model Zoo |
---|---|---|---|
2023.08.12.0 | v0.9.34 | v0.2.0 | v0.1.8 |
Note
BREAKING CHANGES: Please update the RBLN Compiler to the latest version (v0.2.0 or higher) for compatibility with the updated RBLN Driver.
RBLN Driver:
- Refactored host-device communication protocol for stability & scalability
RBLN Compiler:
- Updated RBLN Compiler to be compatible with the RBLN Driver
Others:
- Added a new page - Kubernetes Support
2023.07.31
SDK Version | Driver | Compiler | Model Zoo |
---|---|---|---|
2023.07.31.0 | v0.8.44 | v0.1.17 | v0.1.8 |
RBLN Compiler:
- Enhanced functionality for normalization operations
- Updated compiler internal scheduling logic
- Updated error message handling logic
RBLN Model Zoo:
- Updated requirements.txt to use ultralytics 8.0.145
- Applied ultralytics 8.0.145 to YOLOv8
2023.07.10
SDK Version | Driver | Compiler | Model Zoo |
---|---|---|---|
2023.07.10.0 | v0.8.44 | v0.1.14 | v0.1.7 |
RBLN Driver:
- Enhanced stability for device reset and recovery
- rbln-stat (CLI tool) update: status categorization of the process
RBLN Compiler:
- Updated input arguments for `compile_from_torchscript()`
- Enhanced functionality for unary and binary operations
- Optimized build time
RBLN Model Zoo:
- Added new models (PyTorch):
    - YOLOv6: v6s/v6n/v6m/v6l
    - YOLOv7: v7-tiny/v7/v7x
    - YOLOv8: v8s/v8n/v8m/v8l/v8x
- Added new models (TF Keras Applications)
    - MobileNetV3: Small/Large
    - ConvNeXt: Tiny/Small/Base/Large/XLarge
    - RegNetX: 002/004/006/008/016/032/040/064/080/120/160/320
    - RegNetY: 002/004/006/008/016/032/040/064/080/120/160/320
2023.06.20
SDK Version | Driver | Compiler | Model Zoo |
---|---|---|---|
2023.06.20.0 | v0.7.34 | v0.1.8 | v0.1.5 |
RBLN Compiler:
- Added a new compile function - `compile_from_torchscript()`
- Enhanced functionality for matrix multiplication and pooling operations
- Optimized device memory scheduling
RBLN Model Zoo:
- Added new models (PyTorch):
    - YOLOv3: v3-tiny/v3/v3-spp
    - YOLOv5: v5s/v5n/v5m/v5l/v5x
2023.05.26
SDK Version | Driver | Compiler | Model Zoo |
---|---|---|---|
2023.05.26.0 | v0.7.34 | v0.1.5 | v0.1.4 |
- Initial release