Model API

The Model API documentation is written in English for clarity.

optimum API

Generic model classes

Classes

RBLNBaseModel

An abstract base class for compiling, loading, and saving neural network models from the huggingface transformers and diffusers libraries to run on RBLN NPU devices.

This class supports loading and saving models using the from_pretrained and save_pretrained methods, similar to the huggingface libraries.

The from_pretrained method loads a model corresponding to the given model_id from a local repository or the HuggingFace Hub onto the NPU. If the model is a PyTorch model and export=True is passed as a kwarg, it compiles the PyTorch model corresponding to the given model_id before loading. If model_id is an already rbln-compiled model, it can be directly loaded onto the NPU with export=False.

rbln_npu is a kwarg used at compile time that specifies the name of the target NPU. If it is not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

rbln_device specifies the device to be used at runtime. If not specified, device 0 is used.

rbln_create_runtimes indicates whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

rbln_config is a dictionary that allows passing configurations for the model and its submodules. Any parameter prefixed with rbln_ in the from_pretrained method is internally interpreted as a value in rbln_config.

For example, rbln_batch_size=4 is equivalent to passing rbln_config={"batch_size": 4}.

Example usage of rbln_config:

model = RBLNBaseModel.from_pretrained(
    model_id,
    export=True,
    rbln_config={
        "batch_size": 4,
    },
)

This is equivalent to:

model = RBLNBaseModel.from_pretrained(
    model_id,
    export=True,
    rbln_batch_size=4,
)

Models compiled in this way can be saved to a local repository using save_pretrained or uploaded to the huggingface hub.
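The compile, save, and reload workflow described above can be sketched as two small helpers. This is a minimal sketch, not the library's own code: the helper names compile_and_save and load_compiled are hypothetical, and model_cls is assumed to be any class that exposes the from_pretrained and save_pretrained methods documented here (for example an RBLN model class imported from optimum.rbln, which is an assumed import path).

```python
from pathlib import Path


def compile_and_save(model_cls, model_id, save_directory, **rbln_kwargs):
    """Compile `model_id` without creating runtimes, then save the result.

    `model_cls` is assumed to follow the from_pretrained / save_pretrained
    interface documented on this page (hypothetical helper, not library API).
    """
    model = model_cls.from_pretrained(
        model_id,
        export=True,                 # compile the PyTorch model
        rbln_create_runtimes=False,  # do not load the model onto an NPU
        **rbln_kwargs,               # e.g. rbln_npu="npu_name", rbln_batch_size=4
    )
    model.save_pretrained(save_directory)
    return Path(save_directory)


def load_compiled(model_cls, save_directory, rbln_device=0):
    """Load an already compiled model onto the given NPU device."""
    return model_cls.from_pretrained(
        save_directory,
        export=False,  # the directory already holds a compiled model
        rbln_device=rbln_device,
    )
```

On a host without an NPU, rbln_npu would need to be passed through rbln_kwargs, since the documentation states that an error occurs when no NPU is installed and no name is given.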

It also supports generation through generate (for transformers models that support generation).

RBLNBaseModel is a class for models consisting of an arbitrary number of torch.nn.Modules, and therefore is an abstract class without explicit implementations of forward or export functions. To inherit from this class, forward, export, etc. must be implemented.

Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_config=None, **kwargs) classmethod

Load a pretrained model from a given model ID and optimize it for NPU execution.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained or compiled model to be loaded. It can be a HuggingFace Hub model ID or a local path.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_config Optional[Dict[str, Any]]

A dictionary containing configuration for the model and its submodules. This affects the compilation settings for both the main module and its submodules.

None
**kwargs

Additional keyword arguments. Any argument prefixed with rbln_ is treated as part of rbln_config. Arguments without the rbln_ prefix are passed directly to the original HuggingFace from_pretrained method.
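The keyword-argument routing described for from_pretrained can be illustrated with a small standalone function. This is a sketch of the documented behaviour only, not the library's implementation: split_rbln_kwargs is a hypothetical name, and letting a prefixed keyword override the same key in rbln_config is an assumption, since the documentation does not state the precedence.

```python
def split_rbln_kwargs(kwargs, rbln_config=None):
    """Separate rbln_-prefixed kwargs from those forwarded to HuggingFace.

    Returns (merged_rbln_config, passthrough_kwargs).
    Hypothetical helper for illustration only.
    """
    prefix = "rbln_"
    merged = dict(rbln_config or {})  # copy so the caller's dict is untouched
    passthrough = {}
    for key, value in kwargs.items():
        if key.startswith(prefix):
            # rbln_batch_size=4 becomes {"batch_size": 4}
            # (assumption: the prefixed kwarg wins over rbln_config)
            merged[key[len(prefix):]] = value
        else:
            # everything else goes to the original from_pretrained
            passthrough[key] = value
    return merged, passthrough


config, forwarded = split_rbln_kwargs(
    {"rbln_batch_size": 4, "torch_dtype": "float32"}
)
print(config)     # {'batch_size': 4}
print(forwarded)  # {'torch_dtype': 'float32'}
```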
save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
RBLNModel

A class that inherits from RBLNBaseModel for models consisting of a single torch.nn.Module.

This class supports all the functionality of RBLNBaseModel, including loading and saving models with the from_pretrained and save_pretrained methods and compiling PyTorch models for execution on RBLN NPU devices.

model = RBLNModel.from_pretrained("model_id", export=True, rbln_npu="npu_name")
outputs = model(**inputs)

Natural Language Processing

Classes

RBLNLlamaForCausalLM

The Llama model transformer with a language modeling head (linear layer) on top. This model inherits from RBLNModel; check the superclass documentation for the generic methods the library implements for all its models. This class converts and runs a pre-trained HuggingFace LlamaForCausalLM. It implements the methods to convert a pre-trained transformers LlamaForCausalLM into an RBLNLlamaForCausalLM by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod

The from_pretrained() function is used in the same way as in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[Union[int, List[int]]]

The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_tensor_parallel_size Optional[int]

Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.

1
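The relationship between rbln_tensor_parallel_size and a list-valued rbln_device can be sketched with a small helper that builds both keyword arguments together. This is only an illustration: tensor_parallel_kwargs is a hypothetical name, and the sketch assumes one consecutive device index per tensor-parallel shard, which the documentation does not spell out.

```python
def tensor_parallel_kwargs(tensor_parallel_size, first_device=0):
    """Build from_pretrained kwargs for single- or multi-NPU execution.

    Assumption: one consecutive device index is used per tensor-parallel shard.
    Hypothetical helper for illustration only.
    """
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be at least 1")
    if tensor_parallel_size == 1:
        # Single NPU: rbln_device is a plain integer.
        return {"rbln_tensor_parallel_size": 1, "rbln_device": first_device}
    # Multiple NPUs: rbln_device is a list of device indices.
    devices = list(range(first_device, first_device + tensor_parallel_size))
    return {
        "rbln_tensor_parallel_size": tensor_parallel_size,
        "rbln_device": devices,
    }


print(tensor_parallel_kwargs(4))
# {'rbln_tensor_parallel_size': 4, 'rbln_device': [0, 1, 2, 3]}
```

The resulting dictionary could then be unpacked into a from_pretrained call, for example from_pretrained(model_id, export=True, **tensor_parallel_kwargs(4)).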
save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Users can call it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor
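A typical call to generate, using the three parameters listed above, might look like the following sketch. The helper name generate_text is hypothetical; it assumes a HuggingFace-style tokenizer (callable with return_tensors="pt" and providing decode) and a model whose generate method accepts input_ids, attention_mask, and max_length as documented here.

```python
def generate_text(model, tokenizer, prompt, max_length=None):
    """Tokenize a prompt, run generate, and decode the first sequence.

    Hypothetical helper: `tokenizer` is assumed to behave like a HuggingFace
    tokenizer and `model` like the causal LM classes on this page.
    """
    encoded = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=encoded["input_ids"],
        attention_mask=encoded["attention_mask"],
        max_length=max_length,
    )
    # generate returns a batch of token ID sequences; decode the first one.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Keeping the tokenizer and model as arguments means the same helper can be reused with any of the causal language model classes documented in this section.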

RBLNGemmaForCausalLM

The Gemma model transformer with a language modeling head (linear layer) on top. This model inherits from RBLNModel; check the superclass documentation for the generic methods the library implements for all its models. This class converts and runs a pre-trained HuggingFace GemmaForCausalLM. It implements the methods to convert a pre-trained transformers GemmaForCausalLM into an RBLNGemmaForCausalLM by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod

The from_pretrained() function is used in the same way as in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[Union[int, List[int]]]

The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_tensor_parallel_size Optional[int]

Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.

1
save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Users can call it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor

RBLNMistralForCausalLM

The Mistral model transformer with a language modeling head (linear layer) on top. This model inherits from RBLNModel; check the superclass documentation for the generic methods the library implements for all its models. This class converts and runs a pre-trained HuggingFace MistralForCausalLM. It implements the methods to convert a pre-trained transformers MistralForCausalLM into an RBLNMistralForCausalLM by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod

The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[Union[int, List[int]]]

The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_tensor_parallel_size Optional[int]

Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.

1
save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Users can call it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor

RBLNQwen2ForCausalLM

The Qwen2 model transformer with a language modeling head (linear layer) on top. This model inherits from RBLNModel; check the superclass documentation for the generic methods the library implements for all its models. This class converts and runs a pre-trained HuggingFace Qwen2ForCausalLM. It implements the methods to convert a pre-trained transformers Qwen2ForCausalLM into an RBLNQwen2ForCausalLM by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod

The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[Union[int, List[int]]]

The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_tensor_parallel_size Optional[int]

Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.

1
save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Users can call it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor

RBLNExaoneForCausalLM

The EXAONE model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings). This model inherits from RBLNModel; check the superclass documentation for the generic methods the library implements for all its models. This class converts and runs a pre-trained HuggingFace ExaoneForCausalLM. It implements the methods to convert a pre-trained transformers ExaoneForCausalLM into an RBLNExaoneForCausalLM by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod

The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[Union[int, List[int]]]

The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_tensor_parallel_size Optional[int]

Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.

1
save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Users can call it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor

RBLNPhiForCausalLM

The Phi model transformer with a language modeling head (linear layer) on top. This model inherits from RBLNModel; check the superclass documentation for the generic methods the library implements for all its models. This class converts and runs a pre-trained HuggingFace PhiForCausalLM. It implements the methods to convert a pre-trained transformers PhiForCausalLM into an RBLNPhiForCausalLM by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.

Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod

The from_pretrained() function is used in the same way as in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[Union[int, List[int]]]

The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_tensor_parallel_size Optional[int]

Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.

1

save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Users can call it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor

RBLNGPT2LMHeadModel

The GPT2 Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).

This model inherits from RBLNModel. Check the superclass documentation for the generic methods the library implements for all its models.

It implements the methods to convert a pre-trained GPT2LMHeadModel into RBLNGPT2LMHeadModel by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Users can call it to generate text from the model. Generation can be customized through arguments such as input_ids, attention_mask, and max_length.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor

RBLNT5ForConditionalGeneration

RBLN implementation of T5ForConditionalGeneration, optimized for NPU execution.

This class provides an interface compatible with HuggingFace's T5ForConditionalGeneration, but with RBLN-specific optimizations. It implements three key methods:

  • from_pretrained: Loads a pre-trained T5 model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
  • generate: Generates new text sequences based on input prompts, similar to the HuggingFace implementation.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_enc_max_seq_len=None, rbln_dec_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_enc_max_seq_len Optional[int]

The maximum sequence length of the encoder. If not specified, model config's value is used.

None
rbln_dec_max_seq_len Optional[int]

The maximum sequence length of the decoder. If not specified, model config's value is used.

None
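Because the encoder and decoder lengths fall back to the model config when they are not specified, one way to build the compile-time arguments is to leave unset values out entirely. The helper below is a hypothetical sketch (seq2seq_rbln_kwargs is not part of the library) that only assembles a dictionary of the rbln_ keyword arguments listed above.

```python
def seq2seq_rbln_kwargs(batch_size=1, enc_max_seq_len=None, dec_max_seq_len=None):
    """Collect rbln_ kwargs for an encoder-decoder model, omitting unset values.

    Omitted lengths are left for from_pretrained to resolve from the model
    config, as described in the parameter table. Hypothetical helper.
    """
    candidates = {
        "rbln_batch_size": batch_size,
        "rbln_enc_max_seq_len": enc_max_seq_len,
        "rbln_dec_max_seq_len": dec_max_seq_len,
    }
    # Drop entries that were not specified so the defaults apply.
    return {key: value for key, value in candidates.items() if value is not None}


print(seq2seq_rbln_kwargs(batch_size=2, enc_max_seq_len=512))
# {'rbln_batch_size': 2, 'rbln_enc_max_seq_len': 512}
```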
save_pretrained(save_directory)

Saves a model and its configuration file to a directory so that it can be reloaded using the from_pretrained class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Users can call it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor

RBLNBartForConditionalGeneration

RBLN implementation for BART (Bidirectional and Auto-Regressive Transformers), optimized for NPU execution.

This class provides an interface compatible with HuggingFace's BartForConditionalGeneration, but with RBLN-specific optimizations. It implements three key methods:

  • from_pretrained: Loads a pre-trained BartForConditionalGeneration model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
  • generate: Generates new text sequences based on input prompts, similar to the HuggingFace implementation.

Note: As of now, beam search in the generate method is not supported.

Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_enc_max_seq_len=None, rbln_dec_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model ID of the pre-trained model to be loaded. It can be a HuggingFace Hub model ID, a local path, or the ID of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_enc_max_seq_len Optional[int]

The maximum sequence length of the encoder. If not specified, model config's value is used.

None
rbln_dec_max_seq_len Optional[int]

The maximum sequence length of the decoder. If not specified, model config's value is used.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Use it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor
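
The compile, save, and generate workflow described for RBLNBartForConditionalGeneration can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the class is importable from optimum.rbln and that a matching tokenizer is available through transformers' AutoTokenizer. The helper bart_compile_kwargs and the sequence lengths used here are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, List, Optional


def bart_compile_kwargs(
    batch_size: int = 1,
    enc_max_seq_len: Optional[int] = None,
    dec_max_seq_len: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect the documented rbln_* keyword arguments, skipping unset ones.

    Hypothetical helper: unset lengths are left out so that the model
    config's values are used, as the parameter table describes.
    """
    kwargs: Dict[str, Any] = {"export": True, "rbln_batch_size": batch_size}
    if enc_max_seq_len is not None:
        kwargs["rbln_enc_max_seq_len"] = enc_max_seq_len
    if dec_max_seq_len is not None:
        kwargs["rbln_dec_max_seq_len"] = dec_max_seq_len
    return kwargs


def compile_and_generate(model_id: str, text: str, save_dir: str) -> List[str]:
    # Imported lazily so the helper above works without the RBLN SDK installed.
    from optimum.rbln import RBLNBartForConditionalGeneration  # assumed import path
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = RBLNBartForConditionalGeneration.from_pretrained(
        model_id, **bart_compile_kwargs(enc_max_seq_len=512, dec_max_seq_len=512)
    )
    model.save_pretrained(save_dir)  # reuse later with export=False

    inputs = tokenizer(text, return_tensors="pt")
    # Only the documented generate arguments are passed; beam search is not supported.
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=64,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Whether inputs must be padded to the compiled encoder length is not stated in this reference, so the sketch leaves the tokenizer call unpadded.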

RBLNBertForQuestionAnswering

RBLN implementation for BertForQuestionAnswering, optimized for execution on NPU devices.

This class provides an interface compatible with HuggingFace's BertForQuestionAnswering, but with optimizations for RBLN NPUs. It implements two key methods:

  • from_pretrained: Loads a pre-trained BertForQuestionAnswering model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
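
The rbln_create_runtimes and save_pretrained descriptions above suggest a two-stage workflow: compile on a build host without an NPU, then load the saved artifact on an NPU host with export=False. The following is a rough sketch of that idea. The import path optimum.rbln is an assumption, and the function names, the max_seq_len value of 512, and the NPU name passed by the caller are hypothetical.

```python
from typing import Any, Dict


def compile_only_kwargs(npu_name: str, max_seq_len: int) -> Dict[str, Any]:
    """Keyword arguments for compiling on a host that has no NPU."""
    return {
        "export": True,
        "rbln_npu": npu_name,           # named explicitly because no NPU is installed
        "rbln_create_runtimes": False,  # compile only; do not load onto a device
        "rbln_max_seq_len": max_seq_len,
    }


def compile_on_build_host(model_id: str, save_dir: str, npu_name: str) -> None:
    # Lazy import: only needed when compilation actually runs.
    from optimum.rbln import RBLNBertForQuestionAnswering  # assumed import path

    model = RBLNBertForQuestionAnswering.from_pretrained(
        model_id, **compile_only_kwargs(npu_name, max_seq_len=512)
    )
    model.save_pretrained(save_dir)


def load_on_npu_host(save_dir: str, device: int = 0):
    from optimum.rbln import RBLNBertForQuestionAnswering  # assumed import path

    # The directory already holds a compiled model, so export stays False.
    return RBLNBertForQuestionAnswering.from_pretrained(
        save_dir, export=False, rbln_device=device
    )
```

Splitting the two stages keeps the slow compilation step off the serving machine; the same pattern should apply to the other classes on this page that share these parameters.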
RBLNDistilBertForQuestionAnswering

RBLN implementation for DistilBertForQuestionAnswering, optimized for execution on NPU devices.

This class provides an interface compatible with HuggingFace's DistilBertForQuestionAnswering, but with optimizations for RBLN NPUs. It implements two key methods:

  • from_pretrained: Loads a pre-trained DistilBertForQuestionAnswering model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
RBLNMidmLMHeadModel

The Mi:dm model transformer with a language modeling head (linear layer) on top. This model inherits from [RBLNModel]. Check the superclass documentation for the generic methods the library implements for all its models. The class converts and runs a pre-trained HuggingFace transformers MidmLMHeadModel. It converts a pre-trained MidmLMHeadModel into an RBLNMidmLMHeadModel by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, trust_remote_code=True, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod

The from_pretrained() function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
trust_remote_code bool

A boolean flag to allow or disallow the execution of custom code from the model repository. If set to True, it permits the model to execute custom code in the model repository, which may include additional model architectures, tokenizers, or processing scripts. Set this to False to enforce stricter security when loading models from untrusted sources.

True
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[Union[int, List[int]]]

The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_tensor_parallel_size Optional[int]

The number of NPUs across which the model is compiled and executed. This feature is only available on ATOM+ (RBLN-CA12). You can check the type of your current RBLN NPU using the rbln-stat command.

1
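
Because rbln_device accepts either a single integer or a list, and rbln_tensor_parallel_size sets how many NPUs are used, it can help to check the two against each other before starting a long compilation. The helper below is a hypothetical sketch. The rule that an explicit device list should have exactly rbln_tensor_parallel_size entries is an assumption made for illustration, not something this reference states.

```python
from typing import List, Union


def check_tensor_parallel_devices(
    rbln_device: Union[int, List[int]],
    rbln_tensor_parallel_size: int = 1,
) -> List[int]:
    """Normalize rbln_device to a list and sanity-check it (hypothetical helper)."""
    if rbln_tensor_parallel_size < 1:
        raise ValueError("rbln_tensor_parallel_size must be at least 1")

    if isinstance(rbln_device, int):
        # A single integer is passed through unchanged.
        return [rbln_device]

    devices = list(rbln_device)
    if len(set(devices)) != len(devices):
        raise ValueError(f"duplicate device ids in {devices}")
    # Assumption: an explicit list names one device per tensor-parallel rank.
    if len(devices) != rbln_tensor_parallel_size:
        raise ValueError(
            f"expected {rbln_tensor_parallel_size} devices, got {len(devices)}"
        )
    return devices
```

For example, check_tensor_parallel_devices([0, 1, 2, 3], 4) passes the list through unchanged, while a two-element list with a tensor parallel size of 4 raises ValueError.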
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_ids, attention_mask=None, max_length=None)

The generate function is used in the same way as in the HuggingFace transformers library. Use it to generate text from the model.

Parameters:

Name Type Description Default
input_ids LongTensor

The sequence used as a prompt for the generation

required
attention_mask Optional[Tensor]

The attention mask to apply on the sequence

None
max_length Optional[int]

The maximum length of the sequence to be generated

None

Returns:

Type Description

torch.Tensor

RBLNRobertaForMaskedLM

RBLN implementation for RobertaForMaskedLM, optimized for execution on NPU devices.

This class provides an interface compatible with HuggingFace's RobertaForMaskedLM, but with optimizations for RBLN NPUs. It implements two key methods:

  • from_pretrained: Loads a pre-trained RobertaForMaskedLM model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
RBLNRobertaForSequenceClassification

RBLN implementation for RobertaForSequenceClassification, optimized for execution on NPU devices.

This class provides an interface compatible with HuggingFace's RobertaForSequenceClassification, but with optimizations for RBLN NPUs. It implements two key methods:

  • from_pretrained: Loads a pre-trained RobertaForSequenceClassification model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
RBLNXLMRobertaModel

RBLN implementation for XLMRobertaModel, optimized for execution on NPU devices.

This class provides an interface compatible with HuggingFace's XLMRobertaModel, but with optimizations for RBLN NPUs. It implements two key methods:

  • from_pretrained: Loads a pre-trained XLMRobertaModel model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
RBLNXLMRobertaForSequenceClassification

RBLN implementation for XLMRobertaForSequenceClassification, optimized for execution on NPU devices.

This class provides an interface compatible with HuggingFace's XLMRobertaForSequenceClassification, but with optimizations for RBLN NPUs. It implements two key methods:

  • from_pretrained: Loads a pre-trained XLMRobertaForSequenceClassification model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
RBLNBertModel

RBLN implementation for BertModel, optimized for execution on NPU devices.

This class provides an interface compatible with HuggingFace's BertModel, but with optimizations for RBLN NPUs. It implements two key methods:

  • from_pretrained: Loads a pre-trained BertModel model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_model_input_names=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_model_input_names Optional[List[str]]

A list of inputs expected in the forward pass of the model.

None
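
The class overview states that any from_pretrained parameter prefixed with rbln_ is interpreted as a value in rbln_config, so rbln_batch_size=4 is equivalent to rbln_config={"batch_size": 4}. The sketch below illustrates that mapping in plain Python. The function to_rbln_config is hypothetical and is not the library's internal implementation, and the input names shown (input_ids, attention_mask) are assumed example values.

```python
from typing import Any, Dict


def to_rbln_config(**kwargs: Any) -> Dict[str, Any]:
    """Strip the rbln_ prefix from keyword arguments (illustrative only)."""
    prefix = "rbln_"
    return {
        key[len(prefix):]: value
        for key, value in kwargs.items()
        if key.startswith(prefix)  # non-RBLN arguments such as export are ignored
    }


config = to_rbln_config(
    export=True,
    rbln_batch_size=4,
    rbln_max_seq_len=512,
    rbln_model_input_names=["input_ids", "attention_mask"],  # assumed names
)
print(config)
```

Under the equivalence described in the overview, passing the resulting dictionary as rbln_config should have the same effect as passing the prefixed keyword arguments directly.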
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
RBLNBartModel

RBLN implementation for BartModel, optimized for execution on NPU devices.

This class provides an interface compatible with HuggingFace's BartModel, but with optimizations for RBLN NPUs. It implements two key methods:

  • from_pretrained: Loads a pre-trained BartModel model and converts it into an optimized RBLN graph.
  • save_pretrained: Saves the compiled RBLN model for efficient reuse.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_model_input_names=None) classmethod

The from_pretrained function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_max_seq_len Optional[int]

The maximum sequence length of the model.

None
rbln_model_input_names Optional[List[str]]

A list of inputs expected in the forward pass of the model.

None
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required

Multi Modal

Classes

RBLNLlavaNextForConditionalGeneration

RBLNLlavaNextForConditionalGeneration is a multi-modal model that combines vision and language processing capabilities, optimized for RBLN NPUs. It is designed for conditional generation tasks that involve both image and text inputs.

This model inherits from [RBLNModel]. Check the superclass documentation for the generic methods the library implements for all its models.

Important Note

This model includes a Large Language Model (LLM) as a submodule. For optimal performance, it is highly recommended to use tensor parallelism for the language model. This can be achieved by using the rbln_config parameter in the from_pretrained method. Here's an example of how to apply tensor parallelism:

model = RBLNLlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    export=True,
    rbln_config={
        "language_model": {
            "tensor_parallel_size": 4,  # Apply tensor parallelism
            "max_seq_len": 32768,
            "use_inputs_embeds": True,
            "batch_size": 1,
        },
        "vision_feature_select_strategy": "default"
    },
)
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_vision_feature_select_strategy=None, rbln_config=None) classmethod

Load a pretrained RBLNLlavaNextForConditionalGeneration model from a given model ID and optimize it for RBLN NPUs.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_vision_feature_select_strategy Optional[str]

Strategy for selecting vision features. If not specified, the default strategy from the model config is used.

None
rbln_config Optional[Dict[str, Any]]

A dictionary containing configurations for the main module and its submodules in RBLNLlavaNext. This is particularly important for applying tensor parallelism to the language model for optimal performance. Refer to the class docstring for an example of how to use this parameter.

None
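
The nested rbln_config from the class-level example can be built programmatically when the tensor parallel size or sequence length varies between deployments. The sketch below mirrors the keys shown in that example. The builder function and its defaults are hypothetical, and the import path optimum.rbln is an assumption.

```python
from typing import Any, Dict


def llava_next_rbln_config(
    tensor_parallel_size: int = 4,
    max_seq_len: int = 32768,
    batch_size: int = 1,
    vision_feature_select_strategy: str = "default",
) -> Dict[str, Any]:
    """Build the nested config: submodule settings live under 'language_model'."""
    return {
        "language_model": {
            "tensor_parallel_size": tensor_parallel_size,
            "max_seq_len": max_seq_len,
            "use_inputs_embeds": True,
            "batch_size": batch_size,
        },
        "vision_feature_select_strategy": vision_feature_select_strategy,
    }


def compile_llava_next(model_id: str, save_dir: str) -> None:
    # Lazy import so the builder above can be used without the RBLN SDK.
    from optimum.rbln import RBLNLlavaNextForConditionalGeneration  # assumed path

    model = RBLNLlavaNextForConditionalGeneration.from_pretrained(
        model_id,
        export=True,
        rbln_config=llava_next_rbln_config(),
    )
    model.save_pretrained(save_dir)
```

Keeping the language model settings in their own sub-dictionary is what lets tensor parallelism apply to the LLM submodule only.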
save_pretrained(save_directory)

Save a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

Directory to which to save. Will be created if it doesn't exist.

required

Stable Diffusion

Classes

RBLNStableDiffusionXLPipeline

Pipeline for text-to-image generation using Stable Diffusion XL.

This model inherits from [StableDiffusionXLPipeline]. Check the superclass documentation for the generic methods the library implements for all the pipelines (such as downloading or saving, etc.)

It implements the methods to convert a pre-trained StableDiffusionXLPipeline into a RBLNStableDiffusionXLPipeline by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0) classmethod

The from_pretrained() function is used in the same way as in the HuggingFace diffusers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

Can be either:

  • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  • A path to a directory containing a model saved using save_pretrained.
required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_guidance_scale float

The guidance scale value, which should be specified at compile time as it affects the input shape of the unet submodule in Stable Diffusion by determining whether classifier-free guidance is used. A higher value encourages the model to generate images closely linked to the text prompt, potentially at the expense of lower image quality.

5.0
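
The reason rbln_guidance_scale must be fixed at compile time can be illustrated with a small calculation. The sketch assumes the RBLN pipeline follows the usual diffusers convention, in which classifier-free guidance is active when the guidance scale is greater than 1 and the unet then processes the unconditional and text-conditioned inputs together as one doubled batch. Both function names are hypothetical.

```python
def uses_classifier_free_guidance(guidance_scale: float) -> bool:
    """Guidance is active only for scales strictly greater than 1."""
    return guidance_scale > 1.0


def unet_batch_size(batch_size: int, guidance_scale: float) -> int:
    """Batch dimension seen by the unet (assumed diffusers-style behavior)."""
    if uses_classifier_free_guidance(guidance_scale):
        # Unconditional and conditional inputs are stacked along the batch axis.
        return batch_size * 2
    return batch_size


print(unet_batch_size(1, 5.0))  # guidance on: doubled batch
print(unet_batch_size(1, 1.0))  # guidance off: batch unchanged
```

Under this assumption, a pipeline compiled with the default of 5.0 has a unet input shape that differs from one compiled with a scale of 1.0 or lower, which is why the value is part of the compiled artifact.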
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
__call__(prompt=None, num_inference_steps=50, guidance_scale=5.0, generator=None)

Function invoked when calling the pipeline for generation.

Parameters:

Name Type Description Default
prompt Union[str, List[str]]

The prompt or prompts to guide the image generation. If not defined, you need to pass prompt_embeds instead.

None
num_inference_steps int

The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.

50
guidance_scale float

Guidance scale as defined in Classifier-Free Diffusion Guidance. guidance_scale is defined as w of equation 2 of the Imagen paper. Guidance is enabled by setting guidance_scale > 1. A higher guidance scale encourages the model to generate images that are closely linked to the text prompt, usually at the expense of lower image quality.

5.0
generator Optional[Union[Generator, List[Generator]]]

One or a list of torch generator(s) to make generation deterministic.

None

Returns:

Type Description
StableDiffusionPipelineOutput

Generated Images and bools indicating whether the corresponding generated image contains "not-safe-for-work" (nsfw) content.
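
To make the role of guidance_scale concrete, the following sketch applies the classifier-free guidance combination to two noise predictions. It uses the form common in diffusers-style pipelines, uncond + w * (text - uncond), which is algebraically the same as w * text + (1 - w) * uncond. Plain Python lists stand in for tensors, and the function name is hypothetical; the actual RBLN pipeline internals are not shown in this reference.

```python
from typing import List, Sequence


def apply_guidance(
    noise_uncond: Sequence[float],
    noise_text: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Combine unconditional and text-conditioned noise predictions."""
    if len(noise_uncond) != len(noise_text):
        raise ValueError("predictions must have the same length")
    return [
        u + guidance_scale * (t - u)  # move away from uncond, toward text
        for u, t in zip(noise_uncond, noise_text)
    ]


uncond = [0.0, 1.0]
text = [1.0, 3.0]
print(apply_guidance(uncond, text, 1.0))  # scale 1 reproduces the text prediction
print(apply_guidance(uncond, text, 5.0))  # larger scale extrapolates past it
```

With a scale of 1 the result equals the text-conditioned prediction, so guidance has no effect; larger values push the prediction further in the direction of the prompt.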

RBLNStableDiffusionXLImg2ImgPipeline

Pipeline for image-to-image generation using Stable Diffusion XL.

This model inherits from [StableDiffusionXLImg2ImgPipeline]. Check the superclass documentation for the generic methods the library implements for all the pipelines (such as downloading or saving, etc.)

It implements the methods to convert a pre-trained StableDiffusionXLImg2ImgPipeline into a RBLNStableDiffusionXLImg2ImgPipeline by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, rbln_img_width, rbln_img_height, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0) classmethod

The from_pretrained() function is used in the same way as in the HuggingFace diffusers library. Use it to load a pre-trained model from the library.

Parameters:

Name Type Description Default
model_id Union[str, Path]

Can be either:

  • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  • A path to a directory containing a model saved using save_pretrained.
required
rbln_img_width int

The width of the image to be generated.

required
rbln_img_height int

The height of the image to be generated.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_guidance_scale float

The guidance scale value, which should be specified at compile time as it affects the input shape of the unet submodule in Stable Diffusion by determining whether classifier-free guidance is used. A higher value encourages the model to generate images closely linked to the text prompt, potentially at the expense of lower image quality.

5.0
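
The reason the guidance scale matters at compile time can be illustrated with a small sketch. It assumes the usual diffusers convention that classifier-free guidance is active when guidance_scale > 1, and that the unet then processes the unconditional and text-conditioned inputs together, doubling its batch dimension. The helper names below are hypothetical and are not part of optimum-rbln.

```python
def uses_classifier_free_guidance(guidance_scale: float) -> bool:
    """Classifier-free guidance is enabled only when guidance_scale > 1."""
    return guidance_scale > 1.0


def unet_batch_size(batch_size: int, guidance_scale: float) -> int:
    """Hypothetical helper: batch dimension seen by the unet.

    Assumption: with guidance enabled, the unconditional and conditional
    inputs are concatenated, so the unet sees twice the batch size.
    """
    if uses_classifier_free_guidance(guidance_scale):
        return 2 * batch_size
    return batch_size


print(unet_batch_size(1, 5.0))  # guidance enabled  -> 2
print(unet_batch_size(1, 1.0))  # guidance disabled -> 1
```

Under this assumption, a pipeline compiled with a guidance scale above 1 has a different unet input shape from one compiled with a value of 1 or below, which is why the value is fixed when the model is exported.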
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
__call__(prompt=None, image=None, strength=0.8, num_inference_steps=50, guidance_scale=7.5, generator=None)

The call function to the pipeline for generation.

Parameters:

Name Type Description Default
prompt Union[str, List[str]]

The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.

None
image PipelineImageInput

Image, numpy array, or tensor representing an image batch to be used as the starting point. For both numpy arrays and PyTorch tensors, the expected value range is [0, 1]. If it is a tensor or a list of tensors, the expected shape is (B, C, H, W) or (C, H, W). If it is a numpy array or a list of arrays, the expected shape is (B, H, W, C) or (H, W, C). Image latents can also be passed as image; latents passed directly are not encoded again.

None
strength float

Indicates the extent to which the reference image is transformed. Must be between 0 and 1. image is used as a starting point, and the higher the strength, the more noise is added. The number of denoising steps depends on the amount of noise initially added. When strength is 1, the added noise is maximum and the denoising process runs for the full number of iterations specified in num_inference_steps, so a value of 1 essentially ignores image.

0.8
num_inference_steps int

The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.

50
guidance_scale Optional[float]

A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.

7.5
generator Optional[Union[Generator, List[Generator]]]

A torch.Generator to make generation deterministic.

None

Returns:

Type Description
StableDiffusionPipelineOutput

Generated Images and bools indicating whether the corresponding generated image contains "not-safe-for-work" (nsfw) content.
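
How strength interacts with num_inference_steps can be made concrete with a short calculation. The sketch below mirrors the step-count rule used by the diffusers image-to-image pipelines; that the RBLN pipeline follows the same rule is an assumption, and the function name is hypothetical.

```python
def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually run for a given strength.

    Assumption: same rule as diffusers img2img pipelines, where only the
    last int(num_inference_steps * strength) scheduler steps are executed.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be between 0 and 1, got {strength}")
    return min(int(num_inference_steps * strength), num_inference_steps)


print(effective_denoising_steps(50, 0.8))  # 40
print(effective_denoising_steps(50, 1.0))  # 50
```

With the defaults (strength=0.8, num_inference_steps=50), 40 denoising steps would run under this rule; only strength=1 uses all 50.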

RBLNStableDiffusionPipeline

Pipeline for text-to-image generation using Stable Diffusion.

This model inherits from [StableDiffusionPipeline]. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, etc.).

It implements the methods to convert a pre-trained StableDiffusionPipeline into a RBLNStableDiffusionPipeline by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0) classmethod

The from_pretrained() function works as it does in the HuggingFace diffusers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

Can be either:

  • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  • A path to a directory containing a model saved using save_pretrained.
required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_guidance_scale float

The guidance scale value, which should be specified at compile time as it affects the input shape of the unet submodule in Stable Diffusion by determining whether classifier-free guidance is used. A higher value encourages the model to generate images closely linked to the text prompt, potentially at the expense of lower image quality.

5.0
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
__call__(prompt=None, num_inference_steps=50, guidance_scale=7.5, generator=None)

The call function to the pipeline for generation.

Parameters:

Name Type Description Default
prompt Union[str, List[str]]

The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.

None
num_inference_steps int

The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.

50
guidance_scale float

A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.

7.5
generator Optional[Union[Generator, List[Generator]]]

A torch.Generator to make generation deterministic.

None

Returns:

Type Description
StableDiffusionPipelineOutput

Generated Images and bools indicating whether the corresponding generated image contains "not-safe-for-work" (nsfw) content.
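
The effect of guidance_scale can be sketched with the standard classifier-free guidance combination, in which the final noise prediction is the unconditional prediction plus the scaled difference between the text-conditioned and unconditional predictions. The example uses plain Python lists in place of tensors, and the function name is hypothetical.

```python
from typing import List


def apply_guidance(
    noise_uncond: List[float],
    noise_text: List[float],
    guidance_scale: float,
) -> List[float]:
    """Combine unconditional and text-conditioned noise predictions.

    Standard classifier-free guidance: uncond + scale * (text - uncond).
    """
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]


uncond = [0.0, 1.0]
text = [1.0, 3.0]
print(apply_guidance(uncond, text, 1.0))  # [1.0, 3.0], identical to the text prediction
print(apply_guidance(uncond, text, 7.5))  # [7.5, 16.0], pushed further toward the prompt
```

At guidance_scale = 1 the result equals the text-conditioned prediction, so the unconditional branch contributes nothing; this matches the statement that guidance is enabled only when guidance_scale > 1.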

RBLNStableDiffusionImg2ImgPipeline

Pipeline for image-to-image generation using Stable Diffusion.

This model inherits from [StableDiffusionPipeline]. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, etc.).

It implements the methods to convert a pre-trained StableDiffusionImg2ImgPipeline into a RBLNStableDiffusionImg2ImgPipeline by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, rbln_img_width, rbln_img_height, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0) classmethod

The from_pretrained() function works as it does in the HuggingFace diffusers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

Can be either:

  • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  • A path to a directory containing a model saved using save_pretrained.
required
rbln_img_width int

The width of the image to be generated.

required
rbln_img_height int

The height of the image to be generated.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_guidance_scale float

The guidance scale value, which should be specified at compile time as it affects the input shape of the unet submodule in Stable Diffusion by determining whether classifier-free guidance is used. A higher value encourages the model to generate images closely linked to the text prompt, potentially at the expense of lower image quality.

5.0
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
__call__(prompt=None, image=None, strength=0.8, num_inference_steps=50, guidance_scale=7.5, generator=None)

The call function to the pipeline for generation.

Parameters:

Name Type Description Default
prompt Union[str, List[str]]

The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.

None
image PipelineImageInput

Image, numpy array, or tensor representing an image batch to be used as the starting point. For both numpy arrays and PyTorch tensors, the expected value range is [0, 1]. If it is a tensor or a list of tensors, the expected shape is (B, C, H, W) or (C, H, W). If it is a numpy array or a list of arrays, the expected shape is (B, H, W, C) or (H, W, C). Image latents can also be passed as image; latents passed directly are not encoded again.

None
strength float

Indicates the extent to which the reference image is transformed. Must be between 0 and 1. image is used as a starting point, and the higher the strength, the more noise is added. The number of denoising steps depends on the amount of noise initially added. When strength is 1, the added noise is maximum and the denoising process runs for the full number of iterations specified in num_inference_steps, so a value of 1 essentially ignores image.

0.8
num_inference_steps Optional[int]

The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. This parameter is modulated by strength.

50
guidance_scale Optional[float]

A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.

7.5
generator Optional[Union[Generator, List[Generator]]]

A torch.Generator to make generation deterministic.

None

Returns:

Type Description
StableDiffusionPipelineOutput

Generated Images and bools indicating whether the corresponding generated image contains "not-safe-for-work" (nsfw) content.
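
The three methods documented above can be combined into a compile, save, and run workflow. The following is a minimal sketch, assuming the class is importable from the optimum.rbln package; the helper function names are hypothetical, and the import is done lazily inside the function so the sketch can be read without the package installed.

```python
def build_compile_kwargs(img_width: int, img_height: int, guidance_scale: float = 5.0) -> dict:
    """Keyword arguments for compiling the pipeline (export=True)."""
    return {
        "export": True,
        "rbln_img_width": img_width,
        "rbln_img_height": img_height,
        "rbln_guidance_scale": guidance_scale,
    }


def compile_and_save(model_id: str, save_directory: str, **compile_kwargs):
    """Compile a pre-trained pipeline and save the compiled result."""
    # Assumption: the class is exposed at the top level of optimum.rbln.
    from optimum.rbln import RBLNStableDiffusionImg2ImgPipeline

    pipe = RBLNStableDiffusionImg2ImgPipeline.from_pretrained(model_id, **compile_kwargs)
    pipe.save_pretrained(save_directory)
    return pipe


def run(pipe, prompt: str, init_image):
    """Call the pipeline with the documented defaults and return the images."""
    result = pipe(
        prompt=prompt,
        image=init_image,
        strength=0.8,
        num_inference_steps=50,
        guidance_scale=7.5,
    )
    return result.images
```

A saved pipeline can later be reloaded by passing the save directory as model_id with export=False, as described for from_pretrained.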

RBLNStableDiffusionControlNetPipeline

Pipeline for text-to-image generation using Stable Diffusion and ControlNet.

This model inherits from [StableDiffusionControlNetPipeline]. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, etc.).

It implements the methods to convert a pre-trained StableDiffusionControlNetPipeline into a RBLNStableDiffusionControlNetPipeline by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, rbln_img_width, rbln_img_height, controlnet, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0) classmethod

The from_pretrained() function works as it does in the HuggingFace diffusers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

Can be either:

  • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  • A path to a directory containing a model saved using save_pretrained.
required
rbln_img_width int

The width of the image to be generated.

required
rbln_img_height int

The height of the image to be generated.

required
controlnet ControlNetModel

Provides additional conditioning to the unet during the denoising process.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_guidance_scale float

The guidance scale value, which should be specified at compile time as it affects the input shape of the unet submodule in Stable Diffusion by determining whether classifier-free guidance is used. A higher value encourages the model to generate images closely linked to the text prompt, potentially at the expense of lower image quality.

5.0
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
__call__(prompt=None, image=None, num_inference_steps=50, guidance_scale=7.5, negative_prompt=None, controlnet_conditioning_scale=1.0, generator=None)

The call function to the pipeline for generation.

Parameters:

Name Type Description Default
prompt Optional[Union[str, List[str]]]

The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.

None
image Optional[PipelineImageInput]

The ControlNet input condition that provides guidance to the unet for generation. If the type is specified as torch.Tensor, it is passed to ControlNet as is. A PIL.Image.Image is also accepted. The dimensions of the output image default to image's dimensions. If height and/or width are passed, image is resized accordingly. If multiple ControlNets are specified at initialization, images must be passed as a list such that each element of the list can be correctly batched for input to a single ControlNet. When prompt is a list and a list of images is passed for a single ControlNet, each image is paired with the corresponding prompt in the prompt list. This also applies to multiple ControlNets, where a list of image lists can be passed to batch for each prompt and each ControlNet.

None
num_inference_steps int

The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.

50
guidance_scale float

A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.

7.5
negative_prompt Optional[Union[str, List[str]]]

The prompt or prompts describing what not to include in image generation. If not defined, you need to pass negative_prompt_embeds instead. Ignored when not using guidance (guidance_scale <= 1).

None
controlnet_conditioning_scale Optional[Union[float, List[float]]]

The outputs of the ControlNet are multiplied by controlnet_conditioning_scale before they are added to the residual in the original unet. If multiple ControlNets are specified at initialization, you can set the corresponding scales as a list.

1.0
generator Optional[Union[Generator, List[Generator]]]

A torch.Generator to make generation deterministic.

None
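
The role of controlnet_conditioning_scale can be sketched numerically. The example below uses plain Python lists in place of tensors and assumes that, with several ControlNets, each output is scaled by its own factor and the scaled outputs are summed before being added to the unet residual. The function name is hypothetical.

```python
from typing import List, Sequence, Union


def add_controlnet_residuals(
    unet_residual: Sequence[float],
    controlnet_outputs: Sequence[Sequence[float]],
    scales: Union[float, Sequence[float]],
) -> List[float]:
    """Add scaled ControlNet outputs to a unet residual (element-wise)."""
    # A single float applies the same scale to every ControlNet.
    if isinstance(scales, (int, float)):
        scales = [float(scales)] * len(controlnet_outputs)
    if len(scales) != len(controlnet_outputs):
        raise ValueError("one scale is required per ControlNet output")

    combined = list(unet_residual)
    for output, scale in zip(controlnet_outputs, scales):
        for i, value in enumerate(output):
            combined[i] += scale * value
    return combined


# One ControlNet with scale 0.5.
print(add_controlnet_residuals([1.0, 2.0], [[2.0, 4.0]], 0.5))  # [2.0, 4.0]
# Two ControlNets with per-ControlNet scales.
print(add_controlnet_residuals([1.0, 2.0], [[2.0, 4.0], [1.0, 1.0]], [0.5, 2.0]))  # [4.0, 6.0]
```

A scale of 0 removes the ControlNet's influence entirely, while values above 1 strengthen the conditioning relative to the unet's own residual.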
RBLNStableDiffusionControlNetImg2ImgPipeline

Pipeline for image-to-image generation using Stable Diffusion and ControlNet.

This model inherits from [StableDiffusionControlNetImg2ImgPipeline]. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, etc.).

It implements the methods to convert a pre-trained StableDiffusionControlNetImg2ImgPipeline into a RBLNStableDiffusionControlNetImg2ImgPipeline by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, rbln_img_width, rbln_img_height, controlnet, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0) classmethod

The from_pretrained() function works as it does in the HuggingFace diffusers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

Can be either:

  • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  • A path to a directory containing a model saved using save_pretrained.
required
rbln_img_width int

The width of the image to be generated.

required
rbln_img_height int

The height of the image to be generated.

required
controlnet ControlNetModel

Provides additional conditioning to the unet during the denoising process.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_guidance_scale float

The guidance scale value, which should be specified at compile time as it affects the input shape of the unet submodule in Stable Diffusion by determining whether classifier-free guidance is used. A higher value encourages the model to generate images closely linked to the text prompt, potentially at the expense of lower image quality.

5.0
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
__call__(prompt=None, image=None, control_image=None, strength=0.8, num_inference_steps=50, guidance_scale=7.5, negative_prompt=None, controlnet_conditioning_scale=0.8, generator=None)

The call function to the pipeline for generation.

Parameters:

Name Type Description Default
prompt Optional[Union[str, List[str]]]

The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.

None
image Optional[PipelineImageInput]

The initial image to be used as the starting point for the image generation process. Can also accept image latents as image, and if passing latents directly they are not encoded again.

None
control_image Optional[PipelineImageInput]

The ControlNet input condition that provides guidance to the unet for generation. If the type is specified as torch.Tensor, it is passed to ControlNet as is. A PIL.Image.Image is also accepted. The dimensions of the output image default to image's dimensions. If height and/or width are passed, image is resized accordingly. If multiple ControlNets are specified at initialization, images must be passed as a list such that each element of the list can be correctly batched for input to a single ControlNet.

None
strength float

Indicates the extent to which the reference image is transformed. Must be between 0 and 1. image is used as a starting point, and the higher the strength, the more noise is added. The number of denoising steps depends on the amount of noise initially added. When strength is 1, the added noise is maximum and the denoising process runs for the full number of iterations specified in num_inference_steps, so a value of 1 essentially ignores image.

0.8
num_inference_steps int

The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.

50
guidance_scale float

A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.

7.5
negative_prompt Optional[Union[str, List[str]]]

The prompt or prompts describing what not to include in image generation. If not defined, you need to pass negative_prompt_embeds instead. Ignored when not using guidance (guidance_scale <= 1).

None
controlnet_conditioning_scale Optional[Union[float, List[float]]]

The outputs of the ControlNet are multiplied by controlnet_conditioning_scale before they are added to the residual in the original unet. If multiple ControlNets are specified at initialization, you can set the corresponding scales as a list.

0.8
generator Optional[Union[Generator, List[Generator]]]

A torch.Generator to make generation deterministic.

None

Audio

Classes

RBLNASTForAudioClassification

Audio Spectrogram Transformer model with an audio classification head on top (a linear layer on top of the pooled output) e.g. for datasets like AudioSet, Speech Commands v2. This model inherits from [RBLNModel]. Check the superclass documentation for the generic methods the library implements for all its models.

A class to convert and run pre-trained transformer-based ASTForAudioClassification models on RBLN devices. It implements the methods to convert a pre-trained transformers ASTForAudioClassification model into a RBLN transformer model by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.

Currently, this model class only supports the 'AST' model from the transformers library. Future updates may include support for additional model types.

Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1) classmethod

The from_pretrained function works as it does in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
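
As described for RBLNBaseModel, parameters prefixed with rbln_ are interpreted as entries of rbln_config, so rbln_batch_size=4 is equivalent to rbln_config={"batch_size": 4}. The mapping can be sketched as follows; the function is a hypothetical illustration rather than the library's implementation, and letting prefixed parameters override an explicit rbln_config is an assumption.

```python
from typing import Any, Dict, Tuple

PREFIX = "rbln_"


def split_rbln_kwargs(kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate rbln_-prefixed keyword arguments from the remaining ones.

    Returns (rbln_config, other_kwargs).
    """
    # Start from an explicit rbln_config dictionary if one was passed.
    rbln_config = dict(kwargs.get("rbln_config", {}))
    others: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if key == "rbln_config":
            continue
        if key.startswith(PREFIX):
            # Assumption: prefixed parameters take precedence over rbln_config.
            rbln_config[key[len(PREFIX):]] = value
        else:
            others[key] = value
    return rbln_config, others


print(split_rbln_kwargs({"export": True, "rbln_batch_size": 4}))
# ({'batch_size': 4}, {'export': True})
```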
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
RBLNWav2Vec2ForCTC

Wav2Vec2 Model with a language modeling head on top for Connectionist Temporal Classification (CTC).

This model inherits from [RBLNModel]. Check the superclass documentation for the generic methods the library implements for all its models.

It implements the methods to convert a pre-trained Wav2Vec2ForCTC into a RBLNWav2Vec2ForCTC by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1) classmethod

The from_pretrained function works as it does in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
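
The output of a CTC head is a sequence of per-frame token predictions that still has to be collapsed into a transcription. A minimal sketch of greedy CTC decoding is shown below: repeated tokens are merged and blank tokens are dropped. The function name is hypothetical, and using id 0 as the blank token is an assumption that depends on the tokenizer of the checkpoint in use.

```python
from typing import List, Sequence


def ctc_greedy_decode(frame_ids: Sequence[int], blank_id: int = 0) -> List[int]:
    """Collapse per-frame argmax token ids into a CTC label sequence."""
    decoded: List[int] = []
    previous = None
    for token_id in frame_ids:
        # Keep a token only if it differs from the previous frame
        # and is not the blank symbol.
        if token_id != previous and token_id != blank_id:
            decoded.append(token_id)
        previous = token_id
    return decoded


# The blank between the two runs of 3 keeps them as separate labels.
print(ctc_greedy_decode([0, 3, 3, 0, 3, 5, 5, 0]))  # [3, 3, 5]
```

In practice the frame ids would come from an argmax over the logits returned by the model, and the resulting ids would be converted to text by the tokenizer.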
RBLNWhisperForConditionalGeneration

The Whisper Model with a language modeling head. Can be used for automatic speech recognition.

This model inherits from [RBLNModel]. Check the superclass documentation for the generic methods the library implements for all its models.

A class to convert and run pre-trained transformer-based WhisperForConditionalGeneration models on RBLN devices. It implements the methods to convert a pre-trained transformers WhisperForConditionalGeneration into a RBLNWhisperForConditionalGeneration by:

  • transferring the checkpoint weights of the original into an optimized RBLN graph,
  • compiling the resulting graph using the RBLN Compiler.
Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_token_timestamps=False) classmethod

The from_pretrained function works as it does in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

Name Type Description Default
model_id Union[str, Path]

The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler.

required
export bool

A boolean flag to indicate if the model should be exported to a .rbln file.

False
rbln_npu Optional[str]

The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.

None
rbln_device Optional[int]

The device to be used at runtime. If not specified, device 0 is used.

0
rbln_create_runtimes Optional[bool]

A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.

None
rbln_batch_size Optional[int]

The batch size of the model.

1
rbln_token_timestamps bool

A boolean flag to compile the model for generating word-level timestamps during inference.

False
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.

Parameters:

Name Type Description Default
save_directory Union[str, PathLike]

The directory to save the model and its configuration files. Will be created if it doesn't exist.

required
generate(input_features, return_timestamps=None, task=None, language=None, is_multilingual=None, attention_mask=None, return_token_timestamps=None, return_segments=False, return_dict_in_generate=None)

The generate function works as it does in the HuggingFace transformers library. Users can call it to generate text from the model.

Parameters:

Name Type Description Default
input_features Tensor

Float values of log-mel features extracted from the raw speech waveform.

required
return_timestamps Optional[bool]

Whether to return the timestamps with the text.

None
task Optional[str]

Task to use for generation, either "translate" or "transcribe". The model.config.forced_decoder_ids will be updated accordingly.

None
language Optional[Union[str, List[str]]]

Language token to use for generation, can be either in the form of <|en|>, en or english.

None
is_multilingual Optional[bool]

Whether or not the model is multilingual.

None
attention_mask Optional[Tensor]

attention_mask needs to be passed when doing long-form transcription using a batch size > 1.

None
return_token_timestamps Optional[bool]

Whether to return token-level timestamps with the text. This can be used with or without the return_timestamps option.

None
return_segments bool

Whether to additionally return a list of all segments. Note that this option can only be enabled when doing long-form transcription.

False
return_dict_in_generate Optional[bool]

Whether or not to return a ModelOutput instead of just returning the generated tokens. Note that when doing long-form transcription, return_dict_in_generate can only be enabled when return_segments is set to True. In this case the generation output of each segment is added to each segment.

None

Returns: torch.Tensor
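
Several generate parameters constrain each other. The sketch below collects the constraints stated in the table above into a single check; it is a hypothetical helper for illustration, not part of the library, and whether an input counts as long-form is left to the caller as the is_long_form flag.

```python
from typing import List, Optional


def check_generate_args(
    batch_size: int,
    is_long_form: bool,
    attention_mask: Optional[object] = None,
    return_segments: bool = False,
    return_dict_in_generate: Optional[bool] = None,
) -> List[str]:
    """Return a list of problems with a combination of generate arguments."""
    problems: List[str] = []
    if is_long_form and batch_size > 1 and attention_mask is None:
        problems.append(
            "attention_mask is required for long-form transcription with batch size > 1"
        )
    if return_segments and not is_long_form:
        problems.append("return_segments can only be enabled for long-form transcription")
    if is_long_form and return_dict_in_generate and not return_segments:
        problems.append(
            "return_dict_in_generate requires return_segments=True for long-form transcription"
        )
    return problems


print(check_generate_args(batch_size=1, is_long_form=False))  # []
print(check_generate_args(batch_size=2, is_long_form=True))
```

An empty list means the combination is consistent with the documented rules; the second call reports the missing attention_mask.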

Computer Vision

Classes

RBLNDPTForDepthEstimation

DPT Model with a depth estimation head on top (consisting of 3 convolutional layers). This model inherits from [RBLNModel]. Check the superclass documentation for the generic methods the library implements for all its models.

Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_image_size=None) classmethod

The from_pretrained() function works as it does in the HuggingFace transformers library. Users can call it to load a pre-trained model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_id | Union[str, Path] | The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the id of a model already compiled with the RBLN Compiler. | required |
| export | bool | Whether the model should be compiled and exported to a .rbln file. | False |
| rbln_npu | Optional[str] | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | None |
| rbln_device | Optional[int] | The device to be used at runtime. If not specified, device 0 is used. | 0 |
| rbln_create_runtimes | Optional[bool] | Whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This is particularly useful for compiling on a host machine without an NPU. | None |
| rbln_batch_size | Optional[int] | The batch size of the model. | 1 |
| rbln_image_size | Optional[Union[int, List[int]]] | The size of the input image, given as a single integer or a list of integers. | None |

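
As a rough illustration of how these parameters fit together, the sketch below collects them into a keyword-argument dictionary and forwards it to from_pretrained. The helper names `depth_compile_kwargs` and `compile_depth_model` are hypothetical, and the import of optimum.rbln is deferred into the function body so the helper can be inspected on a machine where the package is not installed.

```python
from typing import Any, Dict, List, Optional, Union


def depth_compile_kwargs(
    batch_size: int = 1,
    image_size: Optional[Union[int, List[int]]] = None,
    device: int = 0,
) -> Dict[str, Any]:
    """Collect the documented from_pretrained keyword arguments (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "export": True,                 # compile the PyTorch checkpoint to a .rbln file
        "rbln_batch_size": batch_size,  # documented default is 1
        "rbln_device": device,          # documented default is device 0
    }
    if image_size is not None:
        # Pass rbln_image_size only when the caller chose one; None is the documented default.
        kwargs["rbln_image_size"] = image_size
    return kwargs


def compile_depth_model(model_id: str, **options: Any):
    """Compile a DPT checkpoint; calling this requires optimum-rbln to be installed."""
    from optimum.rbln import RBLNDPTForDepthEstimation  # deferred import

    return RBLNDPTForDepthEstimation.from_pretrained(
        model_id, **depth_compile_kwargs(**options)
    )
```

A call such as `compile_depth_model("Intel/dpt-large", batch_size=2, image_size=[384, 384])` would then compile with those settings; the model id and image size here are example values only.
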
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the from_pretrained class method.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| save_directory | Union[str, PathLike] | The directory in which to save the model and its configuration files. It will be created if it doesn't exist. | required |

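
The relationship between save_pretrained and from_pretrained can be sketched as a compile, save, and reload round trip. The function `compile_save_reload` is a hypothetical helper; it takes the model class as an argument so the same flow applies to any class that exposes these two methods.

```python
from pathlib import Path
from typing import Any, Type, Union


def compile_save_reload(
    model_cls: Type[Any], model_id: str, save_directory: Union[str, Path]
) -> Any:
    """Compile once, save the result, then reload the compiled artifact (hypothetical helper)."""
    # export=True compiles the original checkpoint.
    model = model_cls.from_pretrained(model_id, export=True)
    # Persist the compiled model and its configuration files.
    model.save_pretrained(save_directory)
    # Later loads can skip compilation by pointing at the saved directory.
    return model_cls.from_pretrained(save_directory, export=False)
```

With optimum-rbln installed, this could be called as `compile_save_reload(RBLNDPTForDepthEstimation, "Intel/dpt-large", "dpt-large-rbln")`, where the model id and directory name are example values.
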
RBLNResNetForImageClassification

ResNet Model with an image classification head on top (a linear layer on top of the pooled features), e.g. for ImageNet. This model inherits from RBLNModel. Check the superclass documentation for the generic methods the library implements for all its models.

Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_image_size=None) classmethod

The from_pretrained() function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_id | Union[str, Path] | The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the id of a model already compiled with the RBLN Compiler. | required |
| export | bool | Whether the model should be compiled and exported to a .rbln file. | False |
| rbln_npu | Optional[str] | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | None |
| rbln_device | Optional[int] | The device to be used at runtime. If not specified, device 0 is used. | 0 |
| rbln_create_runtimes | Optional[bool] | Whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This is particularly useful for compiling on a host machine without an NPU. | None |
| rbln_batch_size | Optional[int] | The batch size of the model. | 1 |
| rbln_image_size | Optional[Union[int, List[int]]] | The size of the input image, given as a single integer or a list of integers. | None |

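
The rbln_npu and rbln_create_runtimes descriptions together suggest a compile-only setup for hosts without an NPU. The sketch below picks keyword arguments for that case; `cross_compile_kwargs` and its `host_has_npu` flag are hypothetical, and the NPU name in the usage note is a placeholder.

```python
from typing import Any, Dict, Optional


def cross_compile_kwargs(target_npu: Optional[str], host_has_npu: bool) -> Dict[str, Any]:
    """Choose from_pretrained kwargs for a host with or without an NPU (hypothetical helper)."""
    if not host_has_npu and target_npu is None:
        # Per the rbln_npu description, an error occurs when no NPU is installed
        # and no NPU name is given, so fail early with a clear message.
        raise ValueError("rbln_npu must be named explicitly when the host has no NPU")
    kwargs: Dict[str, Any] = {"export": True}
    if target_npu is not None:
        kwargs["rbln_npu"] = target_npu
    if not host_has_npu:
        # Skip runtime creation so nothing is loaded onto a missing device.
        kwargs["rbln_create_runtimes"] = False
    return kwargs
```

The result can be unpacked into a call such as `RBLNResNetForImageClassification.from_pretrained(model_id, **cross_compile_kwargs("RBLN-CA12", host_has_npu=False))`, followed by save_pretrained to move the compiled model to a machine that has the NPU.
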
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the from_pretrained class method.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| save_directory | Union[str, PathLike] | The directory in which to save the model and its configuration files. It will be created if it doesn't exist. | required |

RBLNViTForImageClassification

ViT Model transformer with an image classification head on top (a linear layer on top of the final hidden state of the [CLS] token), e.g. for ImageNet. This model inherits from RBLNModel. Check the superclass documentation for the generic methods the library implements for all its models.

Functions
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_image_size=None) classmethod

The from_pretrained() function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_id | Union[str, Path] | The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the id of a model already compiled with the RBLN Compiler. | required |
| export | bool | Whether the model should be compiled and exported to a .rbln file. | False |
| rbln_npu | Optional[str] | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | None |
| rbln_device | Optional[int] | The device to be used at runtime. If not specified, device 0 is used. | 0 |
| rbln_create_runtimes | Optional[bool] | Whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This is particularly useful for compiling on a host machine without an NPU. | None |
| rbln_batch_size | Optional[int] | The batch size of the model. | 1 |
| rbln_image_size | Optional[Union[int, List[int]]] | The size of the input image, given as a single integer or a list of integers. | None |

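
The RBLNBaseModel documentation states that any from_pretrained parameter prefixed with rbln_ is interpreted as a value in rbln_config, so rbln_batch_size=4 is equivalent to rbln_config={"batch_size": 4}. The sketch below illustrates that mapping in plain Python. It is not the library's implementation: `split_rbln_kwargs` is a hypothetical helper, and letting prefixed keywords override an explicit rbln_config entry is an arbitrary choice made for this example.

```python
from typing import Any, Dict, Tuple


def split_rbln_kwargs(kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate rbln_-prefixed keywords into an rbln_config-style dict (hypothetical helper)."""
    prefix = "rbln_"
    # Start from an explicit rbln_config dictionary if one was passed.
    rbln_config: Dict[str, Any] = dict(kwargs.get("rbln_config") or {})
    remaining: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if key == "rbln_config":
            continue
        if key.startswith(prefix):
            # rbln_batch_size -> batch_size, rbln_image_size -> image_size, ...
            rbln_config[key[len(prefix):]] = value
        else:
            remaining[key] = value
    return rbln_config, remaining
```

For instance, passing `{"export": True, "rbln_batch_size": 4}` yields the configuration `{"batch_size": 4}` and leaves `{"export": True}` as the remaining keyword arguments.
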
save_pretrained(save_directory)

Saves a model and its configuration file to a directory, so that it can be re-loaded using the from_pretrained class method.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| save_directory | Union[str, PathLike] | The directory in which to save the model and its configuration files. It will be created if it doesn't exist. | required |