Model API¶
The Model API documentation is written in English for clarity.
optimum API¶
Generic model classes¶
Classes¶
RBLNBaseModel¶
An abstract base class for compiling, loading, and saving neural network models from the HuggingFace transformers and diffusers libraries to run on RBLN NPU devices.
This class supports loading and saving models using the `from_pretrained` and `save_pretrained` methods, similar to the HuggingFace libraries.
The `from_pretrained` method loads a model corresponding to the given `model_id` from a local repository or the HuggingFace Hub onto the NPU. If the model is a PyTorch model and `export=True` is passed as a kwarg, it compiles the PyTorch model corresponding to the given `model_id` before loading. If `model_id` is an already RBLN-compiled model, it can be loaded directly onto the NPU with `export=False`.
`rbln_npu` is a kwarg required for compilation, specifying the name of the NPU to be used. If this keyword is not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs.
`rbln_device` specifies the device to be used at runtime. If not specified, device 0 is used.
`rbln_create_runtimes` indicates whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU.
`rbln_config` is a dictionary that allows passing configurations for the model and its submodules. Any parameter prefixed with `rbln_` in the `from_pretrained` method is internally interpreted as a value in `rbln_config`. For example, `rbln_batch_size=4` is equivalent to passing `rbln_config={"batch_size": 4}`.
Example usage of `rbln_config`:
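A minimal sketch, assuming an illustrative Llama checkpoint (any RBLNBaseModel subclass behaves the same way):

```python
from optimum.rbln import RBLNLlamaForCausalLM

# Compilation options are passed as a dictionary through rbln_config.
model = RBLNLlamaForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed model id, for illustration only
    export=True,
    rbln_config={"batch_size": 4, "max_seq_len": 4096},
)
```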
This is equivalent to:
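```python
# The same compilation options expressed as rbln_-prefixed keyword arguments.
model = RBLNLlamaForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed model id, for illustration only
    export=True,
    rbln_batch_size=4,
    rbln_max_seq_len=4096,
)
```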
Models compiled in this way can be saved to a local repository using `save_pretrained` or uploaded to the HuggingFace Hub.
The class also supports generation through `generate` (for transformers models that support generation).
RBLNBaseModel is a class for models consisting of an arbitrary number of `torch.nn.Module`s, and is therefore an abstract class without explicit implementations of the `forward` or `export` functions. To inherit from this class, `forward`, `export`, etc. must be implemented.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_config=None, **kwargs) classmethod¶
Load a pretrained model from a given model ID and optimize it for NPU execution.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model or compiled model to be loaded. It can be downloaded from the HuggingFace model hub or loaded from a local path. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_config` | `Optional[Dict[str, Any]]` | A dictionary containing configuration for the model and its submodules. This affects the compilation settings for both the main module and its submodules. | `None` |
| `**kwargs` | | Additional keyword arguments. Any argument prefixed with `rbln_` will be treated as part of `rbln_config`. Arguments without the `rbln_` prefix will be passed directly to the original HuggingFace `from_pretrained` method. | |
save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
RBLNModel¶
A class that inherits from `RBLNBaseModel` for models consisting of a single `torch.nn.Module`.
This class supports all the functionality of `RBLNBaseModel`, including loading and saving models using the `from_pretrained` and `save_pretrained` methods, and compiling PyTorch models for execution on RBLN NPU devices.
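As a sketch, a typical compile, save, and reload flow with a concrete subclass looks like this (the model id and save path are illustrative):

```python
from optimum.rbln import RBLNBertModel

# Compile a PyTorch checkpoint for the NPU (export=True) ...
model = RBLNBertModel.from_pretrained(
    "bert-base-uncased",  # assumed model id, for illustration only
    export=True,
    rbln_max_seq_len=128,
)
# ... save the compiled artifacts to a local repository ...
model.save_pretrained("bert-base-uncased-rbln")
# ... and later reload the compiled model directly onto the NPU.
model = RBLNBertModel.from_pretrained("bert-base-uncased-rbln", export=False)
```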
Natural Language Processing¶
Classes¶
RBLNLlamaForCausalLM¶
The Llama model transformer with a language modeling head (linear layer) on top.
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained HuggingFace transformer-based `LlamaForCausalLM`.
It implements the methods to convert a pre-trained transformers `LlamaForCausalLM` into a `RBLNLlamaForCausalLM` by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod¶
The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[Union[int, List[int]]]` | The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_tensor_parallel_size` | `Optional[int]` | Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (`RBLN-CA12`). You can check the type of your current RBLN NPU using the `rbln-stat` command. | `1` |
save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
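As a sketch, compiling Llama across multiple NPUs and generating text might look like the following (the model id and sizes are illustrative):

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNLlamaForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"  # assumed model id, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Compile with tensor parallelism across 4 NPUs (requires ATOM+ / RBLN-CA12).
model = RBLNLlamaForCausalLM.from_pretrained(
    model_id,
    export=True,
    rbln_batch_size=1,
    rbln_max_seq_len=4096,
    rbln_tensor_parallel_size=4,
)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```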
RBLNGemmaForCausalLM¶
The Gemma model transformer with a language modeling head (linear layer) on top.
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained HuggingFace transformer-based `GemmaForCausalLM`.
It implements the methods to convert a pre-trained transformers `GemmaForCausalLM` into a `RBLNGemmaForCausalLM` by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod¶
The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[Union[int, List[int]]]` | The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_tensor_parallel_size` | `Optional[int]` | Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (`RBLN-CA12`). You can check the type of your current RBLN NPU using the `rbln-stat` command. | `1` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
RBLNMistralForCausalLM¶
The Mistral model transformer with a language modeling head (linear layer) on top.
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained HuggingFace transformer-based `MistralForCausalLM`.
It implements the methods to convert a pre-trained transformers `MistralForCausalLM` into a `RBLNMistralForCausalLM` by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod¶
The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[Union[int, List[int]]]` | The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_tensor_parallel_size` | `Optional[int]` | Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (`RBLN-CA12`). You can check the type of your current RBLN NPU using the `rbln-stat` command. | `1` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
RBLNQwen2ForCausalLM¶
The Qwen2 Model transformer with a language modeling head (linear layer) on top.
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained HuggingFace transformer-based `Qwen2ForCausalLM`.
It implements the methods to convert a pre-trained transformers `Qwen2ForCausalLM` into a `RBLNQwen2ForCausalLM` by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod¶
The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[Union[int, List[int]]]` | The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_tensor_parallel_size` | `Optional[int]` | Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (`RBLN-CA12`). You can check the type of your current RBLN NPU using the `rbln-stat` command. | `1` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
RBLNExaoneForCausalLM¶
The EXAONE Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained HuggingFace transformer-based `ExaoneForCausalLM`.
It implements the methods to convert a pre-trained transformers `ExaoneForCausalLM` into a `RBLNExaoneForCausalLM` by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod¶
The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[Union[int, List[int]]]` | The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_tensor_parallel_size` | `Optional[int]` | Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (`RBLN-CA12`). You can check the type of your current RBLN NPU using the `rbln-stat` command. | `1` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
RBLNPhiForCausalLM¶
The Phi model transformer with a language modeling head (linear layer) on top.
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained HuggingFace transformer-based `PhiForCausalLM`.
It implements the methods to convert a pre-trained transformers `PhiForCausalLM` into a `RBLNPhiForCausalLM` by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod¶
The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[Union[int, List[int]]]` | The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_tensor_parallel_size` | `Optional[int]` | Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (`RBLN-CA12`). You can check the type of your current RBLN NPU using the `rbln-stat` command. | `1` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
RBLNGPT2LMHeadModel¶
The GPT2 Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
It implements the methods to convert a pre-trained `GPT2LMHeadModel` into `RBLNGPT2LMHeadModel` by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model. Custom configuration is available through arguments such as `input_ids`, `attention_mask`, and `max_length`.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
RBLNT5ForConditionalGeneration¶
RBLN implementation of `T5ForConditionalGeneration`, optimized for NPU execution.
This class provides an interface compatible with HuggingFace's `T5ForConditionalGeneration`, but with RBLN-specific optimizations. It implements three key methods:
- `from_pretrained`: Loads a pre-trained T5 model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
- `generate`: Generates new text sequences based on input prompts, similar to the HuggingFace implementation.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_enc_max_seq_len=None, rbln_dec_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_enc_max_seq_len` | `Optional[int]` | The maximum sequence length of the encoder. If not specified, the model config's value is used. | `None` |
| `rbln_dec_max_seq_len` | `Optional[int]` | The maximum sequence length of the decoder. If not specified, the model config's value is used. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
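A sketch of sequence-to-sequence generation with the compiled model (the model id and sequence lengths are illustrative):

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNT5ForConditionalGeneration

model_id = "t5-small"  # assumed model id, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id)

model = RBLNT5ForConditionalGeneration.from_pretrained(
    model_id,
    export=True,
    rbln_batch_size=1,
    rbln_enc_max_seq_len=512,
    rbln_dec_max_seq_len=512,
)

inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```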
RBLNBartForConditionalGeneration¶
RBLN implementation for BART (Bidirectional and Auto-Regressive Transformers), optimized for NPU execution.
This class provides an interface compatible with HuggingFace's `BartForConditionalGeneration`, but with RBLN-specific optimizations. It implements three key methods:
- `from_pretrained`: Loads a pre-trained `BartForConditionalGeneration` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
- `generate`: Generates new text sequences based on input prompts, similar to the HuggingFace implementation.
Note: As of now, beam search in the `generate` method is not supported.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_enc_max_seq_len=None, rbln_dec_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_enc_max_seq_len` | `Optional[int]` | The maximum sequence length of the encoder. If not specified, the model config's value is used. | `None` |
| `rbln_dec_max_seq_len` | `Optional[int]` | The maximum sequence length of the decoder. If not specified, the model config's value is used. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
RBLNBertForQuestionAnswering¶
RBLN implementation for `BertForQuestionAnswering`, optimized for execution on NPU devices.
This class provides an interface compatible with HuggingFace's `BertForQuestionAnswering`, but with optimizations for RBLN NPUs. It implements two key methods:
- `from_pretrained`: Loads a pre-trained `BertForQuestionAnswering` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
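As a sketch, extractive question answering with the compiled model might look like the following; the model id is illustrative, and the forward call is assumed to mirror HuggingFace's interface (returning start and end logits):

```python
import torch
from transformers import AutoTokenizer
from optimum.rbln import RBLNBertForQuestionAnswering

model_id = "deepset/bert-base-cased-squad2"  # assumed model id, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id)

model = RBLNBertForQuestionAnswering.from_pretrained(
    model_id,
    export=True,
    rbln_max_seq_len=384,
)

question = "Who wrote Hamlet?"
context = "Hamlet is a tragedy written by William Shakespeare."
# Pad to the compiled sequence length, since the NPU graph has a static shape.
inputs = tokenizer(question, context, return_tensors="pt", padding="max_length", max_length=384)
outputs = model(**inputs)

# Decode the highest-scoring answer span from the start/end logits.
start = int(torch.argmax(outputs.start_logits))
end = int(torch.argmax(outputs.end_logits)) + 1
print(tokenizer.decode(inputs["input_ids"][0][start:end]))
```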
RBLNDistilBertForQuestionAnswering¶
RBLN implementation for `DistilBertForQuestionAnswering`, optimized for execution on NPU devices.
This class provides an interface compatible with HuggingFace's `DistilBertForQuestionAnswering`, but with optimizations for RBLN NPUs. It implements two key methods:
- `from_pretrained`: Loads a pre-trained `DistilBertForQuestionAnswering` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
RBLNMidmLMHeadModel¶
The Mi:dm model transformer with a language modeling head (linear layer) on top.
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained HuggingFace transformer-based `MidmLMHeadModel`.
It implements the methods to convert a pre-trained transformers `MidmLMHeadModel` into a `RBLNMidmLMHeadModel` by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, trust_remote_code=True, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_tensor_parallel_size=1) classmethod¶
The from_pretrained() function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `trust_remote_code` | `bool` | A boolean flag to allow or disallow the execution of custom code from the model repository. If set to `True`, the custom modeling code defined in the repository is executed. | `True` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[Union[int, List[int]]]` | The device(s) to be used at runtime. If an integer is provided, it specifies the single device to use. If a list of integers is provided, it specifies the devices to use for tensor parallelism across multiple NPUs. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_tensor_parallel_size` | `Optional[int]` | Compile and execute the model using multiple NPUs. This feature is only available on ATOM+ (`RBLN-CA12`). You can check the type of your current RBLN NPU using the `rbln-stat` command. | `1` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |

generate(input_ids, attention_mask=None, max_length=None)¶
The generate function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to generate text from the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_ids` | `LongTensor` | The sequence used as a prompt for the generation. | required |
| `attention_mask` | `Optional[Tensor]` | The attention mask to apply on the sequence. | `None` |
| `max_length` | `Optional[int]` | The maximum length of the sequence to be generated. | `None` |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The generated token sequences. |
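Because Mi:dm ships custom modeling code, loading it requires `trust_remote_code=True`, as in this sketch (the model id and sizes are illustrative):

```python
from transformers import AutoTokenizer
from optimum.rbln import RBLNMidmLMHeadModel

model_id = "KT-AI/midm-bitext-S-7B-inst-v1"  # assumed model id, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

model = RBLNMidmLMHeadModel.from_pretrained(
    model_id,
    export=True,
    trust_remote_code=True,
    rbln_max_seq_len=4096,
    rbln_tensor_parallel_size=4,  # optional; requires ATOM+ (RBLN-CA12)
)

inputs = tokenizer("안녕하세요", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```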
RBLNRobertaForMaskedLM¶
RBLN implementation for `RobertaForMaskedLM`, optimized for execution on NPU devices.
This class provides an interface compatible with HuggingFace's `RobertaForMaskedLM`, but with optimizations for RBLN NPUs. It implements two key methods:
- `from_pretrained`: Loads a pre-trained `RobertaForMaskedLM` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
RBLNRobertaForSequenceClassification¶
RBLN implementation for `RobertaForSequenceClassification`, optimized for execution on NPU devices.
This class provides an interface compatible with HuggingFace's `RobertaForSequenceClassification`, but with optimizations for RBLN NPUs. It implements two key methods:
- `from_pretrained`: Loads a pre-trained `RobertaForSequenceClassification` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
RBLNXLMRobertaModel¶
RBLN implementation for `XLMRobertaModel`, optimized for execution on NPU devices.
This class provides an interface compatible with HuggingFace's `XLMRobertaModel`, but with optimizations for RBLN NPUs. It implements two key methods:
- `from_pretrained`: Loads a pre-trained `XLMRobertaModel` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
RBLNXLMRobertaForSequenceClassification¶
RBLN implementation for `XLMRobertaForSequenceClassification`, optimized for execution on NPU devices.
This class provides an interface compatible with HuggingFace's `XLMRobertaForSequenceClassification`, but with optimizations for RBLN NPUs. It implements two key methods:
- `from_pretrained`: Loads a pre-trained `XLMRobertaForSequenceClassification` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
RBLNBertModel¶
RBLN implementation for `BertModel`, optimized for execution on NPU devices.
This class provides an interface compatible with HuggingFace's `BertModel`, but with optimizations for RBLN NPUs. It implements two key methods:
- `from_pretrained`: Loads a pre-trained `BertModel` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_model_input_names=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_model_input_names` | `Optional[List[str]]` | A list of inputs expected in the forward pass of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
RBLNBartModel¶
RBLN implementation for `BartModel`, optimized for execution on NPU devices.
This class provides an interface compatible with HuggingFace's `BartModel`, but with optimizations for RBLN NPUs. It implements two key methods:
- `from_pretrained`: Loads a pre-trained `BartModel` model and converts it into an optimized RBLN graph.
- `save_pretrained`: Saves the compiled RBLN model for efficient reuse.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_max_seq_len=None, rbln_model_input_names=None) classmethod¶
The from_pretrained function is utilized in its standard form as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
| `export` | `bool` | A boolean flag to indicate if the model should be exported to a `.rbln` file. | `False` |
| `rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
| `rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
| `rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If `False`, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
| `rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
| `rbln_max_seq_len` | `Optional[int]` | The maximum sequence length of the model. | `None` |
| `rbln_model_input_names` | `Optional[List[str]]` | A list of inputs expected in the forward pass of the model. | `None` |

save_pretrained(save_directory)¶
Saves a model and its configuration file to a directory, so that it can be re-loaded using the [`from_pretrained`] class method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
Multi Modal¶
Classes¶
RBLNLlavaNextForConditionalGeneration¶
RBLNLlavaNextForConditionalGeneration is a multi-modal model that combines vision and language processing capabilities, optimized for RBLN NPUs. It is designed for conditional generation tasks that involve both image and text inputs.
This model inherits from [`RBLNModel`]. Check the superclass documentation for the generic methods the library implements for all its models.
Important Note
This model includes a Large Language Model (LLM) as a submodule. For optimal performance, it is highly recommended to use tensor parallelism for the language model. This can be achieved by using the `rbln_config` parameter in the `from_pretrained` method. Here's an example of how to apply tensor parallelism:
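A sketch, assuming the language model submodule is configured under the "language_model" key of `rbln_config` (the model id and parallel size are illustrative):

```python
from optimum.rbln import RBLNLlavaNextForConditionalGeneration

model = RBLNLlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-mistral-7b-hf",  # assumed model id, for illustration only
    export=True,
    rbln_config={
        "language_model": {
            "tensor_parallel_size": 4,  # compile the LLM submodule across 4 NPUs
        },
    },
)
```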
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_vision_feature_select_strategy=None, rbln_config=None)
classmethod
¶Load a pretrained RBLNLlavaNextForConditionalGeneration model from a given model ID and optimize it for RBLN NPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub, a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
`rbln_vision_feature_select_strategy` | `Optional[str]` | Strategy for selecting vision features. If not specified, the default strategy from the model config is used. | `None` |
`rbln_config` | `Optional[Dict[str, Any]]` | A dictionary containing configurations for the main module and its submodules in RBLNLlavaNext. This is particularly important for applying tensor parallelism to the language model for optimal performance. Refer to the class docstring for an example of how to use this parameter. | `None` |
save_pretrained(save_directory)
¶Save a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | Directory to which to save. Will be created if it doesn't exist. | required |
Stable Diffusion¶
Classes¶
RBLNStableDiffusionXLPipeline
¶
Pipeline for text-to-image generation using Stable Diffusion XL.
This model inherits from [StableDiffusionXLPipeline
]. Check the superclass documentation for the generic methods the
library implements for all the pipelines (such as downloading or saving, etc.)
It implements the methods to convert a pre-trained StableDiffusionXLPipeline
into a RBLNStableDiffusionXLPipeline
by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace diffusers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained pipeline to be loaded. It can be downloaded from the HuggingFace model hub, a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_guidance_scale` | `float` | The guidance scale value, which should be specified at compile time as it affects the input shape of the `unet`. | `5.0` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
__call__(prompt=None, num_inference_steps=50, guidance_scale=5.0, generator=None)
¶Function invoked when calling the pipeline for generation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`prompt` | `Union[str, List[str]]` | The prompt or prompts to guide the image generation. If not defined, one has to pass `prompt_embeds` instead. | `None` |
`num_inference_steps` | `int` | The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. | `50` |
`guidance_scale` | `float` | Guidance scale as defined in Classifier-Free Diffusion Guidance. A higher guidance scale encourages the model to generate images closely linked to the text `prompt`, usually at the expense of lower image quality. | `5.0` |
`generator` | `Optional[Union[Generator, List[Generator]]]` | One or a list of torch generator(s) to make generation deterministic. | `None` |
Returns:
Type | Description |
---|---|
`StableDiffusionPipelineOutput` | Generated images. |
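A minimal text-to-image sketch under the signatures documented above; the model id and prompt are illustrative:
```python
from optimum.rbln import RBLNStableDiffusionXLPipeline

pipe = RBLNStableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example model id
    export=True,
    rbln_guidance_scale=5.0,  # fixed at compile time
)
image = pipe(
    prompt="A watercolor painting of a lighthouse at dawn",
    num_inference_steps=50,
    guidance_scale=5.0,  # should match the compile-time value
).images[0]
image.save("lighthouse.png")
```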
RBLNStableDiffusionXLImg2ImgPipeline
¶
Pipeline for image-to-image generation using Stable Diffusion XL.
This model inherits from [StableDiffusionXLImg2ImgPipeline]. Check the superclass documentation for the generic methods the
library implements for all the pipelines (such as downloading or saving, etc.)
It implements the methods to convert a pre-trained StableDiffusionXLImg2ImgPipeline
into a RBLNStableDiffusionXLImg2ImgPipeline
by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, rbln_img_width, rbln_img_height, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace diffusers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained pipeline to be loaded. It can be downloaded from the HuggingFace model hub, a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`rbln_img_width` | `int` | The width of the image to be generated. | required |
`rbln_img_height` | `int` | The height of the image to be generated. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_guidance_scale` | `float` | The guidance scale value, which should be specified at compile time as it affects the input shape of the `unet`. | `5.0` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
__call__(prompt=None, image=None, strength=0.8, num_inference_steps=50, guidance_scale=7.5, generator=None)
¶The call function to the pipeline for generation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`prompt` | `Union[str, List[str]]` | The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds` instead. | `None` |
`image` | `PipelineImageInput` | An image, numpy array, or tensor representing an image batch to be used as the starting point. | `None` |
`strength` | `float` | Indicates extent to transform the reference `image`. Must be between 0 and 1. | `0.8` |
`num_inference_steps` | `int` | The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. | `50` |
`guidance_scale` | `Optional[float]` | A higher guidance scale value encourages the model to generate images closely linked to the text `prompt`, usually at the expense of lower image quality. | `7.5` |
`generator` | `Optional[Union[Generator, List[Generator]]]` | A `torch.Generator` to make generation deterministic. | `None` |
Returns:
Type | Description |
---|---|
`StableDiffusionPipelineOutput` | Generated images. |
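A minimal image-to-image sketch; the model id and image URL are placeholders, and note that image dimensions must be fixed at compile time:
```python
from diffusers.utils import load_image
from optimum.rbln import RBLNStableDiffusionXLImg2ImgPipeline

# Image dimensions are fixed at compile time for the NPU graph.
pipe = RBLNStableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",  # example model id
    export=True,
    rbln_img_width=1024,
    rbln_img_height=1024,
)

init_image = load_image("https://example.com/sketch.png")  # placeholder URL
image = pipe(
    prompt="Refine this sketch into a photorealistic render",
    image=init_image,
    strength=0.8,
).images[0]
```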
RBLNStableDiffusionPipeline
¶
Pipeline for text-to-image generation using Stable Diffusion.
This model inherits from [StableDiffusionPipeline
]. Check the superclass documentation for the generic methods
implemented for all pipelines (downloading, saving, etc.).
It implements the methods to convert a pre-trained StableDiffusionPipeline
into a RBLNStableDiffusionPipeline
by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace diffusers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained pipeline to be loaded. It can be downloaded from the HuggingFace model hub, a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_guidance_scale` | `float` | The guidance scale value, which should be specified at compile time as it affects the input shape of the `unet`. | `5.0` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
__call__(prompt=None, num_inference_steps=50, guidance_scale=7.5, generator=None)
¶The call function to the pipeline for generation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`prompt` | `Union[str, List[str]]` | The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds` instead. | `None` |
`num_inference_steps` | `int` | The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. | `50` |
`guidance_scale` | `float` | A higher guidance scale value encourages the model to generate images closely linked to the text `prompt`, usually at the expense of lower image quality. | `7.5` |
`generator` | `Optional[Union[Generator, List[Generator]]]` | A `torch.Generator` to make generation deterministic. | `None` |
Returns:
Type | Description |
---|---|
`StableDiffusionPipelineOutput` | Generated images. |
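A sketch of the compile-once, reload-later flow using the documented from_pretrained and save_pretrained methods; the model id and save path are examples:
```python
from optimum.rbln import RBLNStableDiffusionPipeline

# Compile once and save the result; ids and paths are examples.
pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", export=True
)
pipe.save_pretrained("sd15-rbln")

# Later, reload the already-compiled model directly onto the NPU.
pipe = RBLNStableDiffusionPipeline.from_pretrained("sd15-rbln", export=False)
image = pipe("A bowl of ramen, studio lighting").images[0]
```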
RBLNStableDiffusionImg2ImgPipeline
¶
Pipeline for image-to-image generation using Stable Diffusion.
This model inherits from [StableDiffusionPipeline
]. Check the superclass documentation for the generic methods
implemented for all pipelines (downloading, saving, etc.).
It implements the methods to convert a pre-trained StableDiffusionImg2ImgPipeline
into a RBLNStableDiffusionImg2ImgPipeline
by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, rbln_img_width, rbln_img_height, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace diffusers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained pipeline to be loaded. It can be downloaded from the HuggingFace model hub, a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`rbln_img_width` | `int` | The width of the image to be generated. | required |
`rbln_img_height` | `int` | The height of the image to be generated. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_guidance_scale` | `float` | The guidance scale value, which should be specified at compile time as it affects the input shape of the `unet`. | `5.0` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
__call__(prompt=None, image=None, strength=0.8, num_inference_steps=50, guidance_scale=7.5, generator=None)
¶The call function to the pipeline for generation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`prompt` | `Union[str, List[str]]` | The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds` instead. | `None` |
`image` | `PipelineImageInput` | An image, numpy array, or tensor representing an image batch to be used as the starting point. | `None` |
`strength` | `float` | Indicates extent to transform the reference `image`. Must be between 0 and 1. | `0.8` |
`num_inference_steps` | `Optional[int]` | The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. This parameter is modulated by `strength`. | `50` |
`guidance_scale` | `Optional[float]` | A higher guidance scale value encourages the model to generate images closely linked to the text `prompt`, usually at the expense of lower image quality. | `7.5` |
`generator` | `Optional[Union[Generator, List[Generator]]]` | A `torch.Generator` to make generation deterministic. | `None` |
Returns:
Type | Description |
---|---|
`StableDiffusionPipelineOutput` | Generated images. |
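Since rbln_create_runtimes=False skips loading the model onto an NPU, compilation can be done on a host without one. A sketch of that flow; the model id and save path are examples:
```python
from optimum.rbln import RBLNStableDiffusionImg2ImgPipeline

# On a build machine without an NPU: compile and save only.
pipe = RBLNStableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example model id
    export=True,
    rbln_img_width=512,
    rbln_img_height=512,
    rbln_create_runtimes=False,  # skip creating NPU runtimes
)
pipe.save_pretrained("sd15-img2img-rbln")

# On a machine with an NPU: load the compiled artifacts and run.
pipe = RBLNStableDiffusionImg2ImgPipeline.from_pretrained(
    "sd15-img2img-rbln", export=False
)
```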
RBLNStableDiffusionControlNetPipeline
¶
Pipeline for text-to-image generation using Stable Diffusion and ControlNet.
This model inherits from [StableDiffusionControlNetPipeline
]. Check the superclass documentation for the generic methods
implemented for all pipelines (downloading, saving, etc.).
It implements the methods to convert a pre-trained StableDiffusionControlNetPipeline
into a RBLNStableDiffusionControlNetPipeline
by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, rbln_img_width, rbln_img_height, controlnet, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace diffusers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained pipeline to be loaded. It can be downloaded from the HuggingFace model hub, a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`rbln_img_width` | `int` | The width of the image to be generated. | required |
`rbln_img_height` | `int` | The height of the image to be generated. | required |
`controlnet` | `ControlNetModel` | Provides additional conditioning to the `unet` during the denoising process. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_guidance_scale` | `float` | The guidance scale value, which should be specified at compile time as it affects the input shape of the `unet`. | `5.0` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
__call__(prompt=None, image=None, num_inference_steps=50, guidance_scale=7.5, negative_prompt=None, controlnet_conditioning_scale=1.0, generator=None)
¶The call function to the pipeline for generation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`prompt` | `Optional[Union[str, List[str]]]` | The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds` instead. | `None` |
`image` | `Optional[PipelineImageInput]` | The ControlNet input condition to provide guidance to the `unet` for generation. | `None` |
`num_inference_steps` | `int` | The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. | `50` |
`guidance_scale` | `float` | A higher guidance scale value encourages the model to generate images closely linked to the text `prompt`, usually at the expense of lower image quality. | `7.5` |
`negative_prompt` | `Optional[Union[str, List[str]]]` | The prompt or prompts to guide what to not include in image generation. If not defined, you need to pass `negative_prompt_embeds` instead. | `None` |
`controlnet_conditioning_scale` | `Optional[Union[float, List[float]]]` | The outputs of the ControlNet are multiplied by `controlnet_conditioning_scale` before they are added to the residual in the original `unet`. | `1.0` |
`generator` | `Optional[Union[Generator, List[Generator]]]` | A `torch.Generator` to make generation deterministic. | `None` |
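A minimal ControlNet sketch under the signatures above; the model ids and image URL are illustrative placeholders:
```python
from diffusers import ControlNetModel
from diffusers.utils import load_image
from optimum.rbln import RBLNStableDiffusionControlNetPipeline

# The ControlNet is supplied at compile time; ids and the URL are examples.
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")
pipe = RBLNStableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    export=True,
    rbln_img_width=512,
    rbln_img_height=512,
)

canny_image = load_image("https://example.com/canny_edges.png")  # placeholder
image = pipe(
    prompt="A futuristic city at night",
    image=canny_image,
    controlnet_conditioning_scale=1.0,
).images[0]
```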
RBLNStableDiffusionControlNetImg2ImgPipeline
¶
Pipeline for image-to-image generation using Stable Diffusion and ControlNet.
This model inherits from [StableDiffusionControlNetImg2ImgPipeline
]. Check the superclass documentation for the generic methods
implemented for all pipelines (downloading, saving, etc.).
It implements the methods to convert a pre-trained StableDiffusionControlNetImg2ImgPipeline
into a RBLNStableDiffusionControlNetImg2ImgPipeline
by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, rbln_img_width, rbln_img_height, controlnet, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_guidance_scale=5.0)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace diffusers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained pipeline to be loaded. It can be downloaded from the HuggingFace model hub, a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`rbln_img_width` | `int` | The width of the image to be generated. | required |
`rbln_img_height` | `int` | The height of the image to be generated. | required |
`controlnet` | `ControlNetModel` | Provides additional conditioning to the `unet` during the denoising process. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_guidance_scale` | `float` | The guidance scale value, which should be specified at compile time as it affects the input shape of the `unet`. | `5.0` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
__call__(prompt=None, image=None, control_image=None, strength=0.8, num_inference_steps=50, guidance_scale=7.5, negative_prompt=None, controlnet_conditioning_scale=0.8, generator=None)
¶The call function to the pipeline for generation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`prompt` | `Optional[Union[str, List[str]]]` | The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds` instead. | `None` |
`image` | `Optional[PipelineImageInput]` | The initial image to be used as the starting point for the image generation process. Can also accept image latents as `image`; if passing latents directly, they are not encoded again. | `None` |
`control_image` | `Optional[PipelineImageInput]` | The ControlNet input condition to provide guidance to the `unet` for generation. | `None` |
`strength` | `float` | Indicates extent to transform the reference `image`. Must be between 0 and 1. | `0.8` |
`num_inference_steps` | `int` | The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. | `50` |
`guidance_scale` | `float` | A higher guidance scale value encourages the model to generate images closely linked to the text `prompt`, usually at the expense of lower image quality. | `7.5` |
`negative_prompt` | `Optional[Union[str, List[str]]]` | The prompt or prompts to guide what to not include in image generation. If not defined, you need to pass `negative_prompt_embeds` instead. | `None` |
`controlnet_conditioning_scale` | `Optional[Union[float, List[float]]]` | The outputs of the ControlNet are multiplied by `controlnet_conditioning_scale` before they are added to the residual in the original `unet`. | `0.8` |
`generator` | `Optional[Union[Generator, List[Generator]]]` | A `torch.Generator` to make generation deterministic. | `None` |
Audio¶
Classes¶
RBLNASTForAudioClassification
¶
Audio Spectrogram Transformer model with an audio classification head on top (a linear layer on top of the pooled output) e.g. for datasets like AudioSet, Speech Commands v2.
This model inherits from [RBLNModel
]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained transformer-based ASTForAudioClassification
models on RBLN devices.
It implements the methods to convert a pre-trained transformers ASTForAudioClassification
model into a RBLN transformer model by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Currently, this model class only supports the 'AST' model from the transformers library. Future updates may include support for additional model types.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1)
classmethod
¶The from_pretrained function is utilized in its standard form, as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
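A minimal audio-classification sketch; the checkpoint id and dummy waveform are illustrative, and the compiled model is assumed to mirror the HuggingFace forward output:
```python
import torch
from transformers import AutoFeatureExtractor
from optimum.rbln import RBLNASTForAudioClassification

model_id = "MIT/ast-finetuned-audioset-10-10-0.4593"  # example checkpoint
model = RBLNASTForAudioClassification.from_pretrained(model_id, export=True)
extractor = AutoFeatureExtractor.from_pretrained(model_id)

waveform = torch.randn(16000).numpy()  # 1 second of dummy 16 kHz audio
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
logits = model(**inputs).logits  # output assumed to mirror the HF model
print(model.config.id2label[int(logits.argmax(-1))])
```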
RBLNWav2Vec2ForCTC
¶
Wav2Vec2 Model with a language modeling
head on top for Connectionist Temporal Classification (CTC).
This model inherits from [RBLNModel
]. Check the superclass documentation for the generic methods the
library implements for all its models.
It implements the methods to convert a pre-trained Wav2Vec2ForCTC
into a RBLNWav2Vec2ForCTC
by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1)
classmethod
¶The from_pretrained function is utilized in its standard form, as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
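A minimal CTC transcription sketch; the checkpoint id and placeholder waveform are illustrative, and the compiled model is assumed to mirror the HuggingFace forward output:
```python
import torch
from transformers import AutoProcessor
from optimum.rbln import RBLNWav2Vec2ForCTC

model_id = "facebook/wav2vec2-base-960h"  # example checkpoint
model = RBLNWav2Vec2ForCTC.from_pretrained(model_id, export=True)
processor = AutoProcessor.from_pretrained(model_id)

speech = torch.zeros(16000).numpy()  # placeholder 16 kHz waveform
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
logits = model(inputs.input_values).logits  # assumed to mirror the HF model
print(processor.batch_decode(logits.argmax(dim=-1)))
```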
RBLNWhisperForConditionalGeneration
¶
The Whisper Model with a language modeling head. Can be used for automatic speech recognition.
This model inherits from [RBLNModel
]. Check the superclass documentation for the generic methods the library implements for all its models.
A class to convert and run pre-trained transformer-based WhisperForConditionalGeneration
model on RBLN devices.
It implements the methods to convert a pre-trained transformers WhisperForConditionalGeneration
into a RBLNWhisperForConditionalGeneration
by:
- transferring the checkpoint weights of the original into an optimized RBLN graph,
- compiling the resulting graph using the RBLN Compiler.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_token_timestamps=False)
classmethod
¶The from_pretrained function is utilized in its standard form, as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
`rbln_token_timestamps` | `bool` | A boolean flag to compile the model for generating word-level timestamps during inference. | `False` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
generate(input_features, return_timestamps=None, task=None, language=None, is_multilingual=None, attention_mask=None, return_token_timestamps=None, return_segments=False, return_dict_in_generate=None)
¶The generate function is utilized in its standard form, as in the HuggingFace transformers library. Users can use this function to generate text from the model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`input_features` | `Tensor` | Float values of log-mel features extracted from the raw speech waveform. | required |
`return_timestamps` | `Optional[bool]` | Whether to return the timestamps with the text. | `None` |
`task` | `Optional[str]` | Task to use for generation, either `"translate"` or `"transcribe"`. | `None` |
`language` | `Optional[Union[str, List[str]]]` | Language token to use for generation, in the form of `<|en|>`, `en`, or `english`. | `None` |
`is_multilingual` | `Optional[bool]` | Whether or not the model is multilingual. | `None` |
`attention_mask` | `Optional[Tensor]` | Attention mask for the input features; needs to be passed when doing long-form transcription with a batch size greater than 1. | `None` |
`return_token_timestamps` | `Optional[bool]` | Whether to return token-level timestamps with the text. This can be used with or without the `return_timestamps` option. | `None` |
`return_segments` | `bool` | Whether to additionally return a list of all segments. Note that this option can only be enabled when doing long-form transcription. | `False` |
`return_dict_in_generate` | `Optional[bool]` | Whether or not to return a [`ModelOutput`] instead of a plain tuple. | `None` |
Returns: torch.Tensor
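A minimal speech-recognition sketch using the documented generate signature; the checkpoint and dataset ids are illustrative examples:
```python
from datasets import load_dataset
from transformers import AutoProcessor
from optimum.rbln import RBLNWhisperForConditionalGeneration

model_id = "openai/whisper-small"  # example checkpoint
model = RBLNWhisperForConditionalGeneration.from_pretrained(model_id, export=True)
processor = AutoProcessor.from_pretrained(model_id)

# A tiny public speech sample, used here purely for illustration.
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = ds[0]["audio"]
inputs = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt")

generated_ids = model.generate(inputs.input_features, task="transcribe")
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```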
Computer Vision¶
Classes¶
RBLNDPTForDepthEstimation
¶
DPT Model with a depth estimation head on top (consisting of 3 convolutional layers).
This model inherits from [RBLNModel
]. Check the superclass documentation for the generic methods the library implements for all its models.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_image_size=None)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
`rbln_image_size` | `Optional[Union[int, List[int]]]` | The size of the image. | `None` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
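A minimal depth-estimation sketch; the checkpoint id and local image path are illustrative, and the compiled model is assumed to mirror the HuggingFace output structure:
```python
from PIL import Image
from transformers import AutoImageProcessor
from optimum.rbln import RBLNDPTForDepthEstimation

model_id = "Intel/dpt-large"  # example checkpoint
model = RBLNDPTForDepthEstimation.from_pretrained(model_id, export=True)
processor = AutoImageProcessor.from_pretrained(model_id)

image = Image.open("room.jpg")  # placeholder local image
inputs = processor(images=image, return_tensors="pt")
depth = model(**inputs).predicted_depth  # assumed to mirror the HF output
print(depth.shape)
```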
RBLNResNetForImageClassification
¶
ResNet Model with an image classification head on top (a linear layer on top of the pooled features), e.g. for ImageNet.
This model inherits from [RBLNModel
]. Check the superclass documentation for the generic methods the library implements for all its models.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_image_size=None)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
`rbln_image_size` | `Optional[Union[int, List[int]]]` | The size of the image. | `None` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |
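A minimal image-classification sketch; the checkpoint id and local image path are illustrative, and the compiled model is assumed to mirror the HuggingFace output structure:
```python
from PIL import Image
from transformers import AutoImageProcessor
from optimum.rbln import RBLNResNetForImageClassification

model_id = "microsoft/resnet-50"  # example checkpoint
model = RBLNResNetForImageClassification.from_pretrained(model_id, export=True)
processor = AutoImageProcessor.from_pretrained(model_id)

image = Image.open("cat.jpg")  # placeholder local image
logits = model(**processor(images=image, return_tensors="pt")).logits
print(model.config.id2label[int(logits.argmax(-1))])
```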
RBLNViTForImageClassification
¶
ViT Model transformer with an image classification head on top (a linear layer on top of the final hidden state of the [CLS] token) e.g. for ImageNet.
This model inherits from [RBLNModel
]. Check the superclass documentation for the generic methods the library implements for all its models.
Functions¶
from_pretrained(model_id, export=False, rbln_npu=None, rbln_device=0, rbln_create_runtimes=None, rbln_batch_size=1, rbln_image_size=None)
classmethod
¶The from_pretrained() function is utilized in its standard form, as in the HuggingFace transformers library. Users can use this function to load a pre-trained model from the library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `Union[str, Path]` | The model id of the pre-trained model to be loaded. It can be downloaded from the HuggingFace model hub or a local path, or a model id of a compiled model using the RBLN Compiler. | required |
`export` | `bool` | A boolean flag to indicate if the model should be exported (compiled) to an RBLN model. | `False` |
`rbln_npu` | `Optional[str]` | The name of the NPU to be used. If not specified, the NPU installed on the host machine is used. If no NPU is installed on the host machine, an error occurs. | `None` |
`rbln_device` | `Optional[int]` | The device to be used at runtime. If not specified, device 0 is used. | `0` |
`rbln_create_runtimes` | `Optional[bool]` | A flag to indicate whether to create runtime objects. If False, the runtime does not load the model onto the NPU. This option is particularly useful when you want to perform compilation only on a host machine without an NPU. | `None` |
`rbln_batch_size` | `Optional[int]` | The batch size of the model. | `1` |
`rbln_image_size` | `Optional[Union[int, List[int]]]` | The size of the image. | `None` |
save_pretrained(save_directory)
¶Saves a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained] class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`save_directory` | `Union[str, PathLike]` | The directory to save the model and its configuration files. Will be created if it doesn't exist. | required |