ColQwen2
ColQwen2 is a Vision Language Model (VLM) that uses a novel architecture and training strategy to efficiently index documents from their visual features. RBLN NPUs can accelerate ColQwen2 inference through Optimum RBLN.
API Reference
Classes
RBLNColQwen2ForRetrieval
Bases: RBLNModel
RBLNColQwen2ForRetrieval performs document retrieval with a vision-language model on RBLN devices.
This class inherits from RBLNModel. Check the superclass documentation for the generic methods the library implements for all its models.
It converts a pre-trained transformers ColQwen2ForRetrieval model into a RBLN model by:
- transferring the checkpoint weights of the original model into an optimized RBLN graph,
- compiling the resulting graph using the RBLN compiler.
Examples:
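A minimal usage sketch, built only from the method signatures documented on this page. It assumes the class is importable from `optimum.rbln`, that an RBLN NPU and the RBLN compiler are available, and that `batch_size` is accepted as an `rbln_config` key (it is listed under RBLNColQwen2ForRetrievalConfig). The helper name `compile_and_save` and its arguments are illustrative, not part of the library.

```python
def compile_and_save(model_id, save_directory, batch_size=1):
    """Compile a ColQwen2 checkpoint for RBLN NPUs and save the result.

    Hypothetical helper: `model_id` and `save_directory` are chosen by the caller.
    """
    # Imported lazily so this file can be loaded on machines without optimum-rbln.
    # Assumption: the class is exported from the top-level `optimum.rbln` package.
    from optimum.rbln import RBLNColQwen2ForRetrieval

    model = RBLNColQwen2ForRetrieval.from_pretrained(
        model_id,
        export=True,  # request compilation rather than loading compiled files
        rbln_config={"batch_size": batch_size},  # dictionary form of the RBLN config
    )
    # Persist the compiled model and its configuration for later reuse.
    model.save_pretrained(save_directory)
    return model
```

Passing `export=True` makes the intent explicit; leaving it as `None` lets the library decide based on whether compiled files already exist.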
Functions
from_model(model, config=None, rbln_config=None, model_save_dir=None, subfolder='', **kwargs)
classmethod
Converts and compiles a pre-trained HuggingFace library model into a RBLN model. This method performs the actual model conversion and compilation process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | PreTrainedModel | The PyTorch model to be compiled. The object must be an instance of the HuggingFace transformers PreTrainedModel class. | required |
| config | Optional[PretrainedConfig] | The configuration object associated with the model. | None |
| rbln_config | Optional[Union[RBLNModelConfig, Dict]] | Configuration for RBLN model compilation and runtime. This can be provided as a dictionary or an instance of the model's configuration class (e.g., […] | None |
| kwargs | Any | Additional keyword arguments. Arguments with the prefix […] | {} |
The method performs the following steps:
- Compiles the PyTorch model into an optimized RBLN graph
- Configures the model for the specified NPU device
- Creates the necessary runtime objects if requested
- Saves the compiled model and configurations
Returns:
| Type | Description |
|---|---|
| RBLNModel | A RBLN model instance ready for inference on RBLN NPU devices. |
forward(input_ids=None, inputs_embeds=None, attention_mask=None, pixel_values=None, image_grid_thw=None, output_hidden_states=None, return_dict=None, **kwargs)
Runs a ColQwen2 retrieval forward pass on text tokens and optional visual inputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_ids | LongTensor | Indices of the textual tokens. Mutually exclusive with […] | None |
| inputs_embeds | FloatTensor | Pre-computed embeddings fed directly into the language model. | None |
| attention_mask | Tensor | Mask that selects which token positions contribute to the loss/embeddings. | None |
| pixel_values | Tensor | Flattened image patches produced by […] | None |
| image_grid_thw | LongTensor | Per-image […] | None |
| output_hidden_states | bool | If […] | None |
| return_dict | bool | If […] | None |
| **kwargs | dict[str, Any] | Extra multimodal args forwarded to the wrapped VLM (e.g. […] | {} |
Returns:
| Type | Description |
|---|---|
| Union[Tuple, ColQwen2ForRetrievalOutput] | Dataclass containing the embeddings and hidden states of the VLM model. |
from_pretrained(model_id, export=None, rbln_config=None, **kwargs)
classmethod
from_pretrained() is used the same way as in the HuggingFace transformers library.
Use it to load a pre-trained model from the HuggingFace library and convert it to a RBLN model that runs on RBLN NPUs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_id | Union[str, Path] | The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler. | required |
| export | Optional[bool] | Whether the model should be compiled. If None, this is determined by whether compiled model files exist in model_id. | None |
| rbln_config | Optional[Union[Dict, RBLNModelConfig]] | Configuration for RBLN model compilation and runtime. This can be provided as a dictionary or an instance of the model's configuration class (e.g., […] | None |
| kwargs | Any | Additional keyword arguments. Arguments with the prefix […] | {} |
Returns:
| Type | Description |
|---|---|
| RBLNModel | A RBLN model instance ready for inference on RBLN NPU devices. |
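The `export=None` behaviour described above can be pictured with the following sketch. It is not the library's implementation: it only handles local directories, and the `.rbln` file suffix and the name `resolve_export` are assumptions made for illustration.

```python
from pathlib import Path

# Assumption for illustration only: compiled artifacts are files ending in ".rbln".
COMPILED_SUFFIX = ".rbln"


def resolve_export(model_id, export=None):
    """Return True when the model should be compiled, False when it can be loaded as-is."""
    if export is not None:
        return bool(export)  # an explicit flag always wins
    path = Path(model_id)
    if not path.is_dir():
        # Not a local directory, so this sketch finds no compiled files and compiles.
        return True
    has_compiled = any(path.glob(f"*{COMPILED_SUFFIX}"))
    return not has_compiled
```

The point of the default is convenience: the same `from_pretrained()` call can compile a fresh checkpoint or reload a previously compiled one without the caller tracking which case applies.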
save_pretrained(save_directory, push_to_hub=False, **kwargs)
Saves a model and its configuration file to a directory so that it can be reloaded with the
optimum.rbln.modeling_base.RBLNBaseModel.from_pretrained class method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| save_directory | Union[str, Path] | Directory in which to save the model file. | required |
| push_to_hub | bool | Whether or not to push your model to the HuggingFace model hub after saving it. | False |
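To illustrate the reload path that save_pretrained() enables, here is a short sketch. It assumes the class is importable from `optimum.rbln` and that `save_directory` already holds a model written by save_pretrained(); the helper name `load_compiled` is hypothetical.

```python
def load_compiled(save_directory):
    """Reload a previously compiled and saved model without recompiling it.

    Hypothetical helper: `save_directory` is a directory written by save_pretrained().
    """
    # Imported lazily so this file can be loaded on machines without optimum-rbln.
    # Assumption: the class is exported from the top-level `optimum.rbln` package.
    from optimum.rbln import RBLNColQwen2ForRetrieval

    # export=False asks for the compiled files to be loaded instead of compiling again.
    return RBLNColQwen2ForRetrieval.from_pretrained(save_directory, export=False)
```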
Classes
RBLNColQwen2ForRetrievalConfig
Bases: RBLNDecoderOnlyModelConfig
Configuration class for RBLN ColQwen2 models for document retrieval.
This class extends RBLNModelConfig with specific configurations for ColQwen2 models, including vision tower settings and multi-sequence length support.
Example usage
Functions
__init__(batch_size=None, output_hidden_states=None, vlm=None, **kwargs)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | Optional[int] | The batch size for the model. | None |
| output_hidden_states | Optional[bool] | Whether to output the hidden states of the VLM model. | None |
| vlm | Optional[RBLNModelConfig] | Configuration for the VLM component. | None |
| kwargs | | Additional arguments passed to the parent RBLNModelConfig. | {} |
Raises: ValueError: If batch_size is not a positive integer.
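The check behind this error can be sketched as follows. This is an illustration of the documented rule, not the library's code: the function name `validate_batch_size` is hypothetical, and letting `None` pass through untouched is an assumption about how the default is handled.

```python
def validate_batch_size(batch_size):
    """Return batch_size unchanged, or raise ValueError if it is not a positive integer."""
    if batch_size is None:
        # Assumption: None defers to a library default that this page does not describe.
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(batch_size, bool) or not isinstance(batch_size, int) or batch_size <= 0:
        raise ValueError(f"batch_size must be a positive integer, got {batch_size!r}")
    return batch_size
```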
load(path, **kwargs)
classmethod
Load a RBLNModelConfig from a path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to the RBLNModelConfig file or directory containing the config file. | required |
| kwargs | Any | Additional keyword arguments to override configuration values. Keys starting with 'rbln_' will have the prefix removed and be used to update the configuration. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| RBLNModelConfig | RBLNModelConfig | The loaded configuration instance. |
Note
This method loads the configuration from the specified path and applies any provided overrides. If the loaded configuration class doesn't match the expected class, a warning will be logged.
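The 'rbln_' prefix rule for overrides can be sketched in plain Python. This is not the library's implementation: the function name `apply_overrides` is hypothetical, the configuration is modelled as a plain dictionary, and returning the non-prefixed keys separately is an assumption, since this page does not say what happens to them.

```python
def apply_overrides(config, **kwargs):
    """Return (updated_config, remaining_kwargs) after applying 'rbln_'-prefixed overrides."""
    prefix = "rbln_"
    updated = dict(config)  # copy so the loaded values are not mutated in place
    remaining = {}
    for key, value in kwargs.items():
        if key.startswith(prefix):
            # Strip the prefix and use the rest as the configuration key.
            updated[key[len(prefix):]] = value
        else:
            # Assumption: keys without the prefix are left for the caller to handle.
            remaining[key] = value
    return updated, remaining
```

For example, calling it with `rbln_batch_size=4` on a configuration that holds `batch_size=1` yields a configuration with `batch_size` set to 4.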