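Multimodal Diffusion Transformer (MMDiT) for Stable Diffusion 3
RBLNSD3Transformer2DModel is the RBLN-optimized version of the core transformer block used in Stable Diffusion 3 models.
This model replaces the UNet architecture used in earlier Stable Diffusion versions. It processes latent image representations together with pooled embeddings from multiple text encoders and timestep information to carry out the diffusion process.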
Usage within pipelines
You typically do not interact with RBLNSD3Transformer2DModel directly. Instead, it is loaded and managed automatically as part of RBLN Stable Diffusion 3 pipelines such as RBLNStableDiffusion3Pipeline.
When configuring the RBLN SD3 pipeline, you can pass transformer-specific settings through the transformer argument of the pipeline configuration object.
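As an illustration, the sketch below builds a dictionary-form configuration with a nested transformer entry. It assumes the pipeline accepts a plain dictionary whose transformer key carries the options documented for RBLNSD3Transformer2DModelConfig (batch_size, sample_size, prompt_embed_length). The helper name build_pipeline_rbln_config and the model id in the commented-out call are illustrative and do not come from this page.

```python
def build_pipeline_rbln_config(**transformer_options):
    """Hypothetical helper: wrap transformer-only options in a pipeline-level dict."""
    # Drop options left as None so the library's own defaults would apply.
    transformer_cfg = {k: v for k, v in transformer_options.items() if v is not None}
    return {"transformer": transformer_cfg}


rbln_config = build_pipeline_rbln_config(batch_size=2, prompt_embed_length=None)
print(rbln_config)

# Assumed usage; needs optimum-rbln and an RBLN NPU, so it is left commented out:
# from optimum.rbln import RBLNStableDiffusion3Pipeline
# pipe = RBLNStableDiffusion3Pipeline.from_pretrained(
#     "stabilityai/stable-diffusion-3-medium-diffusers",  # illustrative model id
#     export=True,
#     rbln_config=rbln_config,
# )
```

Keeping the transformer options under their own key leaves settings for the other pipeline components untouched.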
For details on pipeline usage and configuration, including how the guidance scale affects the transformer's batch size, see the Stable Diffusion 3 pipeline documentation.
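To make the batch-size interaction concrete, here is a minimal sketch under the usual classifier-free guidance convention: when guidance_scale is greater than 1.0, the unconditional and conditional inputs are concatenated, so the transformer sees twice the pipeline batch size. This rule is an assumption borrowed from common diffusion pipelines, and the function name transformer_batch_size is hypothetical; the pipeline documentation remains the authority on what the RBLN pipeline actually does.

```python
def transformer_batch_size(pipeline_batch_size: int, guidance_scale: float) -> int:
    """Batch size the transformer would see under the assumed guidance rule."""
    if pipeline_batch_size < 1:
        raise ValueError("pipeline_batch_size must be a positive integer")
    # Assumption: guidance is active only when guidance_scale > 1.0, in which
    # case unconditional and conditional inputs share one doubled batch.
    uses_guidance = guidance_scale > 1.0
    return pipeline_batch_size * 2 if uses_guidance else pipeline_batch_size


print(transformer_batch_size(1, 7.0))
print(transformer_batch_size(1, 1.0))
```

Under this assumption, a transformer compiled with batch_size=2 would match a pipeline that generates one image at a time with guidance enabled.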
API Reference
Classes
RBLNSD3Transformer2DModel
Bases: RBLNModel
RBLN implementation of SD3Transformer2DModel for diffusion models like Stable Diffusion 3.
The SD3Transformer2DModel takes text and/or image embeddings from encoders (like CLIP) and maps them to a shared latent space that guides the diffusion process to generate the desired image.
This class inherits from RBLNModel. Check the superclass documentation for the generic methods the library implements for all its models.
Functions
from_model(model, config=None, rbln_config=None, model_save_dir=None, subfolder='', **kwargs) classmethod
Converts and compiles a pre-trained HuggingFace library model into a RBLN model. This method performs the actual model conversion and compilation process.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | PreTrainedModel | The PyTorch model to be compiled. The object must be an instance of the HuggingFace transformers PreTrainedModel class. | required |
config | Optional[PretrainedConfig] | The configuration object associated with the model. | None |
rbln_config | Optional[Union[RBLNModelConfig, Dict]] | Configuration for RBLN model compilation and runtime. This can be provided as a dictionary or an instance of the model's configuration class (e.g., | None |
kwargs | Any | Additional keyword arguments. Arguments with the prefix | {} |
The method performs the following steps:
- Compiles the PyTorch model into an optimized RBLN graph
- Configures the model for the specified NPU device
- Creates the necessary runtime objects if requested
- Saves the compiled model and configurations
Returns:
Type | Description |
---|---|
RBLNModel | A RBLN model instance ready for inference on RBLN NPU devices. |
from_pretrained(model_id, export=None, rbln_config=None, **kwargs) classmethod
The from_pretrained() function is used in the same way as in the HuggingFace transformers library. Use it to load a pre-trained model from the HuggingFace library and convert it to a RBLN model that runs on RBLN NPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_id | Union[str, Path] | The model id of the pre-trained model to load. It can be a model id on the HuggingFace model hub, a local path, or the model id of a model already compiled with the RBLN Compiler. | required |
export | Optional[bool] | A boolean flag indicating whether the model should be compiled. If None, the choice is made based on whether compiled model files exist in the model_id. | None |
rbln_config | Optional[Union[Dict, RBLNModelConfig]] | Configuration for RBLN model compilation and runtime. This can be provided as a dictionary or an instance of the model's configuration class (e.g., | None |
kwargs | Any | Additional keyword arguments. Arguments with the prefix | {} |
Returns:
Type | Description |
---|---|
RBLNModel | A RBLN model instance ready for inference on RBLN NPU devices. |
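The export=None behavior described in the parameter table can be pictured with the following sketch. It is not the library's implementation: the function name resolve_export and the ".rbln" file suffix are assumptions made for illustration, and the check only looks at local directories, whereas the real method may also inspect repositories on the hub.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union

COMPILED_SUFFIX = ".rbln"  # assumed suffix for compiled model files


def resolve_export(model_id: Union[str, Path], export: Optional[bool] = None) -> bool:
    """Return True when the model should be compiled (hypothetical helper)."""
    if export is not None:
        return export  # an explicit flag always wins
    path = Path(model_id)
    # Compile only if no compiled artifact is found under a local directory.
    has_compiled = path.is_dir() and any(path.rglob(f"*{COMPILED_SUFFIX}"))
    return not has_compiled


with tempfile.TemporaryDirectory() as tmp:
    before = resolve_export(tmp)            # empty directory: compile
    (Path(tmp) / "compiled_model.rbln").touch()
    after = resolve_export(tmp)             # compiled file present: just load
    forced = resolve_export(tmp, export=True)

print(before, after, forced)
```

Passing export explicitly avoids any ambiguity when a directory holds both original weights and compiled artifacts.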
save_pretrained(save_directory, push_to_hub=False, **kwargs)
Saves a model and its configuration file to a directory, so that it can be re-loaded using the optimum.rbln.modeling_base.RBLNBaseModel.from_pretrained class method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
save_directory | Union[str, Path] | Directory where to save the model file. | required |
push_to_hub | bool | Whether or not to push your model to the HuggingFace model hub after saving it. | False |
Classes
RBLNSD3Transformer2DModelConfig
Bases: RBLNModelConfig
Configuration class for RBLN Stable Diffusion 3 Transformer models.
This class inherits from RBLNModelConfig and provides specific configuration options for Transformer models used in diffusion models like Stable Diffusion 3.
Functions
__init__(batch_size=None, sample_size=None, prompt_embed_length=None, **kwargs)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_size | Optional[int] | The batch size for inference. Defaults to 1. | None |
sample_size | Optional[Union[int, Tuple[int, int]]] | The spatial dimensions (height, width) of the generated samples. If an integer is provided, it's used for both height and width. | None |
prompt_embed_length | Optional[int] | The length of the embedded prompt vectors that will be used to condition the transformer model. | None |
kwargs | Any | Additional arguments passed to the parent RBLNModelConfig. | {} |
Raises:
Type | Description |
---|---|
ValueError | If batch_size is not a positive integer. |
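The documented defaults and validation can be mirrored in plain Python as a rough sketch. The function normalize_transformer_options is hypothetical and only restates the rules from the tables above: batch_size falls back to 1, an integer sample_size is used for both height and width, and a non-positive batch_size raises ValueError. How the real class stores these values is not shown on this page.

```python
from typing import Optional, Tuple, Union

SampleSize = Union[int, Tuple[int, int]]


def normalize_transformer_options(
    batch_size: Optional[int] = None,
    sample_size: Optional[SampleSize] = None,
) -> Tuple[int, Optional[Tuple[int, int]]]:
    """Hypothetical mirror of the documented __init__ rules."""
    resolved_batch = 1 if batch_size is None else batch_size
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(resolved_batch, bool) or not isinstance(resolved_batch, int) or resolved_batch < 1:
        raise ValueError(f"batch_size must be a positive integer, got {batch_size!r}")
    if isinstance(sample_size, int):
        # A single integer is used for both height and width.
        sample_size = (sample_size, sample_size)
    return resolved_batch, sample_size


print(normalize_transformer_options(sample_size=128))
print(normalize_transformer_options(batch_size=2, sample_size=(96, 128)))
```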
load(path, **kwargs) classmethod
Load a RBLNModelConfig from a path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | Path to the RBLNModelConfig file or directory containing the config file. | required |
kwargs | Any | Additional keyword arguments to override configuration values. Keys starting with 'rbln_' will have the prefix removed and be used to update the configuration. | {} |
Returns:
Name | Type | Description |
---|---|---|
RBLNModelConfig | RBLNModelConfig | The loaded configuration instance. |
Note
This method loads the configuration from the specified path and applies any provided overrides. If the loaded configuration class doesn't match the expected class, a warning will be logged.
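The override rule for load can be sketched as follows, treating the loaded configuration as a plain dictionary. The function apply_overrides is hypothetical. Stripping the 'rbln_' prefix follows the parameter description above, while applying unprefixed keys unchanged is an assumption of this sketch, since the page does not say how they are handled.

```python
PREFIX = "rbln_"


def apply_overrides(loaded: dict, **kwargs) -> dict:
    """Return a copy of `loaded` updated with keyword overrides (hypothetical helper)."""
    merged = dict(loaded)  # leave the loaded values untouched
    for key, value in kwargs.items():
        # Documented rule: 'rbln_batch_size' updates 'batch_size'.
        # Assumption: keys without the prefix are applied under their own name.
        name = key[len(PREFIX):] if key.startswith(PREFIX) else key
        merged[name] = value
    return merged


loaded = {"batch_size": 1, "sample_size": (128, 128)}
updated = apply_overrides(loaded, rbln_batch_size=2)
print(updated)
```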