Stable Diffusion XL ControlNet
ControlNet can also be applied to the more advanced Stable Diffusion XL (SDXL) model, allowing for high-resolution image generation with precise structural guidance from condition images. Optimum RBLN provides accelerated SDXL ControlNet pipelines for RBLN NPUs.
Supported Pipelines
- Text-to-Image with SDXL ControlNet: Generate high-resolution images from text prompts, guided by a control image using an SDXL base model.
- Image-to-Image with SDXL ControlNet: Modify an existing image based on a text prompt and a control image using an SDXL base model.
Important: Batch Size Configuration for Guidance Scale
Batch Size and Guidance Scale (SDXL)
As with other SDXL pipelines, running the SDXL ControlNet pipelines with `guidance_scale > 1.0` doubles the effective batch size of the UNet and the ControlNet model. Ensure that the `batch_size` specified in the `unet` and `controlnet` sections of your `RBLNStableDiffusionXLControlNetPipelineConfig` matches the expected runtime batch size (typically 2 × the inference batch size if `guidance_scale > 1.0`). If you omit these values, they are doubled automatically based on the pipeline's `batch_size`.
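The doubling rule can be sketched as a small helper that builds an `rbln_config` dictionary. The helper name `build_sdxl_controlnet_rbln_config` is hypothetical, and the nested `unet` and `controlnet` keys holding a `batch_size` entry are an assumption based on the description above, so check them against your installed version of Optimum RBLN.

```python
from typing import Any, Dict


def build_sdxl_controlnet_rbln_config(
    inference_batch_size: int, guidance_scale: float
) -> Dict[str, Any]:
    """Hypothetical helper: derive submodule batch sizes from the guidance scale."""
    if inference_batch_size < 1:
        raise ValueError("inference_batch_size must be a positive integer")

    # With guidance_scale > 1.0, classifier-free guidance is active, so the
    # UNet and the ControlNet process twice the inference batch.
    uses_cfg = guidance_scale > 1.0
    runtime_batch_size = inference_batch_size * 2 if uses_cfg else inference_batch_size

    # Assumed layout: pipeline-level keys plus per-submodule sections.
    return {
        "batch_size": inference_batch_size,
        "guidance_scale": guidance_scale,
        "unet": {"batch_size": runtime_batch_size},
        "controlnet": {"batch_size": runtime_batch_size},
    }


config = build_sdxl_controlnet_rbln_config(inference_batch_size=1, guidance_scale=5.0)
print(config["unet"]["batch_size"], config["controlnet"]["batch_size"])  # prints: 2 2
```

Setting the submodule values explicitly, as here, makes the compiled batch size visible in the configuration instead of relying on the automatic doubling.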
API Reference
Classes
RBLNStableDiffusionXLControlNetPipeline
Bases: `RBLNDiffusionMixin`, `StableDiffusionXLControlNetPipeline`
RBLN-accelerated implementation of Stable Diffusion XL pipeline with ControlNet for high-resolution guided text-to-image generation.
This pipeline compiles Stable Diffusion XL and ControlNet models to run efficiently on RBLN NPUs, enabling high-performance inference for generating high-quality images with precise structural control and enhanced detail preservation.
Functions
`from_pretrained(model_id, *, export=None, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs)` (classmethod)
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `str` | The model ID or path to the pretrained model to load. Can be either: | required |
| `export` | `bool` | If True, takes a PyTorch model from | `None` |
| `model_save_dir` | `Optional[PathLike]` | Directory to save the compiled model artifacts. Only used when | `None` |
| `rbln_config` | `Dict[str, Any]` | Configuration options for RBLN compilation. Can include settings for specific submodules such as | `{}` |
| `lora_ids` | `Optional[Union[str, List[str]]]` | LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when | `None` |
| `lora_weights_names` | `Optional[Union[str, List[str]]]` | Names of specific LoRA weight files to load, corresponding to `lora_ids`. Only used when | `None` |
| `lora_scales` | `Optional[Union[float, List[float]]]` | Scaling factor(s) to apply to the LoRA adapter(s). Only used when | `None` |
| `kwargs` | `Any` | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | `{}` |
Returns:
| Type | Description |
|---|---|
| `RBLNDiffusionMixin` | A compiled or loaded diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from `RBLNDiffusionMixin`. |
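As a rough illustration of the signature above, the following sketch collects keyword arguments for `from_pretrained` and forwards them to the pipeline class. The helper names, the configuration values, and the way the ControlNet model is supplied (forwarded through `**kwargs` as `controlnet`) are assumptions made for illustration. The import is deferred into the function because it requires the `optimum-rbln` package and an RBLN NPU environment.

```python
from typing import Any, Dict


def make_from_pretrained_kwargs(save_dir: str) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments matching the documented signature."""
    return {
        "export": True,              # request compilation rather than loading
        "model_save_dir": save_dir,  # directory for the compiled artifacts
        "rbln_config": {             # illustrative values only
            "batch_size": 1,
            "img_height": 1024,
            "img_width": 1024,
            "guidance_scale": 5.0,
        },
    }


def compile_pipeline(model_id: str, controlnet: Any, save_dir: str):
    """Sketch only: needs optimum-rbln installed and an RBLN NPU available."""
    # Deferred import so the helper above can be used without the SDK.
    from optimum.rbln import RBLNStableDiffusionXLControlNetPipeline

    # Assumption: the ControlNet model prepared by the caller is forwarded
    # through **kwargs to the underlying pipeline constructor.
    return RBLNStableDiffusionXLControlNetPipeline.from_pretrained(
        model_id, controlnet=controlnet, **make_from_pretrained_kwargs(save_dir)
    )
```

Keeping the arguments in a plain dictionary makes it straightforward to reuse the same settings for the image-to-image pipeline class.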
RBLNStableDiffusionXLControlNetImg2ImgPipeline
Bases: `RBLNDiffusionMixin`, `StableDiffusionXLControlNetImg2ImgPipeline`
RBLN-accelerated implementation of Stable Diffusion XL pipeline with ControlNet for high-resolution guided image-to-image generation.
This pipeline compiles Stable Diffusion XL and ControlNet models to run efficiently on RBLN NPUs, enabling high-performance inference for transforming input images with precise structural control and enhanced quality preservation.
Functions
`from_pretrained(model_id, *, export=None, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs)` (classmethod)
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `str` | The model ID or path to the pretrained model to load. Can be either: | required |
| `export` | `bool` | If True, takes a PyTorch model from | `None` |
| `model_save_dir` | `Optional[PathLike]` | Directory to save the compiled model artifacts. Only used when | `None` |
| `rbln_config` | `Dict[str, Any]` | Configuration options for RBLN compilation. Can include settings for specific submodules such as | `{}` |
| `lora_ids` | `Optional[Union[str, List[str]]]` | LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when | `None` |
| `lora_weights_names` | `Optional[Union[str, List[str]]]` | Names of specific LoRA weight files to load, corresponding to `lora_ids`. Only used when | `None` |
| `lora_scales` | `Optional[Union[float, List[float]]]` | Scaling factor(s) to apply to the LoRA adapter(s). Only used when | `None` |
| `kwargs` | `Any` | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | `{}` |
Returns:
| Type | Description |
|---|---|
| `RBLNDiffusionMixin` | A compiled or loaded diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from `RBLNDiffusionMixin`. |
RBLNStableDiffusionXLControlNetPipelineBaseConfig
Bases: `RBLNModelConfig`
Base configuration for Stable Diffusion XL ControlNet pipelines.
Functions
`__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)`
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `text_encoder` | `Optional[RBLNCLIPTextModelConfig]` | Configuration for the primary text encoder. Initialized as `RBLNCLIPTextModelConfig` if not provided. | `None` |
| `text_encoder_2` | `Optional[RBLNCLIPTextModelWithProjectionConfig]` | Configuration for the secondary text encoder. Initialized as `RBLNCLIPTextModelWithProjectionConfig` if not provided. | `None` |
| `unet` | `Optional[RBLNUNet2DConditionModelConfig]` | Configuration for the UNet model component. Initialized as `RBLNUNet2DConditionModelConfig` if not provided. | `None` |
| `vae` | `Optional[RBLNAutoencoderKLConfig]` | Configuration for the VAE model component. Initialized as `RBLNAutoencoderKLConfig` if not provided. | `None` |
| `controlnet` | `Optional[RBLNControlNetModelConfig]` | Configuration for the ControlNet model component. Initialized as `RBLNControlNetModelConfig` if not provided. | `None` |
| `batch_size` | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
| `img_height` | `Optional[int]` | Height of the generated images. | `None` |
| `img_width` | `Optional[int]` | Width of the generated images. | `None` |
| `height` | `Optional[int]` | Height of the generated images. | `None` |
| `width` | `Optional[int]` | Width of the generated images. | `None` |
| `sample_size` | `Optional[Tuple[int, int]]` | Spatial dimensions for the UNet model. | `None` |
| `image_size` | `Optional[Tuple[int, int]]` | Alternative way to specify image dimensions. Cannot be used together with `img_height`/`img_width`. | `None` |
| `guidance_scale` | `Optional[float]` | Scale for classifier-free guidance. | `None` |
| `kwargs` | `Any` | Additional arguments passed to the parent `RBLNModelConfig`. | `{}` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If both `image_size` and `img_height`/`img_width` are provided. |
Note
When guidance_scale > 1.0, the UNet batch size is automatically doubled to accommodate classifier-free guidance.
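The mutual-exclusion rule between `image_size` and `img_height`/`img_width` can be illustrated with a short sketch. The function `resolve_image_size` is hypothetical and is not the library's implementation. The `(height, width)` ordering of the tuple and the requirement that `img_height` and `img_width` be given together are assumptions made for this example.

```python
from typing import Optional, Tuple


def resolve_image_size(
    image_size: Optional[Tuple[int, int]] = None,
    img_height: Optional[int] = None,
    img_width: Optional[int] = None,
) -> Optional[Tuple[int, int]]:
    """Hypothetical illustration of the documented ValueError condition."""
    has_pair = img_height is not None or img_width is not None

    # Documented rule: the two ways of giving dimensions cannot be combined.
    if image_size is not None and has_pair:
        raise ValueError("Provide either image_size or img_height/img_width, not both.")

    if image_size is not None:
        return image_size
    if img_height is not None and img_width is not None:
        # Assumption: the tuple is ordered (height, width).
        return (img_height, img_width)
    if has_pair:
        # Assumption: a lone height or width is treated as an error here.
        raise ValueError("img_height and img_width must be given together.")
    return None  # nothing specified; defaults would apply
```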
`load(path, **kwargs)` (classmethod)
Load a RBLNModelConfig from a path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the `RBLNModelConfig` file or directory containing the config file. | required |
| `kwargs` | `Any` | Additional keyword arguments to override configuration values. Keys starting with 'rbln_' will have the prefix removed and be used to update the configuration. | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `RBLNModelConfig` | `RBLNModelConfig` | The loaded configuration instance. |
Note
This method loads the configuration from the specified path and applies any provided overrides. If the loaded configuration class doesn't match the expected class, a warning will be logged.
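The prefix handling described for `kwargs` can be sketched in plain Python. The function `normalize_override_keys` is hypothetical and only illustrates stripping the leading 'rbln_' from override keys. Passing unprefixed keys through unchanged is an assumption of this sketch, since the text above does not say how they are treated.

```python
from typing import Any, Dict


def normalize_override_keys(overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical sketch: strip a leading 'rbln_' from override keys."""
    prefix = "rbln_"
    normalized: Dict[str, Any] = {}
    for key, value in overrides.items():
        # Only a leading prefix is removed. Unprefixed keys are kept as-is,
        # which is an assumption of this example.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        normalized[new_key] = value
    return normalized


print(normalize_override_keys({"rbln_batch_size": 2, "rbln_guidance_scale": 5.0}))
# prints: {'batch_size': 2, 'guidance_scale': 5.0}
```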
RBLNStableDiffusionXLControlNetPipelineConfig
Bases: `RBLNStableDiffusionXLControlNetPipelineBaseConfig`
Configuration for Stable Diffusion XL ControlNet pipeline.
Functions
`__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)`
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `text_encoder` | `Optional[RBLNCLIPTextModelConfig]` | Configuration for the primary text encoder. Initialized as `RBLNCLIPTextModelConfig` if not provided. | `None` |
| `text_encoder_2` | `Optional[RBLNCLIPTextModelWithProjectionConfig]` | Configuration for the secondary text encoder. Initialized as `RBLNCLIPTextModelWithProjectionConfig` if not provided. | `None` |
| `unet` | `Optional[RBLNUNet2DConditionModelConfig]` | Configuration for the UNet model component. Initialized as `RBLNUNet2DConditionModelConfig` if not provided. | `None` |
| `vae` | `Optional[RBLNAutoencoderKLConfig]` | Configuration for the VAE model component. Initialized as `RBLNAutoencoderKLConfig` if not provided. | `None` |
| `controlnet` | `Optional[RBLNControlNetModelConfig]` | Configuration for the ControlNet model component. Initialized as `RBLNControlNetModelConfig` if not provided. | `None` |
| `batch_size` | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
| `img_height` | `Optional[int]` | Height of the generated images. | `None` |
| `img_width` | `Optional[int]` | Width of the generated images. | `None` |
| `height` | `Optional[int]` | Height of the generated images. | `None` |
| `width` | `Optional[int]` | Width of the generated images. | `None` |
| `sample_size` | `Optional[Tuple[int, int]]` | Spatial dimensions for the UNet model. | `None` |
| `image_size` | `Optional[Tuple[int, int]]` | Alternative way to specify image dimensions. Cannot be used together with `img_height`/`img_width`. | `None` |
| `guidance_scale` | `Optional[float]` | Scale for classifier-free guidance. | `None` |
| `kwargs` | `Any` | Additional arguments passed to the parent `RBLNModelConfig`. | `{}` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If both `image_size` and `img_height`/`img_width` are provided. |
Note
When guidance_scale > 1.0, the UNet batch size is automatically doubled to accommodate classifier-free guidance.
`load(path, **kwargs)` (classmethod)
Load a RBLNModelConfig from a path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the `RBLNModelConfig` file or directory containing the config file. | required |
| `kwargs` | `Any` | Additional keyword arguments to override configuration values. Keys starting with 'rbln_' will have the prefix removed and be used to update the configuration. | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `RBLNModelConfig` | `RBLNModelConfig` | The loaded configuration instance. |
Note
This method loads the configuration from the specified path and applies any provided overrides. If the loaded configuration class doesn't match the expected class, a warning will be logged.
RBLNStableDiffusionXLControlNetImg2ImgPipelineConfig
Bases: `RBLNStableDiffusionXLControlNetPipelineBaseConfig`
Configuration for Stable Diffusion XL ControlNet image-to-image pipeline.
Functions
`__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)`
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `text_encoder` | `Optional[RBLNCLIPTextModelConfig]` | Configuration for the primary text encoder. Initialized as `RBLNCLIPTextModelConfig` if not provided. | `None` |
| `text_encoder_2` | `Optional[RBLNCLIPTextModelWithProjectionConfig]` | Configuration for the secondary text encoder. Initialized as `RBLNCLIPTextModelWithProjectionConfig` if not provided. | `None` |
| `unet` | `Optional[RBLNUNet2DConditionModelConfig]` | Configuration for the UNet model component. Initialized as `RBLNUNet2DConditionModelConfig` if not provided. | `None` |
| `vae` | `Optional[RBLNAutoencoderKLConfig]` | Configuration for the VAE model component. Initialized as `RBLNAutoencoderKLConfig` if not provided. | `None` |
| `controlnet` | `Optional[RBLNControlNetModelConfig]` | Configuration for the ControlNet model component. Initialized as `RBLNControlNetModelConfig` if not provided. | `None` |
| `batch_size` | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
| `img_height` | `Optional[int]` | Height of the generated images. | `None` |
| `img_width` | `Optional[int]` | Width of the generated images. | `None` |
| `height` | `Optional[int]` | Height of the generated images. | `None` |
| `width` | `Optional[int]` | Width of the generated images. | `None` |
| `sample_size` | `Optional[Tuple[int, int]]` | Spatial dimensions for the UNet model. | `None` |
| `image_size` | `Optional[Tuple[int, int]]` | Alternative way to specify image dimensions. Cannot be used together with `img_height`/`img_width`. | `None` |
| `guidance_scale` | `Optional[float]` | Scale for classifier-free guidance. | `None` |
| `kwargs` | `Any` | Additional arguments passed to the parent `RBLNModelConfig`. | `{}` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If both `image_size` and `img_height`/`img_width` are provided. |
Note
When guidance_scale > 1.0, the UNet batch size is automatically doubled to accommodate classifier-free guidance.
`load(path, **kwargs)` (classmethod)
Load a RBLNModelConfig from a path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the `RBLNModelConfig` file or directory containing the config file. | required |
| `kwargs` | `Any` | Additional keyword arguments to override configuration values. Keys starting with 'rbln_' will have the prefix removed and be used to update the configuration. | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `RBLNModelConfig` | `RBLNModelConfig` | The loaded configuration instance. |
Note
This method loads the configuration from the specified path and applies any provided overrides. If the loaded configuration class doesn't match the expected class, a warning will be logged.