Stable Diffusion XL ControlNet¶
ControlNet can also be applied to the more advanced Stable Diffusion XL (SDXL) model, allowing for high-resolution image generation with precise structural guidance from condition images. Optimum RBLN provides accelerated SDXL ControlNet pipelines for RBLN NPUs.
Supported Pipelines¶
- Text-to-Image with SDXL ControlNet: Generate high-resolution images from text prompts, guided by a control image using an SDXL base model.
- Image-to-Image with SDXL ControlNet: Modify an existing image based on a text prompt and a control image using an SDXL base model.
Key Classes¶
RBLNStableDiffusionXLControlNetPipeline
: Text-to-image pipeline for SDXL with ControlNet guidance.

RBLNStableDiffusionXLControlNetPipelineConfig
: Configuration for the text-to-image SDXL ControlNet pipeline.

RBLNStableDiffusionXLControlNetImg2ImgPipeline
: Image-to-image pipeline for SDXL with ControlNet guidance.

RBLNStableDiffusionXLControlNetImg2ImgPipelineConfig
: Configuration for the image-to-image SDXL ControlNet pipeline.

RBLNControlNetModel
: The RBLN-optimized ControlNet model (compatible with both SD 1.5 and SDXL).
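The end-to-end flow is sketched below: compile the SDXL base model together with a Canny ControlNet, then generate an image guided by an edge map. This is a minimal, illustrative sketch only; the checkpoint IDs, the reference image path, and passing the `controlnet` module through `**kwargs` at export time follow common diffusers conventions rather than anything stated in this reference, so consult the official Optimum RBLN examples for verified scripts.

```python
# Hedged sketch: compile an SDXL ControlNet text-to-image pipeline for RBLN NPUs
# and generate an image guided by a Canny edge map. Checkpoint IDs, the input
# image path, and the controlnet= passthrough kwarg are assumptions, not
# verified parts of the Optimum RBLN API surface.
import numpy as np
import cv2
from PIL import Image
from diffusers import ControlNetModel
from optimum.rbln import RBLNStableDiffusionXLControlNetPipeline

# Build a Canny edge control image from any reference picture (path is illustrative).
ref = np.array(Image.open("reference.png").convert("RGB").resize((1024, 1024)))
edges = cv2.Canny(ref, 100, 200)
canny_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

# Compile the SDXL base model together with a Canny ControlNet (export=True).
pipe = RBLNStableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    export=True,
    controlnet=ControlNetModel.from_pretrained("diffusers/controlnet-canny-sdxl-1.0"),
    rbln_config={"img_height": 1024, "img_width": 1024, "guidance_scale": 7.5},
    model_save_dir="sdxl_controlnet_canny",  # illustrative directory
)

# Run inference on the RBLN NPU.
image = pipe(
    prompt="a futuristic glass building, golden hour",
    image=canny_image,
    controlnet_conditioning_scale=0.5,
).images[0]
image.save("result.png")
```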
Important: Batch Size Configuration for Guidance Scale¶
Batch Size and Guidance Scale (SDXL)
As with other SDXL pipelines, using ControlNet SDXL pipelines with `guidance_scale > 1.0` doubles the effective batch size of the UNet and the ControlNet model. Ensure the `batch_size` specified in the `unet` and `controlnet` sections of your `RBLNStableDiffusionXLControlNetPipelineConfig` matches the expected runtime batch size (typically 2 × the inference batch size if `guidance_scale > 1.0`). If these are omitted, the batch sizes are doubled automatically from the pipeline's `batch_size`, as shown in the sketch below.
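For example, to run single-prompt inference with classifier-free guidance enabled, the `unet` and `controlnet` sections can be given an explicit `batch_size` of 2. A minimal sketch using the dict form of `rbln_config`; the checkpoint ID is illustrative, and nesting plain dicts for submodule settings is an assumption consistent with the parameter descriptions below.

```python
from optimum.rbln import RBLNStableDiffusionXLControlNetPipeline

# Inference batch size 1 with guidance_scale > 1.0 at runtime:
# compile the UNet and ControlNet with batch_size=2 (2 x the inference batch size).
pipe = RBLNStableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # illustrative checkpoint ID
    export=True,
    rbln_config={
        "batch_size": 1,
        "guidance_scale": 7.5,
        "unet": {"batch_size": 2},
        "controlnet": {"batch_size": 2},
    },
)
```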
API Reference¶
Classes¶
RBLNStableDiffusionXLControlNetPipeline¶
Bases: RBLNDiffusionMixin, StableDiffusionXLControlNetPipeline
Functions¶
from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs) classmethod¶
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
This method has two distinct operating modes:
- When `export=True`: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model
- When `export=False`: Loads an already compiled RBLN model from `model_id` without recompilation
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `str` | The model ID or path to the pretrained model to load. Can be either a model ID on the Hugging Face Hub or a local path to a saved model directory. | required |
`export` | `bool` | If `True`, takes a PyTorch model from `model_id` and compiles it for RBLN NPUs; if `False`, loads an already compiled RBLN model from `model_id`. | `False` |
`model_save_dir` | `Optional[PathLike]` | Directory to save the compiled model artifacts. Only used when `export=True`. | `None` |
`rbln_config` | `Dict[str, Any]` | Configuration options for RBLN compilation. Can include settings for specific submodules such as `unet`, `vae`, `controlnet`, and the text encoders. | `{}` |
`lora_ids` | `Optional[Union[str, List[str]]]` | LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when `export=True`. | `None` |
`lora_weights_names` | `Optional[Union[str, List[str]]]` | Names of specific LoRA weight files to load, corresponding to `lora_ids`. Only used when `export=True`. | `None` |
`lora_scales` | `Optional[Union[float, List[float]]]` | Scaling factor(s) to apply to the LoRA adapter(s). Only used when `export=True`. | `None` |
`**kwargs` | `Dict[str, Any]` | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | `{}` |
Returns:
Type | Description |
---|---|
`Self` | A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin. |
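A brief sketch of the two operating modes, using an illustrative checkpoint ID and save directory; in practice a `controlnet` submodule is also supplied when exporting (see the example under Key Classes above).

```python
from optimum.rbln import RBLNStableDiffusionXLControlNetPipeline

# Mode 1: export=True -- compile the PyTorch pipeline for RBLN NPUs and save the artifacts.
pipe = RBLNStableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # illustrative checkpoint ID
    export=True,
    rbln_config={"img_height": 1024, "img_width": 1024},
    model_save_dir="sdxl_controlnet_compiled",   # illustrative directory
)

# Mode 2: export=False (default) -- reload the already compiled model without recompilation.
pipe = RBLNStableDiffusionXLControlNetPipeline.from_pretrained(
    "sdxl_controlnet_compiled",
    export=False,
)
```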
RBLNStableDiffusionXLControlNetImg2ImgPipeline¶
Bases: RBLNDiffusionMixin, StableDiffusionXLControlNetImg2ImgPipeline
Functions¶
from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs) classmethod¶
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
This method has two distinct operating modes:
- When `export=True`: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model
- When `export=False`: Loads an already compiled RBLN model from `model_id` without recompilation
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_id` | `str` | The model ID or path to the pretrained model to load. Can be either a model ID on the Hugging Face Hub or a local path to a saved model directory. | required |
`export` | `bool` | If `True`, takes a PyTorch model from `model_id` and compiles it for RBLN NPUs; if `False`, loads an already compiled RBLN model from `model_id`. | `False` |
`model_save_dir` | `Optional[PathLike]` | Directory to save the compiled model artifacts. Only used when `export=True`. | `None` |
`rbln_config` | `Dict[str, Any]` | Configuration options for RBLN compilation. Can include settings for specific submodules such as `unet`, `vae`, `controlnet`, and the text encoders. | `{}` |
`lora_ids` | `Optional[Union[str, List[str]]]` | LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when `export=True`. | `None` |
`lora_weights_names` | `Optional[Union[str, List[str]]]` | Names of specific LoRA weight files to load, corresponding to `lora_ids`. Only used when `export=True`. | `None` |
`lora_scales` | `Optional[Union[float, List[float]]]` | Scaling factor(s) to apply to the LoRA adapter(s). Only used when `export=True`. | `None` |
`**kwargs` | `Dict[str, Any]` | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | `{}` |
Returns:
Type | Description |
---|---|
`Self` | A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin. |
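A minimal image-to-image sketch, assuming a previously compiled pipeline directory and the diffusers calling convention in which `image` is the picture to modify and `control_image` carries the structural condition; file and directory names are illustrative.

```python
from PIL import Image
from optimum.rbln import RBLNStableDiffusionXLControlNetImg2ImgPipeline

# Load a previously compiled image-to-image SDXL ControlNet pipeline (directory is illustrative).
pipe = RBLNStableDiffusionXLControlNetImg2ImgPipeline.from_pretrained(
    "sdxl_controlnet_img2img_compiled",
)

init_image = Image.open("photo.png").convert("RGB").resize((1024, 1024))
control_image = Image.open("canny_edges.png").convert("RGB").resize((1024, 1024))

result = pipe(
    prompt="a watercolor painting of the same scene",
    image=init_image,             # image to modify
    control_image=control_image,  # structural guidance
    strength=0.7,
    controlnet_conditioning_scale=0.5,
).images[0]
result.save("img2img_result.png")
```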
Classes¶
RBLNStableDiffusionXLControlNetPipelineBaseConfig¶
Bases: RBLNModelConfig
Functions¶
__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`text_encoder` | `Optional[RBLNCLIPTextModelConfig]` | Configuration for the primary text encoder. | `None` |
`text_encoder_2` | `Optional[RBLNCLIPTextModelWithProjectionConfig]` | Configuration for the secondary text encoder. | `None` |
`unet` | `Optional[RBLNUNet2DConditionModelConfig]` | Configuration for the UNet model component. | `None` |
`vae` | `Optional[RBLNAutoencoderKLConfig]` | Configuration for the VAE model component. | `None` |
`controlnet` | `Optional[RBLNControlNetModelConfig]` | Configuration for the ControlNet model component. | `None` |
`batch_size` | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
`img_height` | `Optional[int]` | Height of the generated images. | `None` |
`img_width` | `Optional[int]` | Width of the generated images. | `None` |
`sample_size` | `Optional[Tuple[int, int]]` | Spatial dimensions for the UNet model. | `None` |
`image_size` | `Optional[Tuple[int, int]]` | Alternative way to specify image dimensions. | `None` |
`guidance_scale` | `Optional[float]` | Scale for classifier-free guidance. | `None` |
`**kwargs` | `Dict[str, Any]` | Additional arguments. | `{}` |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
RBLNStableDiffusionXLControlNetPipelineConfig¶
Bases: RBLNStableDiffusionXLControlNetPipelineBaseConfig
Functions¶
__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`text_encoder` | `Optional[RBLNCLIPTextModelConfig]` | Configuration for the primary text encoder. | `None` |
`text_encoder_2` | `Optional[RBLNCLIPTextModelWithProjectionConfig]` | Configuration for the secondary text encoder. | `None` |
`unet` | `Optional[RBLNUNet2DConditionModelConfig]` | Configuration for the UNet model component. | `None` |
`vae` | `Optional[RBLNAutoencoderKLConfig]` | Configuration for the VAE model component. | `None` |
`controlnet` | `Optional[RBLNControlNetModelConfig]` | Configuration for the ControlNet model component. | `None` |
`batch_size` | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
`img_height` | `Optional[int]` | Height of the generated images. | `None` |
`img_width` | `Optional[int]` | Width of the generated images. | `None` |
`sample_size` | `Optional[Tuple[int, int]]` | Spatial dimensions for the UNet model. | `None` |
`image_size` | `Optional[Tuple[int, int]]` | Alternative way to specify image dimensions. | `None` |
`guidance_scale` | `Optional[float]` | Scale for classifier-free guidance. | `None` |
`**kwargs` | `Dict[str, Any]` | Additional arguments. | `{}` |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
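A sketch of declaring compile-time settings through the config class. Passing the config object directly as `rbln_config` is an assumption here (the documented dict form is the safe alternative), and the checkpoint ID is illustrative.

```python
from optimum.rbln import (
    RBLNStableDiffusionXLControlNetPipeline,
    RBLNStableDiffusionXLControlNetPipelineConfig,
)

# With guidance_scale > 1.0 and no explicit unet/controlnet batch_size,
# those batch sizes are doubled automatically at compile time.
config = RBLNStableDiffusionXLControlNetPipelineConfig(
    batch_size=1,
    img_height=1024,
    img_width=1024,
    guidance_scale=7.5,
)

pipe = RBLNStableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # illustrative checkpoint ID
    export=True,
    rbln_config=config,  # assumed: a config object is accepted in place of a dict
)
```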
RBLNStableDiffusionXLControlNetImg2ImgPipelineConfig¶
Bases: RBLNStableDiffusionXLControlNetPipelineBaseConfig
Functions¶
__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`text_encoder` | `Optional[RBLNCLIPTextModelConfig]` | Configuration for the primary text encoder. | `None` |
`text_encoder_2` | `Optional[RBLNCLIPTextModelWithProjectionConfig]` | Configuration for the secondary text encoder. | `None` |
`unet` | `Optional[RBLNUNet2DConditionModelConfig]` | Configuration for the UNet model component. | `None` |
`vae` | `Optional[RBLNAutoencoderKLConfig]` | Configuration for the VAE model component. | `None` |
`controlnet` | `Optional[RBLNControlNetModelConfig]` | Configuration for the ControlNet model component. | `None` |
`batch_size` | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
`img_height` | `Optional[int]` | Height of the generated images. | `None` |
`img_width` | `Optional[int]` | Width of the generated images. | `None` |
`sample_size` | `Optional[Tuple[int, int]]` | Spatial dimensions for the UNet model. | `None` |
`image_size` | `Optional[Tuple[int, int]]` | Alternative way to specify image dimensions. | `None` |
`guidance_scale` | `Optional[float]` | Scale for classifier-free guidance. | `None` |
`**kwargs` | `Dict[str, Any]` | Additional arguments. | `{}` |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
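A short sketch of the image-to-image variant. With `guidance_scale` at 1.0 (classifier-free guidance effectively disabled), no batch-size doubling applies, so the UNet and ControlNet batch sizes simply follow `batch_size`.

```python
from optimum.rbln import RBLNStableDiffusionXLControlNetImg2ImgPipelineConfig

# guidance_scale <= 1.0: UNet and ControlNet keep the pipeline batch size (no doubling).
config = RBLNStableDiffusionXLControlNetImg2ImgPipelineConfig(
    batch_size=2,
    img_height=768,
    img_width=768,
    guidance_scale=1.0,
)
```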