Stable Diffusion ControlNet¶

ControlNet adds conditional control to standard Stable Diffusion models (such as v1.5), so that image generation can be guided by auxiliary inputs such as Canny edges, depth maps, or human pose estimates. Optimum RBLN provides accelerated pipelines for these models on RBLN NPUs.

Supported Pipelines¶

Text-to-Image with ControlNet: Generate images from text prompts, guided by a control image.
Image-to-Image with ControlNet: Modify an existing image based on a text prompt and a control image.

Key Classes¶

RBLNStableDiffusionControlNetPipeline: Text-to-image pipeline with ControlNet guidance.
RBLNStableDiffusionControlNetPipelineConfig: Configuration for the text-to-image ControlNet pipeline.
RBLNStableDiffusionControlNetImg2ImgPipeline: Image-to-image pipeline with ControlNet guidance.
RBLNStableDiffusionControlNetImg2ImgPipelineConfig: Configuration for the image-to-image ControlNet pipeline.
RBLNControlNetModel: The RBLN-optimized ControlNet model itself.

Important: Batch Size Configuration for Guidance Scale¶

Batch Size and Guidance Scale

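As with standard Stable Diffusion, running a ControlNet pipeline with guidance_scale > 1.0 (the default is 7.5) doubles the effective batch size of both the UNet and the ControlNet model at runtime, because classifier-free guidance evaluates the conditional and unconditional predictions in the same batch.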

Ensure the batch_size specified in the unet and controlnet sections of your RBLNStableDiffusionControlNetPipelineConfig matches the expected runtime batch size (typically 2 × the inference batch size if guidance_scale > 1.0).

Default Behavior¶

If you don't explicitly set batch sizes for unet and controlnet in the configuration, Optimum RBLN assumes guidance_scale > 1.0 and automatically sets their batch sizes to 2 × the pipeline's batch_size.
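The doubling rule can be sketched in plain Python. This is only an illustration of the arithmetic described above, not the library's implementation: the helper names `runtime_batch_size` and `build_rbln_config` are hypothetical, and the nested dictionary layout (a top-level `batch_size` plus `unet` and `controlnet` sections) is an assumption based on the parameter descriptions in the API reference.

```python
from typing import Any, Dict


def runtime_batch_size(batch_size: int, guidance_scale: float = 7.5) -> int:
    """Return the batch size the UNet and ControlNet see at runtime.

    Hypothetical helper: it mirrors the documented rule, nothing more.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    # With guidance_scale > 1.0 the conditional and unconditional
    # predictions are evaluated together, so the batch is doubled.
    return 2 * batch_size if guidance_scale > 1.0 else batch_size


def build_rbln_config(batch_size: int, guidance_scale: float = 7.5) -> Dict[str, Any]:
    """Build a config dict with explicit submodule batch sizes.

    Assumed layout: a pipeline-level batch_size plus per-submodule sections.
    """
    submodule_batch = runtime_batch_size(batch_size, guidance_scale)
    return {
        "batch_size": batch_size,
        "unet": {"batch_size": submodule_batch},
        "controlnet": {"batch_size": submodule_batch},
    }


print(build_rbln_config(1))
# {'batch_size': 1, 'unet': {'batch_size': 2}, 'controlnet': {'batch_size': 2}}
```

Setting the submodule batch sizes explicitly is mainly useful when you plan to run with guidance_scale <= 1.0, where the automatic 2 × default would not match the runtime batch.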

API Reference¶

Classes¶

`RBLNStableDiffusionControlNetPipeline` ¶

Bases: RBLNDiffusionMixin, StableDiffusionControlNetPipeline

Functions¶

`from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs)` `classmethod` ¶

Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.

This method has two distinct operating modes:

When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model
When export=False: Loads an already compiled RBLN model from model_id without recompilation

It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.

Parameters:

Name	Type	Description	Default
`model_id`	`str`	The model ID or path of the pretrained model to load: either a model ID from the Hugging Face Hub or a local path to a saved model directory.	required
`export`	`bool`	If True, takes a PyTorch model from `model_id` and compiles it for RBLN NPU execution. If False, loads an already compiled RBLN model from `model_id` without recompilation.	`False`
`model_save_dir`	`Optional[PathLike]`	Directory to save the compiled model artifacts. Only used when `export=True`. If not provided and `export=True`, a temporary directory is used.	`None`
`rbln_config`	`Dict[str, Any]`	Configuration options for RBLN compilation. Can include settings for specific submodules such as `text_encoder`, `unet`, and `vae`. Configuration can be tailored to the specific pipeline being compiled.	`{}`
`lora_ids`	`Optional[Union[str, List[str]]]`	LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when `export=True`.	`None`
`lora_weights_names`	`Optional[Union[str, List[str]]]`	Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when `export=True`.	`None`
`lora_scales`	`Optional[Union[float, List[float]]]`	Scaling factor(s) to apply to the LoRA adapter(s). Only used when `export=True`.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used.	`{}`

Returns:

Type	Description
`Self`	A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin.
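The two operating modes might be used as in the sketch below. It relies only on the `from_pretrained` parameters listed above; passing a `controlnet` object as a keyword argument is an assumption (it would be forwarded through `**kwargs`), and the function names and any model IDs or paths you supply are placeholders. Imports are done inside the functions so the snippet can be loaded on a machine without the RBLN SDK.

```python
from typing import Any, Optional


def compile_controlnet_pipeline(
    model_id: str,
    controlnet_id: str,
    save_dir: Optional[str] = None,
) -> Any:
    """export=True: compile a PyTorch checkpoint for RBLN NPUs, then load it."""
    # Lazy imports: these packages are only needed when the function runs.
    from diffusers import ControlNetModel
    from optimum.rbln import RBLNStableDiffusionControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(controlnet_id)
    return RBLNStableDiffusionControlNetPipeline.from_pretrained(
        model_id,
        controlnet=controlnet,    # assumption: forwarded via **kwargs
        export=True,              # compile for the NPU
        model_save_dir=save_dir,  # None means a temporary directory is used
    )


def load_compiled_pipeline(compiled_dir: str) -> Any:
    """export=False: load an already compiled model without recompiling."""
    from optimum.rbln import RBLNStableDiffusionControlNetPipeline

    return RBLNStableDiffusionControlNetPipeline.from_pretrained(
        compiled_dir,
        export=False,
    )
```

Compiling once with a persistent `model_save_dir` and loading from that directory afterwards avoids repeating the compilation step on every run.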

`RBLNStableDiffusionControlNetImg2ImgPipeline` ¶

Bases: RBLNDiffusionMixin, StableDiffusionControlNetImg2ImgPipeline

Functions¶

`from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs)` `classmethod` ¶

Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.

This method has two distinct operating modes:

When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model
When export=False: Loads an already compiled RBLN model from model_id without recompilation

It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.

Parameters:

Name	Type	Description	Default
`model_id`	`str`	The model ID or path of the pretrained model to load: either a model ID from the Hugging Face Hub or a local path to a saved model directory.	required
`export`	`bool`	If True, takes a PyTorch model from `model_id` and compiles it for RBLN NPU execution. If False, loads an already compiled RBLN model from `model_id` without recompilation.	`False`
`model_save_dir`	`Optional[PathLike]`	Directory to save the compiled model artifacts. Only used when `export=True`. If not provided and `export=True`, a temporary directory is used.	`None`
`rbln_config`	`Dict[str, Any]`	Configuration options for RBLN compilation. Can include settings for specific submodules such as `text_encoder`, `unet`, and `vae`. Configuration can be tailored to the specific pipeline being compiled.	`{}`
`lora_ids`	`Optional[Union[str, List[str]]]`	LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when `export=True`.	`None`
`lora_weights_names`	`Optional[Union[str, List[str]]]`	Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when `export=True`.	`None`
`lora_scales`	`Optional[Union[float, List[float]]]`	Scaling factor(s) to apply to the LoRA adapter(s). Only used when `export=True`.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used.	`{}`

Returns:

Type	Description
`Self`	A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin.

Classes¶

`RBLNStableDiffusionControlNetPipelineBaseConfig` ¶

Bases: RBLNModelConfig

Functions¶

`__init__(text_encoder=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)` ¶

Parameters:

Name	Type	Description	Default
`text_encoder`	`Optional[RBLNCLIPTextModelConfig]`	Configuration for the text encoder component.	`None`
`unet`	`Optional[RBLNUNet2DConditionModelConfig]`	Configuration for the UNet model component.	`None`
`vae`	`Optional[RBLNAutoencoderKLConfig]`	Configuration for the VAE model component.	`None`
`controlnet`	`Optional[RBLNControlNetModelConfig]`	Configuration for the ControlNet model component.	`None`
`batch_size`	`Optional[int]`	Batch size for inference, applied to all submodules.	`None`
`img_height`	`Optional[int]`	Height of the generated images.	`None`
`img_width`	`Optional[int]`	Width of the generated images.	`None`
`height`	`Optional[int]`	Height of the generated images.	`None`
`width`	`Optional[int]`	Width of the generated images.	`None`
`sample_size`	`Optional[Tuple[int, int]]`	Spatial dimensions for the UNet model.	`None`
`image_size`	`Optional[Tuple[int, int]]`	Alternative way to specify image dimensions.	`None`
`guidance_scale`	`Optional[float]`	Scale for classifier-free guidance.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments.	`{}`

Note

Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
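The image dimensions and the UNet `sample_size` in the table above are related through the VAE. A minimal sketch of that relationship, assuming the 8× spatial downsampling of the Stable Diffusion v1.5 VAE; the helper `latent_sample_size` is hypothetical, and whether Optimum RBLN derives `sample_size` exactly this way is an assumption:

```python
from typing import Tuple


def latent_sample_size(
    height: int, width: int, vae_scale_factor: int = 8
) -> Tuple[int, int]:
    """Map image dimensions in pixels to UNet latent dimensions.

    The default factor of 8 matches the Stable Diffusion v1.5 VAE.
    """
    if height % vae_scale_factor or width % vae_scale_factor:
        raise ValueError(
            f"height and width must be multiples of {vae_scale_factor}"
        )
    return (height // vae_scale_factor, width // vae_scale_factor)


print(latent_sample_size(512, 512))  # (64, 64)
print(latent_sample_size(512, 768))  # (64, 96)
```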

`RBLNStableDiffusionControlNetPipelineConfig` ¶

Bases: RBLNStableDiffusionControlNetPipelineBaseConfig

Functions¶

`__init__(text_encoder=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)` ¶

Parameters:

Name	Type	Description	Default
`text_encoder`	`Optional[RBLNCLIPTextModelConfig]`	Configuration for the text encoder component.	`None`
`unet`	`Optional[RBLNUNet2DConditionModelConfig]`	Configuration for the UNet model component.	`None`
`vae`	`Optional[RBLNAutoencoderKLConfig]`	Configuration for the VAE model component.	`None`
`controlnet`	`Optional[RBLNControlNetModelConfig]`	Configuration for the ControlNet model component.	`None`
`batch_size`	`Optional[int]`	Batch size for inference, applied to all submodules.	`None`
`img_height`	`Optional[int]`	Height of the generated images.	`None`
`img_width`	`Optional[int]`	Width of the generated images.	`None`
`height`	`Optional[int]`	Height of the generated images.	`None`
`width`	`Optional[int]`	Width of the generated images.	`None`
`sample_size`	`Optional[Tuple[int, int]]`	Spatial dimensions for the UNet model.	`None`
`image_size`	`Optional[Tuple[int, int]]`	Alternative way to specify image dimensions.	`None`
`guidance_scale`	`Optional[float]`	Scale for classifier-free guidance.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments.	`{}`

Note

Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.

`RBLNStableDiffusionControlNetImg2ImgPipelineConfig` ¶

Bases: RBLNStableDiffusionControlNetPipelineBaseConfig

Functions¶

`__init__(text_encoder=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)` ¶

Parameters:

Name	Type	Description	Default
`text_encoder`	`Optional[RBLNCLIPTextModelConfig]`	Configuration for the text encoder component.	`None`
`unet`	`Optional[RBLNUNet2DConditionModelConfig]`	Configuration for the UNet model component.	`None`
`vae`	`Optional[RBLNAutoencoderKLConfig]`	Configuration for the VAE model component.	`None`
`controlnet`	`Optional[RBLNControlNetModelConfig]`	Configuration for the ControlNet model component.	`None`
`batch_size`	`Optional[int]`	Batch size for inference, applied to all submodules.	`None`
`img_height`	`Optional[int]`	Height of the generated images.	`None`
`img_width`	`Optional[int]`	Width of the generated images.	`None`
`height`	`Optional[int]`	Height of the generated images.	`None`
`width`	`Optional[int]`	Width of the generated images.	`None`
`sample_size`	`Optional[Tuple[int, int]]`	Spatial dimensions for the UNet model.	`None`
`image_size`	`Optional[Tuple[int, int]]`	Alternative way to specify image dimensions.	`None`
`guidance_scale`	`Optional[float]`	Scale for classifier-free guidance.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments.	`{}`

Note

Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
