Stable Diffusion ControlNet
ControlNet integrates conditional control into standard Stable Diffusion models (like v1.5), enabling image generation guided by auxiliary inputs such as canny edges, depth maps, or human pose estimations. Optimum RBLN provides accelerated pipelines for these models on RBLN NPUs.
Supported Pipelines
- Text-to-Image with ControlNet: Generate images from text prompts, guided by a control image.
- Image-to-Image with ControlNet: Modify an existing image based on a text prompt and a control image.
Key Classes
- RBLNStableDiffusionControlNetPipeline: Text-to-image pipeline with ControlNet guidance.
- RBLNStableDiffusionControlNetPipelineConfig: Configuration for the text-to-image ControlNet pipeline.
- RBLNStableDiffusionControlNetImg2ImgPipeline: Image-to-image pipeline with ControlNet guidance.
- RBLNStableDiffusionControlNetImg2ImgPipelineConfig: Configuration for the image-to-image ControlNet pipeline.
- RBLNControlNetModel: The RBLN-optimized ControlNet model itself.
Important: Batch Size Configuration for Guidance Scale
Batch Size and Guidance Scale
As with standard Stable Diffusion, running a ControlNet pipeline with a guidance scale > 1.0 (the default is 7.5) doubles the effective batch size of both the UNet and the ControlNet model at runtime.
Ensure the batch_size specified in the unet and controlnet sections of your RBLNStableDiffusionControlNetPipelineConfig matches the expected runtime batch size (typically 2 × the inference batch size if guidance_scale > 1.0).
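The doubling rule above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names `runtime_batch_size` and `controlnet_rbln_config` are hypothetical, and the nested dictionary layout is an assumption that simply mirrors the `unet`, `controlnet`, `batch_size`, and `guidance_scale` names used on this page.

```python
from typing import Any, Dict


def runtime_batch_size(inference_batch_size: int, guidance_scale: float = 7.5) -> int:
    """Return the batch size the UNet and ControlNet see at runtime (hypothetical helper)."""
    if inference_batch_size < 1:
        raise ValueError("inference_batch_size must be a positive integer")
    # With guidance_scale > 1.0, classifier-free guidance batches the conditional
    # and unconditional passes together, doubling the effective batch size.
    if guidance_scale > 1.0:
        return 2 * inference_batch_size
    return inference_batch_size


def controlnet_rbln_config(inference_batch_size: int, guidance_scale: float = 7.5) -> Dict[str, Any]:
    """Build a config dictionary with matching UNet and ControlNet batch sizes.

    The key layout is an assumption based on the parameter names documented here.
    """
    submodule_batch = runtime_batch_size(inference_batch_size, guidance_scale)
    return {
        "batch_size": inference_batch_size,
        "guidance_scale": guidance_scale,
        "unet": {"batch_size": submodule_batch},
        "controlnet": {"batch_size": submodule_batch},
    }


rbln_config = controlnet_rbln_config(inference_batch_size=1, guidance_scale=7.5)
```

Keeping the two submodule values derived from one function avoids the case where the UNet and the ControlNet are compiled for different batch sizes.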
Default Behavior
If you don't explicitly set batch sizes for unet and controlnet in the configuration, Optimum RBLN assumes guidance_scale > 1.0 and automatically sets their batch sizes to 2 × the pipeline's batch_size.
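As a rough sketch of this fallback, the function below resolves the two submodule batch sizes from an optional explicit value. The name `resolve_submodule_batch_sizes` is hypothetical and the logic only restates the rule described above; it is not taken from the Optimum RBLN source.

```python
from typing import Optional, Tuple


def resolve_submodule_batch_sizes(
    pipeline_batch_size: int,
    unet_batch_size: Optional[int] = None,
    controlnet_batch_size: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (unet_batch_size, controlnet_batch_size) after applying defaults."""
    # Unset values fall back to 2 x the pipeline batch size, which corresponds
    # to the guidance_scale > 1.0 assumption.
    default = 2 * pipeline_batch_size
    unet = default if unet_batch_size is None else unet_batch_size
    controlnet = default if controlnet_batch_size is None else controlnet_batch_size
    return unet, controlnet


# Nothing set explicitly: both submodules get twice the pipeline batch size.
defaults = resolve_submodule_batch_sizes(pipeline_batch_size=1)

# Both set explicitly: the explicit values are kept unchanged.
explicit = resolve_submodule_batch_sizes(1, unet_batch_size=1, controlnet_batch_size=1)
```

Under this reading, a run with guidance_scale of 1.0 or lower would need both values set explicitly to the inference batch size, since the default assumes doubling.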
API Reference
Classes
RBLNStableDiffusionControlNetPipeline
Bases: RBLNDiffusionMixin, StableDiffusionControlNetPipeline
Functions
from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs) classmethod
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
This method has two distinct operating modes:
- When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model.
- When export=False: Loads an already compiled RBLN model from model_id without recompilation.
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_id | str | The model ID or path to the pretrained model to load. Can be either: | required |
export | bool | If True, takes a PyTorch model from | False |
model_save_dir | Optional[PathLike] | Directory to save the compiled model artifacts. Only used when | None |
rbln_config | Dict[str, Any] | Configuration options for RBLN compilation. Can include settings for specific submodules such as | {} |
lora_ids | Optional[Union[str, List[str]]] | LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when | None |
lora_weights_names | Optional[Union[str, List[str]]] | Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when | None |
lora_scales | Optional[Union[float, List[float]]] | Scaling factor(s) to apply to the LoRA adapter(s). Only used when | None |
**kwargs | Dict[str, Any] | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | {} |
Returns:
Type | Description |
---|---|
Self | A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin. |
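The two operating modes of from_pretrained can be illustrated with a thin wrapper. This is a hedged sketch: `load_controlnet_pipeline` is a hypothetical helper, the import path `optimum.rbln` is assumed, the paths in the usage comments are placeholders, and how the ControlNet weights themselves are supplied is not covered here. Only the keyword arguments listed in the signature above are used.

```python
from typing import Any, Dict, Optional


def load_controlnet_pipeline(
    model_id: str,
    export: bool = False,
    model_save_dir: Optional[str] = None,
    rbln_config: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
):
    """Call from_pretrained in either compile mode or load mode (hypothetical helper)."""
    # Deferred import (assumed package path): optimum-rbln is only required
    # when this function is actually called.
    from optimum.rbln import RBLNStableDiffusionControlNetPipeline

    return RBLNStableDiffusionControlNetPipeline.from_pretrained(
        model_id,
        export=export,                  # True: compile; False: load compiled model
        model_save_dir=model_save_dir,  # where compiled artifacts are written
        rbln_config=rbln_config or {},  # avoid sharing a mutable default dict
        **kwargs,                       # forwarded to the underlying pipeline
    )


# Hypothetical usage; both paths are placeholders, not real checkpoints:
# compiled = load_controlnet_pipeline(
#     "path/to/pytorch-checkpoint", export=True, model_save_dir="path/to/compiled"
# )
# reloaded = load_controlnet_pipeline("path/to/compiled")  # export defaults to False
```

Passing a fresh dictionary instead of relying on the `rbln_config={}` default is a defensive choice on the caller's side, since a shared mutable default could otherwise leak settings between calls.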
RBLNStableDiffusionControlNetImg2ImgPipeline
Bases: RBLNDiffusionMixin, StableDiffusionControlNetImg2ImgPipeline
Functions
from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs) classmethod
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
This method has two distinct operating modes:
- When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model.
- When export=False: Loads an already compiled RBLN model from model_id without recompilation.
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_id | str | The model ID or path to the pretrained model to load. Can be either: | required |
export | bool | If True, takes a PyTorch model from | False |
model_save_dir | Optional[PathLike] | Directory to save the compiled model artifacts. Only used when | None |
rbln_config | Dict[str, Any] | Configuration options for RBLN compilation. Can include settings for specific submodules such as | {} |
lora_ids | Optional[Union[str, List[str]]] | LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when | None |
lora_weights_names | Optional[Union[str, List[str]]] | Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when | None |
lora_scales | Optional[Union[float, List[float]]] | Scaling factor(s) to apply to the LoRA adapter(s). Only used when | None |
**kwargs | Dict[str, Any] | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | {} |
Returns:
Type | Description |
---|---|
Self | A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin. |
Classes
RBLNStableDiffusionControlNetPipelineBaseConfig
Bases: RBLNModelConfig
Functions
__init__(text_encoder=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | Optional[RBLNCLIPTextModelConfig] | Configuration for the text encoder component. | None |
unet | Optional[RBLNUNet2DConditionModelConfig] | Configuration for the UNet model component. | None |
vae | Optional[RBLNAutoencoderKLConfig] | Configuration for the VAE model component. | None |
controlnet | Optional[RBLNControlNetModelConfig] | Configuration for the ControlNet model component. | None |
batch_size | Optional[int] | Batch size for inference, applied to all submodules. | None |
img_height | Optional[int] | Height of the generated images. | None |
img_width | Optional[int] | Width of the generated images. | None |
sample_size | Optional[Tuple[int, int]] | Spatial dimensions for the UNet model. | None |
image_size | Optional[Tuple[int, int]] | Alternative way to specify image dimensions. | None |
guidance_scale | Optional[float] | Scale for classifier-free guidance. | None |
**kwargs | Dict[str, Any] | Additional arguments. | {} |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
RBLNStableDiffusionControlNetPipelineConfig
Bases: RBLNStableDiffusionControlNetPipelineBaseConfig
Functions
__init__(text_encoder=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | Optional[RBLNCLIPTextModelConfig] | Configuration for the text encoder component. | None |
unet | Optional[RBLNUNet2DConditionModelConfig] | Configuration for the UNet model component. | None |
vae | Optional[RBLNAutoencoderKLConfig] | Configuration for the VAE model component. | None |
controlnet | Optional[RBLNControlNetModelConfig] | Configuration for the ControlNet model component. | None |
batch_size | Optional[int] | Batch size for inference, applied to all submodules. | None |
img_height | Optional[int] | Height of the generated images. | None |
img_width | Optional[int] | Width of the generated images. | None |
sample_size | Optional[Tuple[int, int]] | Spatial dimensions for the UNet model. | None |
image_size | Optional[Tuple[int, int]] | Alternative way to specify image dimensions. | None |
guidance_scale | Optional[float] | Scale for classifier-free guidance. | None |
**kwargs | Dict[str, Any] | Additional arguments. | {} |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
RBLNStableDiffusionControlNetImg2ImgPipelineConfig
Bases: RBLNStableDiffusionControlNetPipelineBaseConfig
Functions
__init__(text_encoder=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | Optional[RBLNCLIPTextModelConfig] | Configuration for the text encoder component. | None |
unet | Optional[RBLNUNet2DConditionModelConfig] | Configuration for the UNet model component. | None |
vae | Optional[RBLNAutoencoderKLConfig] | Configuration for the VAE model component. | None |
controlnet | Optional[RBLNControlNetModelConfig] | Configuration for the ControlNet model component. | None |
batch_size | Optional[int] | Batch size for inference, applied to all submodules. | None |
img_height | Optional[int] | Height of the generated images. | None |
img_width | Optional[int] | Width of the generated images. | None |
sample_size | Optional[Tuple[int, int]] | Spatial dimensions for the UNet model. | None |
image_size | Optional[Tuple[int, int]] | Alternative way to specify image dimensions. | None |
guidance_scale | Optional[float] | Scale for classifier-free guidance. | None |
**kwargs | Dict[str, Any] | Additional arguments. | {} |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.