Stable Diffusion XL ControlNet¶
ControlNet can also be applied to the more advanced Stable Diffusion XL (SDXL) model, where precise structural guidance taken from a conditioning image enables high-resolution image generation. Optimum RBLN provides accelerated SDXL ControlNet pipelines for RBLN NPUs.
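Supported Pipelines¶
- Text-to-image with SDXL ControlNet: generates high-resolution images from a text prompt, guided by a control image, using an SDXL-based model.
- Image-to-image with SDXL ControlNet: modifies an existing image based on a text prompt and a control image, using an SDXL-based model.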
Key Classes¶
- RBLNStableDiffusionXLControlNetPipeline: Text-to-image pipeline for SDXL with ControlNet guidance.
- RBLNStableDiffusionXLControlNetPipelineConfig: Configuration for the text-to-image SDXL ControlNet pipeline.
- RBLNStableDiffusionXLControlNetImg2ImgPipeline: Image-to-image pipeline for SDXL with ControlNet guidance.
- RBLNStableDiffusionXLControlNetImg2ImgPipelineConfig: Configuration for the image-to-image SDXL ControlNet pipeline.
- RBLNControlNetModel: RBLN-optimized ControlNet model (compatible with both SD 1.5 and SDXL).
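Important: Batch Size Settings Depending on Guidance Scale¶
Batch Size and Guidance Scale (SDXL)
As with other SDXL pipelines, running the ControlNet SDXL pipeline with guidance_scale > 1.0 doubles the effective batch size of the UNet and ControlNet models.
Make sure that the batch_size specified in the unet and controlnet sections of RBLNStableDiffusionXLControlNetPipelineConfig matches the expected runtime batch size (typically twice the inference batch size when guidance_scale > 1.0). If you omit these values, they are automatically doubled based on the pipeline's batch_size.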
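The doubling rule can be sketched in plain Python. The helper names below (effective_batch_size, build_rbln_config) are hypothetical, and the nested dictionary layout is an assumption based on the unet and controlnet sections mentioned in this section, not a confirmed schema.

```python
from typing import Any, Dict


def effective_batch_size(batch_size: int, guidance_scale: float) -> int:
    """Return the batch size the UNet and ControlNet see at run time."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    # With guidance_scale > 1.0 the conditional and unconditional inputs
    # are processed together, so the effective batch size doubles.
    return batch_size * 2 if guidance_scale > 1.0 else batch_size


def build_rbln_config(batch_size: int, guidance_scale: float) -> Dict[str, Any]:
    """Build a config dict with explicit UNet/ControlNet batch sizes.

    The key layout is an assumption made for illustration.
    """
    runtime_batch = effective_batch_size(batch_size, guidance_scale)
    return {
        "batch_size": batch_size,
        "guidance_scale": guidance_scale,
        "unet": {"batch_size": runtime_batch},
        "controlnet": {"batch_size": runtime_batch},
    }


if __name__ == "__main__":
    print(build_rbln_config(batch_size=1, guidance_scale=5.0))
```

Setting the two submodule values explicitly is optional. The sketch only makes visible the values the pipeline is described as deriving on its own when they are omitted.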
API Reference¶
Classes¶
RBLNStableDiffusionXLControlNetPipeline¶
Bases: RBLNDiffusionMixin, StableDiffusionXLControlNetPipeline
Functions¶
from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs) classmethod¶
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
This method has two distinct operating modes:
- When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model.
- When export=False: Loads an already compiled RBLN model from model_id without recompilation.
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
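The two operating modes can be sketched as a small helper that picks the keyword arguments for from_pretrained. This is only an illustration under stated assumptions: the helper names, the "non-empty directory means already compiled" check, the placeholder checkpoint ID, and the optimum.rbln import path are not taken from this reference.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict


def from_pretrained_kwargs(source_id: str, compiled_dir: str) -> Dict[str, Any]:
    """Choose between the export=True and export=False modes."""
    path = Path(compiled_dir)
    if path.is_dir() and any(path.iterdir()):
        # Assumption: a non-empty directory holds previously compiled artifacts.
        return {"model_id": str(path), "export": False}
    # Nothing compiled yet: compile from the PyTorch checkpoint and save the result.
    return {"model_id": source_id, "export": True, "model_save_dir": str(path)}


def load_pipeline(source_id: str, compiled_dir: str, **extra: Any):
    """Load or compile the text-to-image pipeline (requires optimum-rbln)."""
    # Imported lazily so the helper above works without optimum-rbln installed.
    # The import path is an assumption.
    from optimum.rbln import RBLNStableDiffusionXLControlNetPipeline

    kwargs = from_pretrained_kwargs(source_id, compiled_dir)
    model_id = kwargs.pop("model_id")
    return RBLNStableDiffusionXLControlNetPipeline.from_pretrained(
        model_id, **kwargs, **extra
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Empty directory: the helper selects the compile-and-save mode.
        print(from_pretrained_kwargs("example/sdxl-checkpoint", tmp))
```

Keeping the decision in a pure function makes it easy to check without NPU hardware; only load_pipeline touches the library.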
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_id | str | The model ID or path to the pretrained model to load. Can be either: | required |
export | bool | If True, takes a PyTorch model from | False |
model_save_dir | Optional[PathLike] | Directory to save the compiled model artifacts. Only used when | None |
rbln_config | Dict[str, Any] | Configuration options for RBLN compilation. Can include settings for specific submodules such as | {} |
lora_ids | Optional[Union[str, List[str]]] | LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when | None |
lora_weights_names | Optional[Union[str, List[str]]] | Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when | None |
lora_scales | Optional[Union[float, List[float]]] | Scaling factor(s) to apply to the LoRA adapter(s). Only used when | None |
**kwargs | Dict[str, Any] | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | {} |
Returns:
Type | Description |
---|---|
Self | A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin. |
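Because lora_ids, lora_weights_names, and lora_scales each accept either a single value or a list, a pre-flight check can catch mismatched lengths before a long compilation starts. The sketch below is a hypothetical helper, not part of the library. The one-to-one pairing of weight names with adapter IDs follows the parameter description; applying the same rule to scales is an assumption, and the library itself may be more permissive.

```python
from typing import Any, Dict, List, Optional, Sequence, Union

StrOrList = Union[str, Sequence[str]]
FloatOrList = Union[float, Sequence[float]]


def _as_list(value: Any) -> Optional[List[Any]]:
    """Wrap a scalar in a list; pass None through unchanged."""
    if value is None:
        return None
    if isinstance(value, (str, int, float)):
        return [value]
    return list(value)


def check_lora_args(
    lora_ids: Optional[StrOrList],
    lora_weights_names: Optional[StrOrList] = None,
    lora_scales: Optional[FloatOrList] = None,
) -> Dict[str, List[Any]]:
    """Normalize LoRA arguments to lists and check that they line up."""
    ids = _as_list(lora_ids) or []
    normalized: Dict[str, List[Any]] = {"lora_ids": ids} if ids else {}
    optional = {
        "lora_weights_names": _as_list(lora_weights_names),
        "lora_scales": _as_list(lora_scales),
    }
    for name, values in optional.items():
        if values is None:
            continue
        # Each weight name and scale is expected to pair with one adapter ID.
        if len(values) != len(ids):
            raise ValueError(
                f"{name} has {len(values)} entries but lora_ids has {len(ids)}"
            )
        normalized[name] = values
    return normalized
```

The returned dictionary contains only the arguments that were actually provided, so it can be unpacked into a from_pretrained call alongside export=True.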
RBLNStableDiffusionXLControlNetImg2ImgPipeline¶
Bases: RBLNDiffusionMixin, StableDiffusionXLControlNetImg2ImgPipeline
Functions¶
from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs) classmethod¶
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
This method has two distinct operating modes:
- When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model.
- When export=False: Loads an already compiled RBLN model from model_id without recompilation.
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_id | str | The model ID or path to the pretrained model to load. Can be either: | required |
export | bool | If True, takes a PyTorch model from | False |
model_save_dir | Optional[PathLike] | Directory to save the compiled model artifacts. Only used when | None |
rbln_config | Dict[str, Any] | Configuration options for RBLN compilation. Can include settings for specific submodules such as | {} |
lora_ids | Optional[Union[str, List[str]]] | LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when | None |
lora_weights_names | Optional[Union[str, List[str]]] | Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when | None |
lora_scales | Optional[Union[float, List[float]]] | Scaling factor(s) to apply to the LoRA adapter(s). Only used when | None |
**kwargs | Dict[str, Any] | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | {} |
Returns:
Type | Description |
---|---|
Self | A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin. |
Classes¶
RBLNStableDiffusionXLControlNetPipelineBaseConfig¶
Bases: RBLNModelConfig
Functions¶
__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | Optional[RBLNCLIPTextModelConfig] | Configuration for the primary text encoder. | None |
text_encoder_2 | Optional[RBLNCLIPTextModelWithProjectionConfig] | Configuration for the secondary text encoder. | None |
unet | Optional[RBLNUNet2DConditionModelConfig] | Configuration for the UNet model component. | None |
vae | Optional[RBLNAutoencoderKLConfig] | Configuration for the VAE model component. | None |
controlnet | Optional[RBLNControlNetModelConfig] | Configuration for the ControlNet model component. | None |
batch_size | Optional[int] | Batch size for inference, applied to all submodules. | None |
img_height | Optional[int] | Height of the generated images. | None |
img_width | Optional[int] | Width of the generated images. | None |
sample_size | Optional[Tuple[int, int]] | Spatial dimensions for the UNet model. | None |
image_size | Optional[Tuple[int, int]] | Alternative way to specify image dimensions. | None |
guidance_scale | Optional[float] | Scale for classifier-free guidance. | None |
**kwargs | Dict[str, Any] | Additional arguments. | {} |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
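The size-related parameters overlap: img_height and img_width describe the generated image, image_size is an alternative spelling of the same thing, and sample_size describes the UNet's spatial dimensions. The sketch below shows one way these could relate. Everything in it is an assumption made for illustration: the helper names, the (height, width) ordering, rejecting conflicting inputs, the VAE downsampling factor of 8, and the reading of sample_size as a latent-space size.

```python
from typing import Optional, Tuple

VAE_SCALE_FACTOR = 8  # assumption: usual downsampling factor of the SDXL VAE


def resolve_image_size(
    img_height: Optional[int] = None,
    img_width: Optional[int] = None,
    image_size: Optional[Tuple[int, int]] = None,
) -> Optional[Tuple[int, int]]:
    """Return (height, width) from whichever spelling was used, or None."""
    if image_size is not None:
        if img_height is not None or img_width is not None:
            raise ValueError("pass image_size or img_height/img_width, not both")
        height, width = image_size
        return (height, width)
    if img_height is None and img_width is None:
        return None
    if img_height is None or img_width is None:
        raise ValueError("img_height and img_width must be given together")
    return (img_height, img_width)


def latent_sample_size(image_size: Tuple[int, int]) -> Tuple[int, int]:
    """Map an image size to the latent size under the assumed scale factor."""
    height, width = image_size
    if height % VAE_SCALE_FACTOR or width % VAE_SCALE_FACTOR:
        raise ValueError("image dimensions must be multiples of 8")
    return (height // VAE_SCALE_FACTOR, width // VAE_SCALE_FACTOR)


if __name__ == "__main__":
    size = resolve_image_size(img_height=1024, img_width=1024)
    print(size, latent_sample_size(size))
```

Under these assumptions, a 1024 x 1024 image corresponds to a 128 x 128 latent, since 1024 / 8 = 128.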
RBLNStableDiffusionXLControlNetPipelineConfig¶
Bases: RBLNStableDiffusionXLControlNetPipelineBaseConfig
Functions¶
__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | Optional[RBLNCLIPTextModelConfig] | Configuration for the primary text encoder. | None |
text_encoder_2 | Optional[RBLNCLIPTextModelWithProjectionConfig] | Configuration for the secondary text encoder. | None |
unet | Optional[RBLNUNet2DConditionModelConfig] | Configuration for the UNet model component. | None |
vae | Optional[RBLNAutoencoderKLConfig] | Configuration for the VAE model component. | None |
controlnet | Optional[RBLNControlNetModelConfig] | Configuration for the ControlNet model component. | None |
batch_size | Optional[int] | Batch size for inference, applied to all submodules. | None |
img_height | Optional[int] | Height of the generated images. | None |
img_width | Optional[int] | Width of the generated images. | None |
sample_size | Optional[Tuple[int, int]] | Spatial dimensions for the UNet model. | None |
image_size | Optional[Tuple[int, int]] | Alternative way to specify image dimensions. | None |
guidance_scale | Optional[float] | Scale for classifier-free guidance. | None |
**kwargs | Dict[str, Any] | Additional arguments. | {} |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.
RBLNStableDiffusionXLControlNetImg2ImgPipelineConfig¶
Bases: RBLNStableDiffusionXLControlNetPipelineBaseConfig
Functions¶
__init__(text_encoder=None, text_encoder_2=None, unet=None, vae=None, controlnet=None, *, batch_size=None, img_height=None, img_width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | Optional[RBLNCLIPTextModelConfig] | Configuration for the primary text encoder. | None |
text_encoder_2 | Optional[RBLNCLIPTextModelWithProjectionConfig] | Configuration for the secondary text encoder. | None |
unet | Optional[RBLNUNet2DConditionModelConfig] | Configuration for the UNet model component. | None |
vae | Optional[RBLNAutoencoderKLConfig] | Configuration for the VAE model component. | None |
controlnet | Optional[RBLNControlNetModelConfig] | Configuration for the ControlNet model component. | None |
batch_size | Optional[int] | Batch size for inference, applied to all submodules. | None |
img_height | Optional[int] | Height of the generated images. | None |
img_width | Optional[int] | Width of the generated images. | None |
sample_size | Optional[Tuple[int, int]] | Spatial dimensions for the UNet model. | None |
image_size | Optional[Tuple[int, int]] | Alternative way to specify image dimensions. | None |
guidance_scale | Optional[float] | Scale for classifier-free guidance. | None |
**kwargs | Dict[str, Any] | Additional arguments. | {} |
Note
Guidance scale affects UNet and ControlNet batch sizes. If guidance_scale > 1.0, their batch sizes are doubled.