Stable Diffusion¶

Stable Diffusion is a text-to-image latent diffusion model that generates images from text prompts. Optimum RBLN lets you accelerate Stable Diffusion pipelines on RBLN NPUs.

Supported Pipelines¶

Optimum RBLN supports several Stable Diffusion pipelines:

Text-to-Image: generates an image from a text prompt
Image-to-Image: modifies an existing image based on a text prompt
Inpainting: fills in the masked regions of an image according to a text prompt

Key Classes¶

RBLNStableDiffusionPipeline: Text-to-image pipeline for Stable Diffusion
RBLNStableDiffusionPipelineConfig: Configuration for the text-to-image pipeline
RBLNStableDiffusionImg2ImgPipeline: Image-to-image pipeline for Stable Diffusion
RBLNStableDiffusionImg2ImgPipelineConfig: Configuration for the image-to-image pipeline
RBLNStableDiffusionInpaintPipeline: Inpainting pipeline for Stable Diffusion
RBLNStableDiffusionInpaintPipelineConfig: Configuration for the inpainting pipeline

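Important: Batch Size Configuration for Guidance Scale¶

Batch Size and Guidance Scale

When you run Stable Diffusion with guidance scale > 1.0 (the default is 7.5), classifier-free guidance doubles the effective batch size of the UNet at runtime.

RBLN NPUs use static graph compilation, so the UNet batch size set at compile time must match the batch size used at runtime. Otherwise, inference fails with an error.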
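
The doubling rule described above can be written as a small helper. This is a minimal sketch and not part of Optimum RBLN: the function name `required_unet_batch_size` is hypothetical, and it assumes classifier-free guidance is active exactly when guidance scale > 1.0, as stated above.

```python
def required_unet_batch_size(pipeline_batch_size: int, guidance_scale: float = 7.5) -> int:
    """Return the UNet batch size to compile for (hypothetical helper)."""
    if pipeline_batch_size < 1:
        raise ValueError("pipeline_batch_size must be a positive integer")
    # Assumption: classifier-free guidance is active when guidance_scale > 1.0.
    # In that case the conditional and unconditional branches are batched
    # together, so the UNet sees twice as many samples.
    uses_cfg = guidance_scale > 1.0
    return pipeline_batch_size * 2 if uses_cfg else pipeline_batch_size


if __name__ == "__main__":
    print(required_unet_batch_size(2))       # default guidance scale
    print(required_unet_batch_size(1, 1.0))  # guidance disabled
```

The returned value is what you would pass as `unet=dict(batch_size=...)` when building the pipeline configuration.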

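Default Behavior¶

If you do not explicitly specify the UNet batch size, Optimum RBLN does the following:

Assumes that the default guidance scale (7.5) will be used
Automatically sets the UNet batch size to twice the pipeline batch size

If you plan to use the default guidance scale (or any value greater than 1.0), this automatic configuration works correctly. If you plan to use a guidance scale that does not trigger classifier-free guidance (such as 1.0), or you need finer control, configure the UNet batch size explicitly.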

Example: Explicitly Setting the UNet Batch Size¶

from optimum.rbln import RBLNStableDiffusionPipelineConfig, RBLNStableDiffusionPipeline

# For guidance_scale > 1.0 (the default is 7.5),
# set the UNet batch size to twice the desired inference batch size
config = RBLNStableDiffusionPipelineConfig(
    batch_size=2,  # inference batch size
    height=512,
    width=512,
    # UNet batch size is 2x the inference batch size
    unet=dict(batch_size=4),
)

pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    export=True,
    rbln_config=config,
)

# Regular inference with the default guidance_scale=7.5
prompts = [
    "A photo of an astronaut riding a horse on Mars",
    "A portrait of a cat wearing a space suit",
]
images = pipe(prompts).images  # batch size 2

# Save the generated images
for i, image in enumerate(images):
    image.save(f"generated_image_{i}.png")
    print(f"Image {i} saved as generated_image_{i}.png")

Example: Using Guidance Scale 1.0¶

from optimum.rbln import RBLNStableDiffusionPipelineConfig, RBLNStableDiffusionPipeline

config = RBLNStableDiffusionPipelineConfig(
    height=512,
    width=512,
    # UNet batch size matches the inference batch size
    unet=dict(batch_size=1),
)

pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    export=True,
    rbln_config=config,
)

# guidance_scale=1.0 must be used at inference time
prompt = "A photo of an astronaut riding a horse on Mars"
image = pipe(prompt, guidance_scale=1.0).images[0]

# Save the generated image
image.save("image_without_guidance.png")
print("Image saved as image_without_guidance.png")

Usage Example¶

from optimum.rbln import RBLNStableDiffusionPipeline, RBLNStableDiffusionPipelineConfig

# Create a configuration object with specific settings
config = RBLNStableDiffusionPipelineConfig(
    height=512,
    width=512,
)

# Load and compile the model for RBLN NPU
pipe = RBLNStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    export=True,
    rbln_config=config,
)

# Generate an image
prompt = "A photo of an astronaut riding a horse on Mars"
image = pipe(prompt).images[0]

# Save the generated image
image.save("astronaut_mars.png")
print("Image saved as astronaut_mars.png")

API Reference¶

Classes¶

`RBLNStableDiffusionPipeline` ¶

Bases: RBLNDiffusionMixin, StableDiffusionPipeline

Functions¶

`from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs)` `classmethod` ¶

Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.

This method has two distinct operating modes:

When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model
When export=False: Loads an already compiled RBLN model from model_id without recompilation
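
One way to use the two modes is to compile once and reuse the saved artifacts on later runs. The sketch below only builds the keyword arguments for `from_pretrained`. The helper name `from_pretrained_kwargs` and the idea of treating an existing, non-empty directory as "already compiled" are assumptions made for illustration, not library behavior.

```python
from pathlib import Path
from typing import Any, Dict


def from_pretrained_kwargs(hub_id: str, compiled_dir: Path) -> Dict[str, Any]:
    """Choose export=True or export=False based on compiled_dir's contents.

    Hypothetical helper: the directory check is an assumption, not an
    Optimum RBLN convention.
    """
    already_compiled = compiled_dir.is_dir() and any(compiled_dir.iterdir())
    if already_compiled:
        # Load the previously compiled model without recompiling.
        return {"model_id": str(compiled_dir), "export": False}
    # Compile from the PyTorch checkpoint and save the artifacts.
    return {"model_id": hub_id, "export": True, "model_save_dir": compiled_dir}
```

The resulting dictionary could then be unpacked into the call, for example `RBLNStableDiffusionPipeline.from_pretrained(**from_pretrained_kwargs("runwayml/stable-diffusion-v1-5", Path("sd15_rbln")))`, where the directory name `sd15_rbln` is arbitrary.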

It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.

Parameters:

Name	Type	Description	Default
`model_id`	`str`	The model ID or path to the pretrained model to load. Can be either a model ID from the Hugging Face Hub or a local path to a saved model directory.	required
`export`	`bool`	If True, takes a PyTorch model from `model_id` and compiles it for RBLN NPU execution. If False, loads an already compiled RBLN model from `model_id` without recompilation.	`False`
`model_save_dir`	`Optional[PathLike]`	Directory to save the compiled model artifacts. Only used when `export=True`. If not provided and `export=True`, a temporary directory is used.	`None`
`rbln_config`	`Dict[str, Any]`	Configuration options for RBLN compilation. Can include settings for specific submodules such as `text_encoder`, `unet`, and `vae`. Configuration can be tailored to the specific pipeline being compiled.	`{}`
`lora_ids`	`Optional[Union[str, List[str]]]`	LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when `export=True`.	`None`
`lora_weights_names`	`Optional[Union[str, List[str]]]`	Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when `export=True`.	`None`
`lora_scales`	`Optional[Union[float, List[float]]]`	Scaling factor(s) to apply to the LoRA adapter(s). Only used when `export=True`.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used.	`{}`

Returns:

Type	Description
`Self`	A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin.

`RBLNStableDiffusionInpaintPipeline` ¶

Bases: RBLNDiffusionMixin, StableDiffusionInpaintPipeline

Functions¶

`from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs)` `classmethod` ¶

Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.

This method has two distinct operating modes:

When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model
When export=False: Loads an already compiled RBLN model from model_id without recompilation

It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.

Parameters:

Name	Type	Description	Default
`model_id`	`str`	The model ID or path to the pretrained model to load. Can be either a model ID from the Hugging Face Hub or a local path to a saved model directory.	required
`export`	`bool`	If True, takes a PyTorch model from `model_id` and compiles it for RBLN NPU execution. If False, loads an already compiled RBLN model from `model_id` without recompilation.	`False`
`model_save_dir`	`Optional[PathLike]`	Directory to save the compiled model artifacts. Only used when `export=True`. If not provided and `export=True`, a temporary directory is used.	`None`
`rbln_config`	`Dict[str, Any]`	Configuration options for RBLN compilation. Can include settings for specific submodules such as `text_encoder`, `unet`, and `vae`. Configuration can be tailored to the specific pipeline being compiled.	`{}`
`lora_ids`	`Optional[Union[str, List[str]]]`	LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when `export=True`.	`None`
`lora_weights_names`	`Optional[Union[str, List[str]]]`	Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when `export=True`.	`None`
`lora_scales`	`Optional[Union[float, List[float]]]`	Scaling factor(s) to apply to the LoRA adapter(s). Only used when `export=True`.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used.	`{}`

Returns:

Type	Description
`Self`	A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin.

`RBLNStableDiffusionImg2ImgPipeline` ¶

Bases: RBLNDiffusionMixin, StableDiffusionImg2ImgPipeline

Functions¶

`from_pretrained(model_id, *, export=False, model_save_dir=None, rbln_config={}, lora_ids=None, lora_weights_names=None, lora_scales=None, **kwargs)` `classmethod` ¶

Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.

This method has two distinct operating modes:

When export=True: Takes a PyTorch-based diffusion model, compiles it for RBLN NPUs, and loads the compiled model
When export=False: Loads an already compiled RBLN model from model_id without recompilation

It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.

Parameters:

Name	Type	Description	Default
`model_id`	`str`	The model ID or path to the pretrained model to load. Can be either a model ID from the Hugging Face Hub or a local path to a saved model directory.	required
`export`	`bool`	If True, takes a PyTorch model from `model_id` and compiles it for RBLN NPU execution. If False, loads an already compiled RBLN model from `model_id` without recompilation.	`False`
`model_save_dir`	`Optional[PathLike]`	Directory to save the compiled model artifacts. Only used when `export=True`. If not provided and `export=True`, a temporary directory is used.	`None`
`rbln_config`	`Dict[str, Any]`	Configuration options for RBLN compilation. Can include settings for specific submodules such as `text_encoder`, `unet`, and `vae`. Configuration can be tailored to the specific pipeline being compiled.	`{}`
`lora_ids`	`Optional[Union[str, List[str]]]`	LoRA adapter ID(s) to load and apply before compilation. LoRA weights are fused into the model weights during compilation. Only used when `export=True`.	`None`
`lora_weights_names`	`Optional[Union[str, List[str]]]`	Names of specific LoRA weight files to load, corresponding to lora_ids. Only used when `export=True`.	`None`
`lora_scales`	`Optional[Union[float, List[float]]]`	Scaling factor(s) to apply to the LoRA adapter(s). Only used when `export=True`.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used.	`{}`

Returns:

Type	Description
`Self`	A compiled diffusion pipeline that can be used for inference on RBLN NPU. The returned object is an instance of the class that called this method, inheriting from RBLNDiffusionMixin.

Classes¶

`RBLNStableDiffusionPipelineBaseConfig` ¶

Bases: RBLNModelConfig

Functions¶

`__init__(text_encoder=None, unet=None, vae=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)` ¶

Parameters:

Name	Type	Description	Default
`text_encoder`	`Optional[RBLNCLIPTextModelConfig]`	Configuration for the text encoder component. Initialized as RBLNCLIPTextModelConfig if not provided.	`None`
`unet`	`Optional[RBLNUNet2DConditionModelConfig]`	Configuration for the UNet model component. Initialized as RBLNUNet2DConditionModelConfig if not provided.	`None`
`vae`	`Optional[RBLNAutoencoderKLConfig]`	Configuration for the VAE model component. Initialized as RBLNAutoencoderKLConfig if not provided.	`None`
`batch_size`	`Optional[int]`	Batch size for inference, applied to all submodules.	`None`
`img_height`	`Optional[int]`	Height of the generated images.	`None`
`img_width`	`Optional[int]`	Width of the generated images.	`None`
`height`	`Optional[int]`	Height of the generated images.	`None`
`width`	`Optional[int]`	Width of the generated images.	`None`
`sample_size`	`Optional[Tuple[int, int]]`	Spatial dimensions for the UNet model.	`None`
`image_size`	`Optional[Tuple[int, int]]`	Alternative way to specify image dimensions. Cannot be used together with img_height/img_width.	`None`
`guidance_scale`	`Optional[float]`	Scale for classifier-free guidance. Deprecated parameter.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments passed to the parent RBLNModelConfig.	`{}`

Raises:

Type	Description
`ValueError`	If both image_size and img_height/img_width are provided.
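
The mutual-exclusion rule for image dimensions can be sketched as follows. This is an illustrative stand-in, not the library's implementation: the function `resolve_image_size` is hypothetical, and the (height, width) ordering of `image_size` is an assumption.

```python
from typing import Optional, Tuple


def resolve_image_size(
    image_size: Optional[Tuple[int, int]] = None,
    img_height: Optional[int] = None,
    img_width: Optional[int] = None,
) -> Optional[Tuple[int, int]]:
    """Return (height, width), or None if dimensions are missing or incomplete."""
    if image_size is not None and (img_height is not None or img_width is not None):
        # Mirrors the documented rule: the two styles cannot be mixed.
        raise ValueError("image_size cannot be combined with img_height/img_width")
    if image_size is not None:
        return image_size
    if img_height is not None and img_width is not None:
        return (img_height, img_width)
    return None
```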

Note

When guidance_scale > 1.0, the UNet batch size is automatically doubled to accommodate classifier-free guidance.

`RBLNStableDiffusionPipelineConfig` ¶

Bases: RBLNStableDiffusionPipelineBaseConfig

Functions¶

`__init__(text_encoder=None, unet=None, vae=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)` ¶

Parameters:

Name	Type	Description	Default
`text_encoder`	`Optional[RBLNCLIPTextModelConfig]`	Configuration for the text encoder component. Initialized as RBLNCLIPTextModelConfig if not provided.	`None`
`unet`	`Optional[RBLNUNet2DConditionModelConfig]`	Configuration for the UNet model component. Initialized as RBLNUNet2DConditionModelConfig if not provided.	`None`
`vae`	`Optional[RBLNAutoencoderKLConfig]`	Configuration for the VAE model component. Initialized as RBLNAutoencoderKLConfig if not provided.	`None`
`batch_size`	`Optional[int]`	Batch size for inference, applied to all submodules.	`None`
`img_height`	`Optional[int]`	Height of the generated images.	`None`
`img_width`	`Optional[int]`	Width of the generated images.	`None`
`height`	`Optional[int]`	Height of the generated images.	`None`
`width`	`Optional[int]`	Width of the generated images.	`None`
`sample_size`	`Optional[Tuple[int, int]]`	Spatial dimensions for the UNet model.	`None`
`image_size`	`Optional[Tuple[int, int]]`	Alternative way to specify image dimensions. Cannot be used together with img_height/img_width.	`None`
`guidance_scale`	`Optional[float]`	Scale for classifier-free guidance. Deprecated parameter.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments passed to the parent RBLNModelConfig.	`{}`

Raises:

Type	Description
`ValueError`	If both image_size and img_height/img_width are provided.

Note

When guidance_scale > 1.0, the UNet batch size is automatically doubled to accommodate classifier-free guidance.

`RBLNStableDiffusionImg2ImgPipelineConfig` ¶

Bases: RBLNStableDiffusionPipelineBaseConfig

Functions¶

`__init__(text_encoder=None, unet=None, vae=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)` ¶

Parameters:

Name	Type	Description	Default
`text_encoder`	`Optional[RBLNCLIPTextModelConfig]`	Configuration for the text encoder component. Initialized as RBLNCLIPTextModelConfig if not provided.	`None`
`unet`	`Optional[RBLNUNet2DConditionModelConfig]`	Configuration for the UNet model component. Initialized as RBLNUNet2DConditionModelConfig if not provided.	`None`
`vae`	`Optional[RBLNAutoencoderKLConfig]`	Configuration for the VAE model component. Initialized as RBLNAutoencoderKLConfig if not provided.	`None`
`batch_size`	`Optional[int]`	Batch size for inference, applied to all submodules.	`None`
`img_height`	`Optional[int]`	Height of the generated images.	`None`
`img_width`	`Optional[int]`	Width of the generated images.	`None`
`height`	`Optional[int]`	Height of the generated images.	`None`
`width`	`Optional[int]`	Width of the generated images.	`None`
`sample_size`	`Optional[Tuple[int, int]]`	Spatial dimensions for the UNet model.	`None`
`image_size`	`Optional[Tuple[int, int]]`	Alternative way to specify image dimensions. Cannot be used together with img_height/img_width.	`None`
`guidance_scale`	`Optional[float]`	Scale for classifier-free guidance. Deprecated parameter.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments passed to the parent RBLNModelConfig.	`{}`

Raises:

Type	Description
`ValueError`	If both image_size and img_height/img_width are provided.

Note

When guidance_scale > 1.0, the UNet batch size is automatically doubled to accommodate classifier-free guidance.

`RBLNStableDiffusionInpaintPipelineConfig` ¶

Bases: RBLNStableDiffusionPipelineBaseConfig

Functions¶

`__init__(text_encoder=None, unet=None, vae=None, *, batch_size=None, img_height=None, img_width=None, height=None, width=None, sample_size=None, image_size=None, guidance_scale=None, **kwargs)` ¶

Parameters:

Name	Type	Description	Default
`text_encoder`	`Optional[RBLNCLIPTextModelConfig]`	Configuration for the text encoder component. Initialized as RBLNCLIPTextModelConfig if not provided.	`None`
`unet`	`Optional[RBLNUNet2DConditionModelConfig]`	Configuration for the UNet model component. Initialized as RBLNUNet2DConditionModelConfig if not provided.	`None`
`vae`	`Optional[RBLNAutoencoderKLConfig]`	Configuration for the VAE model component. Initialized as RBLNAutoencoderKLConfig if not provided.	`None`
`batch_size`	`Optional[int]`	Batch size for inference, applied to all submodules.	`None`
`img_height`	`Optional[int]`	Height of the generated images.	`None`
`img_width`	`Optional[int]`	Width of the generated images.	`None`
`height`	`Optional[int]`	Height of the generated images.	`None`
`width`	`Optional[int]`	Width of the generated images.	`None`
`sample_size`	`Optional[Tuple[int, int]]`	Spatial dimensions for the UNet model.	`None`
`image_size`	`Optional[Tuple[int, int]]`	Alternative way to specify image dimensions. Cannot be used together with img_height/img_width.	`None`
`guidance_scale`	`Optional[float]`	Scale for classifier-free guidance. Deprecated parameter.	`None`
`**kwargs`	`Dict[str, Any]`	Additional arguments passed to the parent RBLNModelConfig.	`{}`

Raises:

Type	Description
`ValueError`	If both image_size and img_height/img_width are provided.

Note

When guidance_scale > 1.0, the UNet batch size is automatically doubled to accommodate classifier-free guidance.
