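Cosmos¶
Cosmos World Foundation Models specialize in generating videos and virtual world states that accurately reflect the laws of physics. Built on diffusion techniques, they take inputs such as text and images and produce dynamic, high-quality videos. This makes them a core foundation technology for research and applications related to virtual world generation. Cosmos pipelines can be accelerated on RBLN NPUs using Optimum RBLN.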
Supported pipelines¶
Optimum RBLN supports the following Cosmos pipelines:
- Text-to-Video: generates high-quality video from a text prompt.
- Video-to-Video: generates high-quality video from an input video and a text prompt.
Important: Cosmos guardrail models¶
NVIDIA Open Model License
Under the NVIDIA Open Model License, your rights terminate automatically if you, of your own accord, bypass, disable, weaken, or otherwise circumvent the guardrail model features of Cosmos.
Usage example (text-to-video)¶
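The following is a minimal sketch of a text-to-video run, not the official example. It assumes that the pipeline class can be imported from `optimum.rbln`, that `rbln_config` accepts the same keys as the keyword-only arguments of `RBLNCosmosTextToWorldPipelineConfig`, and that the call returns an object with a `frames` attribute, as the underlying diffusers pipeline does. The helper names, the resolution values, and the model ID argument are illustrative only.

```python
from typing import Any, Dict


def build_text2world_rbln_config(
    height: int, width: int, num_frames: int, fps: int, batch_size: int = 1
) -> Dict[str, Any]:
    """Collect pipeline-level compile options into an rbln_config dict.

    Hypothetical helper: the keys mirror the documented keyword arguments
    of the pipeline config class.
    """
    return {
        "batch_size": batch_size,
        "height": height,
        "width": width,
        "num_frames": num_frames,
        "fps": fps,
    }


def compile_and_generate(model_id: str, prompt: str, rbln_config: Dict[str, Any]):
    """Compile the pipeline for RBLN NPUs and generate one video.

    Needs the optimum-rbln package and an RBLN NPU, so it is defined here
    but not called.
    """
    from optimum.rbln import RBLNCosmosTextToWorldPipeline  # assumed import path

    pipe = RBLNCosmosTextToWorldPipeline.from_pretrained(
        model_id,
        export=True,  # compile the PyTorch checkpoint for the NPU
        rbln_config=rbln_config,
    )
    # Assumption: the output exposes generated videos under `.frames`.
    return pipe(prompt=prompt).frames[0]


# Illustrative values only; pick ones your checkpoint supports.
config = build_text2world_rbln_config(height=704, width=1280, num_frames=121, fps=30)
print(config)
```

Only the dict-building step runs without the hardware. `compile_and_generate` would be called with the ID or path of a Cosmos Text2World checkpoint on a machine that has an RBLN NPU.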
Usage example (video-to-video)¶
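The following is a minimal sketch of a video-to-video run, not the official example. It assumes that the pipeline class can be imported from `optimum.rbln`, that `export=False` loads an already compiled pipeline from a directory, and that the call accepts `video` and `prompt` keywords and returns an object with a `frames` attribute, as the underlying diffusers pipeline does. The helper names and the prompt are hypothetical.

```python
from typing import Any, Dict, Sequence


def build_video2world_inputs(video: Sequence[Any], prompt: str) -> Dict[str, Any]:
    """Bundle the conditioning video and the text prompt into call arguments."""
    if len(video) == 0:
        raise ValueError("video must contain at least one frame")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"video": list(video), "prompt": prompt}


def load_and_generate(compiled_dir: str, inputs: Dict[str, Any]):
    """Load a compiled pipeline and generate one video.

    Needs the optimum-rbln package and an RBLN NPU, so it is defined here
    but not called.
    """
    from optimum.rbln import RBLNCosmosVideoToWorldPipeline  # assumed import path

    # Assumption: export=False (the default) loads a previously compiled pipeline.
    pipe = RBLNCosmosVideoToWorldPipeline.from_pretrained(compiled_dir, export=False)
    return pipe(**inputs).frames[0]


frames = ["frame0", "frame1"]  # stand-ins for real image frames
inputs = build_video2world_inputs(frames, "A robot arm stacks boxes in a warehouse.")
print(sorted(inputs))
```

Validating the inputs before the call keeps obvious mistakes, such as an empty frame list, from reaching the NPU.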
API Reference¶
Classes¶
RBLNCosmosTextToWorldPipeline¶
Bases: RBLNDiffusionMixin, CosmosTextToWorldPipeline
RBLN-accelerated implementation of Cosmos Text to World pipeline for text-to-video generation.
This pipeline compiles Cosmos Text to World models to run efficiently on RBLN NPUs, enabling high-performance inference for generating videos with distinctive artistic style and enhanced visual quality.
Functions¶
from_pretrained(model_id, *, export=False, safety_checker=None, rbln_config={}, **kwargs) classmethod¶
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_id | `str` | The model ID or path to the pretrained model to load. Can be either: | required |
export | `bool` | If True, takes a PyTorch model from | `False` |
safety_checker | `Optional[RBLNCosmosSafetyChecker]` | Optional custom safety checker to use instead of the default one. Only used when | `None` |
rbln_config | `Dict[str, Any]` | Configuration options for RBLN compilation. Can include settings for specific submodules such as | `{}` |
kwargs | `Any` | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | `{}` |
Classes¶
RBLNCosmosVideoToWorldPipeline¶
Bases: RBLNDiffusionMixin, CosmosVideoToWorldPipeline
RBLN-accelerated implementation of Cosmos Video to World pipeline for video-to-video generation.
This pipeline compiles Cosmos Video to World models to run efficiently on RBLN NPUs, enabling high-performance inference for generating videos with distinctive artistic style and enhanced visual quality.
Functions¶
from_pretrained(model_id, *, export=False, safety_checker=None, rbln_config={}, **kwargs) classmethod¶
Load a pretrained diffusion pipeline from a model checkpoint, with optional compilation for RBLN NPUs.
It supports various diffusion pipelines including Stable Diffusion, Kandinsky, ControlNet, and other diffusers-based models.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_id | `str` | The model ID or path to the pretrained model to load. Can be either: | required |
export | `bool` | If True, takes a PyTorch model from | `False` |
safety_checker | `Optional[RBLNCosmosSafetyChecker]` | Optional custom safety checker to use instead of the default one. Only used when | `None` |
rbln_config | `Dict[str, Any]` | Configuration options for RBLN compilation. Can include settings for specific submodules such as | `{}` |
kwargs | `Any` | Additional arguments to pass to the underlying diffusion pipeline constructor or the RBLN compilation process. These may include parameters specific to individual submodules or the particular diffusion pipeline being used. | `{}` |
Classes¶
RBLNCosmosPipelineBaseConfig¶
Bases: RBLNModelConfig
Functions¶
__init__(text_encoder=None, transformer=None, vae=None, safety_checker=None, *, batch_size=None, height=None, width=None, num_frames=None, fps=None, max_seq_len=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | `Optional[RBLNT5EncoderModelConfig]` | Configuration for the text encoder component. Initialized as RBLNT5EncoderModelConfig if not provided. | `None` |
transformer | `Optional[RBLNCosmosTransformer3DModelConfig]` | Configuration for the Transformer model component. Initialized as RBLNCosmosTransformer3DModelConfig if not provided. | `None` |
vae | `Optional[RBLNAutoencoderKLCosmosConfig]` | Configuration for the VAE model component. Initialized as RBLNAutoencoderKLCosmosConfig if not provided. | `None` |
safety_checker | `Optional[RBLNCosmosSafetyCheckerConfig]` | Configuration for the safety checker component. Initialized as RBLNCosmosSafetyCheckerConfig if not provided. | `None` |
batch_size | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
height | `Optional[int]` | Height of the generated videos. | `None` |
width | `Optional[int]` | Width of the generated videos. | `None` |
num_frames | `Optional[int]` | The number of frames in the generated video. | `None` |
fps | `Optional[int]` | The frames per second of the generated video. | `None` |
max_seq_len | `Optional[int]` | Maximum sequence length supported by the model. | `None` |
kwargs | `Any` | Additional arguments passed to the parent RBLNModelConfig. | `{}` |
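As a rough illustration of how the keyword-only options above might be assembled before constructing a config, the sketch below validates option names against the documented list and drops unset values so that the library defaults apply. `pipeline_config_kwargs` and `make_config` are hypothetical helpers, the import path is an assumption, and the numeric values are illustrative.

```python
from typing import Dict, Optional

# Keyword-only options listed in the parameter table above.
PIPELINE_OPTIONS = ("batch_size", "height", "width", "num_frames", "fps", "max_seq_len")


def pipeline_config_kwargs(**options: Optional[int]) -> Dict[str, int]:
    """Reject unknown option names and drop options left as None."""
    unknown = set(options) - set(PIPELINE_OPTIONS)
    if unknown:
        raise TypeError(f"unknown pipeline options: {sorted(unknown)}")
    return {name: value for name, value in options.items() if value is not None}


def make_config(**options: Optional[int]):
    """Build the config object; needs optimum-rbln, so it is not called here."""
    from optimum.rbln import RBLNCosmosPipelineBaseConfig  # assumed import path

    return RBLNCosmosPipelineBaseConfig(**pipeline_config_kwargs(**options))


kwargs = pipeline_config_kwargs(batch_size=1, height=704, width=1280, fps=None)
print(kwargs)
```

Because every option defaults to `None`, passing only the values you want to pin leaves the remaining choices to the library.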
load(path, **kwargs) classmethod¶
Load an RBLNModelConfig from a path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | `str` | Path to the RBLNModelConfig file or directory containing the config file. | required |
kwargs | `Any` | Additional keyword arguments to override configuration values. Keys starting with 'rbln_' will have the prefix removed and be used to update the configuration. | `{}` |
Returns:
Name | Type | Description |
---|---|---|
RBLNModelConfig | `RBLNModelConfig` | The loaded configuration instance. |
Note
This method loads the configuration from the specified path and applies any provided overrides. If the loaded configuration class doesn't match the expected class, a warning will be logged.
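The override rule described for `kwargs` can be pictured with a small pure-Python sketch. It illustrates the documented behavior only and is not the library's implementation. `apply_overrides` is a hypothetical name, and because the text does not say how keys without the `rbln_` prefix are treated, the sketch simply leaves them out.

```python
from typing import Any, Dict

PREFIX = "rbln_"


def apply_overrides(loaded: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
    """Return a copy of `loaded` updated by kwargs whose keys start with 'rbln_'."""
    merged = dict(loaded)  # copy so the stored values stay untouched
    for key, value in kwargs.items():
        if key.startswith(PREFIX):
            merged[key[len(PREFIX):]] = value  # strip the prefix, then override
    return merged


stored = {"batch_size": 1, "height": 704, "width": 1280}
updated = apply_overrides(stored, rbln_batch_size=2)
print(updated)
```

With this rule, a call such as `load(path, rbln_batch_size=2)` would be expected to return a configuration whose `batch_size` is 2 while the other stored values stay as they were.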
RBLNCosmosTextToWorldPipelineConfig¶
Bases: RBLNCosmosPipelineBaseConfig
Config for Cosmos Text2World Pipeline
Functions¶
__init__(text_encoder=None, transformer=None, vae=None, safety_checker=None, *, batch_size=None, height=None, width=None, num_frames=None, fps=None, max_seq_len=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | `Optional[RBLNT5EncoderModelConfig]` | Configuration for the text encoder component. Initialized as RBLNT5EncoderModelConfig if not provided. | `None` |
transformer | `Optional[RBLNCosmosTransformer3DModelConfig]` | Configuration for the Transformer model component. Initialized as RBLNCosmosTransformer3DModelConfig if not provided. | `None` |
vae | `Optional[RBLNAutoencoderKLCosmosConfig]` | Configuration for the VAE model component. Initialized as RBLNAutoencoderKLCosmosConfig if not provided. | `None` |
safety_checker | `Optional[RBLNCosmosSafetyCheckerConfig]` | Configuration for the safety checker component. Initialized as RBLNCosmosSafetyCheckerConfig if not provided. | `None` |
batch_size | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
height | `Optional[int]` | Height of the generated videos. | `None` |
width | `Optional[int]` | Width of the generated videos. | `None` |
num_frames | `Optional[int]` | The number of frames in the generated video. | `None` |
fps | `Optional[int]` | The frames per second of the generated video. | `None` |
max_seq_len | `Optional[int]` | Maximum sequence length supported by the model. | `None` |
kwargs | `Any` | Additional arguments passed to the parent RBLNModelConfig. | `{}` |
load(path, **kwargs) classmethod¶
Load an RBLNModelConfig from a path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | `str` | Path to the RBLNModelConfig file or directory containing the config file. | required |
kwargs | `Any` | Additional keyword arguments to override configuration values. Keys starting with 'rbln_' will have the prefix removed and be used to update the configuration. | `{}` |
Returns:
Name | Type | Description |
---|---|---|
RBLNModelConfig | `RBLNModelConfig` | The loaded configuration instance. |
Note
This method loads the configuration from the specified path and applies any provided overrides. If the loaded configuration class doesn't match the expected class, a warning will be logged.
RBLNCosmosVideoToWorldPipelineConfig¶
Bases: RBLNCosmosPipelineBaseConfig
Config for Cosmos Video2World Pipeline
Functions¶
__init__(text_encoder=None, transformer=None, vae=None, safety_checker=None, *, batch_size=None, height=None, width=None, num_frames=None, fps=None, max_seq_len=None, **kwargs)¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text_encoder | `Optional[RBLNT5EncoderModelConfig]` | Configuration for the text encoder component. Initialized as RBLNT5EncoderModelConfig if not provided. | `None` |
transformer | `Optional[RBLNCosmosTransformer3DModelConfig]` | Configuration for the Transformer model component. Initialized as RBLNCosmosTransformer3DModelConfig if not provided. | `None` |
vae | `Optional[RBLNAutoencoderKLCosmosConfig]` | Configuration for the VAE model component. Initialized as RBLNAutoencoderKLCosmosConfig if not provided. | `None` |
safety_checker | `Optional[RBLNCosmosSafetyCheckerConfig]` | Configuration for the safety checker component. Initialized as RBLNCosmosSafetyCheckerConfig if not provided. | `None` |
batch_size | `Optional[int]` | Batch size for inference, applied to all submodules. | `None` |
height | `Optional[int]` | Height of the generated videos. | `None` |
width | `Optional[int]` | Width of the generated videos. | `None` |
num_frames | `Optional[int]` | The number of frames in the generated video. | `None` |
fps | `Optional[int]` | The frames per second of the generated video. | `None` |
max_seq_len | `Optional[int]` | Maximum sequence length supported by the model. | `None` |
kwargs | `Any` | Additional arguments passed to the parent RBLNModelConfig. | `{}` |
load(path, **kwargs) classmethod¶
Load an RBLNModelConfig from a path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | `str` | Path to the RBLNModelConfig file or directory containing the config file. | required |
kwargs | `Any` | Additional keyword arguments to override configuration values. Keys starting with 'rbln_' will have the prefix removed and be used to update the configuration. | `{}` |
Returns:
Name | Type | Description |
---|---|---|
RBLNModelConfig | `RBLNModelConfig` | The loaded configuration instance. |
Note
This method loads the configuration from the specified path and applies any provided overrides. If the loaded configuration class doesn't match the expected class, a warning will be logged.