Bucketing

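Bucketing is an RBLN SDK feature that lets a single compiled model serve multiple input shapes efficiently. Instead of compiling a separate model for each input size, bucketing lets rebel-compiler produce one compiled model, driven by a single runtime, that handles several input dimensions.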

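What Is Bucketing?

rebel-compiler is built on static graph compilation, so it normally requires input shapes to be fixed at compile time. Many real-world applications, however, must handle inputs of varying size. For example:

Variable batch sizes for batch-processing optimization
Different image resolutions for computer vision tasks
Dynamic sequence lengths for natural language processing
Multi-scale inference for object detection

Bucketing addresses this by letting you predefine multiple input shapes at compile time. The compiled model can then switch among those predefined shapes at runtime using a single rebel.Runtime instance.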

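How Bucketing Works

When you compile a model with bucketing:

Multiple input shapes: You specify buckets, the set of input configurations the model must support.
Unified compilation: The compiler generates optimized kernels for each bucket inside a single compiled model.
Runtime selection: During inference, the runtime automatically selects the matching bucket based on the input shape.
Efficient switching: Switching between input shapes does not require reloading the model.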
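The selection step can be pictured with a small toy model. The sketch below is purely conceptual and does not use the RBLN SDK: the `BucketedModel` class and its methods are hypothetical names, and the assumption that dispatch is an exact-shape lookup is an illustration, not a description of how rebel.Runtime is implemented.

```python
from typing import Callable, Dict, List, Tuple

Shape = Tuple[int, ...]


class BucketedModel:
    """Toy stand-in (hypothetical) for a compiled model with one kernel per bucket."""

    def __init__(self) -> None:
        self._kernels: Dict[Shape, Callable[[List[float]], str]] = {}

    def add_bucket(self, shape: Shape) -> None:
        # "Compile" a kernel specialized for this exact input shape.
        self._kernels[shape] = lambda data, s=shape: f"ran kernel for {s}"

    def run(self, shape: Shape, data: List[float]) -> str:
        # Runtime selection: look up the kernel that matches the input shape.
        if shape not in self._kernels:
            raise ValueError(
                f"no bucket for shape {shape}; available: {sorted(self._kernels)}"
            )
        return self._kernels[shape](data)


toy = BucketedModel()
for batch in (1, 2, 4, 8):
    toy.add_bucket((batch, 3, 224, 224))

# Switching shapes is just another lookup; nothing is reloaded.
print(toy.run((2, 3, 224, 224), []))
print(toy.run((8, 3, 224, 224), []))
```

All kernels live in one object, so moving from batch size 2 to batch size 8 costs a lookup rather than a model load, which is the property the list above describes.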

Prerequisites

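Before you begin, make sure the packages used in this tutorial are installed: torch, torchvision, and rebel-compiler (which provides the rebel module).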

Basic Bucketing Example

Step 1: Prepare the Model

We start with a simple ResNet50 example that supports several batch sizes:

import torch
import torchvision.models as models
import rebel

# Load a pretrained ResNet50 model
model = models.resnet50(pretrained=True).eval()

Step 2: Define Multiple Input Shapes (Buckets)

# Define the input shapes to support
image_size = 224
supported_batch_sizes = [1, 2, 4, 8]  # different batch sizes
input_infos = []

# Build the input info for each batch size
for batch_size in supported_batch_sizes:
    input_info = [("input", [batch_size, 3, image_size, image_size], "float32")]
    input_infos.append(input_info)

print("Defined buckets:")
for i, info in enumerate(input_infos):
    print(f"  Bucket {i}: {info[0][1]}")
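If you build bucket lists in several places, it can help to wrap the loop above in a small helper that also catches obvious mistakes. The following is a minimal sketch in plain Python; `make_batch_buckets` is a hypothetical helper name, not part of the RBLN SDK, and it only reproduces the list-of-lists structure used in this step.

```python
from typing import List, Sequence, Tuple

# One bucket is a list of (name, shape, dtype) tuples, one tuple per model input.
InputInfo = List[Tuple[str, List[int], str]]


def make_batch_buckets(
    batch_sizes: Sequence[int],
    image_size: int = 224,
    name: str = "input",
    dtype: str = "float32",
) -> List[InputInfo]:
    """Build one input_info entry per batch size (hypothetical helper)."""
    if not batch_sizes:
        raise ValueError("at least one batch size is required")
    if any(b <= 0 for b in batch_sizes):
        raise ValueError("batch sizes must be positive")
    if len(set(batch_sizes)) != len(batch_sizes):
        raise ValueError("duplicate batch sizes would create duplicate buckets")
    # Sort so that bucket indices follow increasing batch size.
    return [
        [(name, [b, 3, image_size, image_size], dtype)] for b in sorted(batch_sizes)
    ]


input_infos = make_batch_buckets([1, 2, 4, 8])
for i, info in enumerate(input_infos):
    print(f"  Bucket {i}: {info[0][1]}")
```

The returned list has the same shape as the `input_infos` built by hand above, so it can be passed to the compile step unchanged.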

Step 3: Compile the Model with Bucketing

# Compile the model with multiple input shapes
compiled_model = rebel.compile_from_torch(
    model,
    input_info=input_infos  # pass the list of input_info entries to enable bucketing
)

# Save the compiled model (optional)
compiled_model.save("resnet50_bucketed.rbln")

Step 4: Create the Runtime and Run Inference

# Create a single runtime that supports every bucket
runtime = rebel.Runtime(compiled_model, tensor_type="pt")

# Test with different batch sizes
test_batch_sizes = [1, 2, 4, 8]

for batch_size in test_batch_sizes:
    print(f"\nTesting with batch size {batch_size}:")

    # Generate a random input for the current batch size
    test_input = torch.randn(batch_size, 3, 224, 224)

    # Run inference; the runtime selects the matching bucket automatically
    output = runtime(test_input)

    print(f"  Input shape: {test_input.shape}")
    print(f"  Output shape: {output.shape}")
    print(f"  Predicted classes: {torch.argmax(output, axis=1)}")
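The loop above only uses batch sizes that match a bucket exactly. If your application produces other batch sizes (for example 3), one common approach is to pad the batch up to the next bucket and discard the padded results. The sketch below shows that idea in plain Python; `pick_bucket` and `pad_to_bucket` are hypothetical helpers, and padding is an application-level choice assumed here, not a behavior this tutorial attributes to the runtime.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def pick_bucket(n: int, buckets: Sequence[int]) -> int:
    """Return the smallest bucket size that can hold n items."""
    ordered = sorted(buckets)
    idx = bisect_left(ordered, n)
    if n < 1 or idx == len(ordered):
        raise ValueError(f"batch of {n} does not fit buckets {ordered}")
    return ordered[idx]


def pad_to_bucket(
    items: List[T], buckets: Sequence[int], filler: T
) -> Tuple[List[T], int]:
    """Pad items with filler up to the chosen bucket size.

    Returns the padded list and the number of real items, so the caller
    can keep only the first `real` outputs after inference.
    """
    real = len(items)
    size = pick_bucket(real, buckets)
    return items + [filler] * (size - real), real


padded, real = pad_to_bucket(["img0", "img1", "img2"], [1, 2, 4, 8], "blank")
print(len(padded), real)
```

With tensors, the same pattern would mean concatenating zero-filled samples along the batch dimension before calling the runtime and slicing the output back to the first `real` rows.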

Advanced Bucketing Example

Variable Image Sizes

You can also create buckets for different image resolutions:

import torch
import torchvision.models as models
import rebel

# Load the model
model = models.efficientnet_b0(pretrained=True).eval()

# Define combinations of batch size and image size
configurations = [
    (1, 3, 224, 224),  # standard ImageNet size
    (1, 3, 256, 256),  # slightly larger
    (1, 3, 288, 288),  # larger still
    (2, 3, 224, 224),  # batch of 2 at the standard size
    (4, 3, 224, 224),  # batch of 4 at the standard size
]

input_infos = []
for batch, channels, height, width in configurations:
    input_info = [("input", [batch, channels, height, width], "float32")]
    input_infos.append(input_info)

# Compile with all configurations
compiled_model = rebel.compile_from_torch(
    model,
    input_info=input_infos
)

runtime = rebel.Runtime(compiled_model, tensor_type="pt")

# Test with different input sizes
def test_inference(batch_size, height, width):
    # Generate a random input
    test_input = torch.randn(batch_size, 3, height, width)

    # Run inference
    output = runtime(test_input)
    print(f"Input: {test_input.shape} -> Output: {output.shape}")

# Test each configuration
test_inference(1, 224, 224)
test_inference(1, 256, 256)
test_inference(1, 288, 288)
test_inference(2, 224, 224)
test_inference(4, 224, 224)
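Real images rarely arrive at exactly 224, 256, or 288 pixels, so a preprocessing step usually has to decide which resolution bucket to resize to. The following sketch shows one possible policy in plain Python; `choose_resolution` is a hypothetical helper, and the rule it applies (smallest bucket that covers the longer side, otherwise the largest bucket) is an assumption for illustration rather than an SDK recommendation.

```python
from typing import Sequence


def choose_resolution(
    height: int, width: int, sides: Sequence[int] = (224, 256, 288)
) -> int:
    """Pick the square bucket side to resize an image to (hypothetical policy)."""
    longest = max(height, width)
    for side in sorted(sides):
        # The first bucket that covers the longer edge avoids downscaling.
        if side >= longest:
            return side
    # Image is larger than every bucket: fall back to the largest one.
    return max(sides)


for h, w in [(224, 224), (240, 200), (1080, 1920)]:
    print((h, w), "->", choose_resolution(h, w))
```

After resizing to the returned side, the input shape matches one of the batch-size-1 configurations compiled above.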

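Bucket Selection Strategy

Choose buckets with the balance between flexibility and performance in mind:

# Good: use powers of two for batch sizes
batch_sizes = [1, 2, 4, 8]

# Good: common image sizes
image_sizes = [224, 256, 288, 320, 416, 640]

# Avoid: too many buckets (increases compile time and model size)
# batch_sizes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]

Key considerations:
- Choose common input resolutions that match your expected use cases
- Start with a small number of buckets and add more as needed
- Weigh the trade-off between flexibility and compile time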
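One way to make this trade-off concrete is to estimate how much padding a candidate bucket set would cause for the batch sizes you expect to see. The sketch below assumes that a request is padded up to the next bucket, which is an application-level assumption; `padding_waste` and the sample `demand` histogram are hypothetical and exist only for illustration.

```python
from bisect import bisect_left
from typing import Dict, Sequence


def padding_waste(buckets: Sequence[int], demand: Dict[int, int]) -> float:
    """Fraction of computed batch slots that would be padding.

    demand maps an observed batch size to how often it occurs.
    """
    ordered = sorted(buckets)
    used = 0      # slots filled with real samples
    computed = 0  # slots actually executed after rounding up to a bucket
    for batch, count in demand.items():
        idx = bisect_left(ordered, batch)
        if idx == len(ordered):
            raise ValueError(f"batch size {batch} exceeds the largest bucket")
        used += batch * count
        computed += ordered[idx] * count
    return 1 - used / computed


# Hypothetical traffic: mostly single requests, some batches of 3 and 5.
demand = {1: 50, 3: 30, 5: 20}
print(f"powers of two: {padding_waste([1, 2, 4, 8], demand):.1%}")
print(f"every size:    {padding_waste(range(1, 9), demand):.1%}")
```

A bucket set with one entry per batch size removes padding entirely but costs more compile time and a larger model, so comparing a few candidate sets against your own traffic gives a quick basis for the choice.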

Full Code: Image Classification with Dynamic Batching

The complete code for an image classification example that applies bucketing is shown below:

image_classification_bucketing.py
#!/usr/bin/env python3
"""
Complete Image Classification Example with Bucketing

This example demonstrates how to create a bucketed image classifier that
supports multiple batch sizes efficiently.
"""

import torch
import torchvision.models as models

import rebel


def create_bucketed_classifier():
    """Create a bucketed image classifier"""

    # Load pre-trained ResNet50
    model = models.resnet50(pretrained=True).eval()

    # Define buckets for different batch sizes
    batch_sizes = [1, 2, 4]
    input_infos = []

    for batch_size in batch_sizes:
        input_info = [("input", [batch_size, 3, 224, 224], "float32")]
        input_infos.append(input_info)

    compiled_model = rebel.compile_from_torch(model, input_info=input_infos)

    return compiled_model


# Example usage
def main():
    # Create bucketed model
    compiled_model = create_bucketed_classifier()
    runtime = rebel.Runtime(compiled_model, tensor_type="pt")

    # Example with different batch sizes
    batch_sizes = [1, 2, 4]
    for i, batch_size in enumerate(batch_sizes):
        print(f"\nTest case {i + 1}: Batch size {batch_size}")

        dummy_input = torch.randn(batch_size, 3, 224, 224)
        outputs = runtime(dummy_input)
        predictions = torch.argmax(outputs, axis=1)
        print(f"Predictions: {predictions.tolist()}")


if __name__ == "__main__":
    main()

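Conclusion

Bucketing lets the RBLN SDK handle variable input shapes efficiently. It offers the following benefits:

Flexibility: A single runtime instance supports multiple input shapes.
Efficiency: Every supported configuration is served by the same runtime instance, so switching shapes does not require reloading the model.

With the examples and strategies in this tutorial, you can apply bucketing in the RBLN SDK to build more flexible and efficient inference pipelines.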