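PyTorch TorchVision `ResNet50`¶

PyTorch is one of the most widely used deep learning frameworks, and TorchVision, its extension library, provides a variety of pretrained models and datasets.

This tutorial shows how to compile the ResNet50 image classification model provided by TorchVision and run inference with it on an RBLN NPU.

The tutorial consists of two steps:

- How to compile the PyTorch ResNet50 model and save it to local storage
- How to load the compiled model and run inference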

Prerequisites¶

Before you begin, make sure the following Python packages are installed:
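
The code in this tutorial imports `torch`, `torchvision`, and `rebel`. As a minimal sketch, and assuming those import names match your environment, the following standard-library check reports which of them cannot be found. The helper name `missing_modules` is hypothetical and is not part of any RBLN tooling.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names used by the examples in this tutorial.
    required = ["torch", "torchvision", "rebel"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```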

Note

To skip the details and see quickly how to compile and run inference, refer to the Summary. It gathers all the code for compilation and inference so that you can start your project quickly.

Native RBLN API¶

Step 1. How to Compile¶

Prepare the Model¶

Load the ResNet50 model through the TorchVision library.

import torch
from torchvision.models import resnet50, ResNet50_Weights
import rebel  # RBLN compiler

# Prepare the TorchVision ResNet50 model
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights)
model.eval()

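Compile the Model¶

You can compile the prepared PyTorch model (torch.nn.Module) with the rebel.compile_from_torch() function.

# Compile the model
compiled_model = rebel.compile_from_torch(
    model,
    [("input", [1, 3, 224, 224], torch.float32)],
    # If an NPU is installed on the host, the `npu` argument below can be omitted because the NPU is detected automatically.
    npu="RBLN-CA12",
)

If an NPU is installed on the host machine, it is detected and used automatically, so you can omit the npu argument of rebel.compile_from_torch. When you compile on a host machine without an NPU, you must specify the npu argument explicitly; otherwise, an error occurs.

The currently supported NPUs are RBLN-CA02 and RBLN-CA12. You can check the name of your NPU by running the rbln-stat command on the host machine where the NPU is installed.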
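
The rule above (omit `npu` when an NPU is installed, name it explicitly otherwise) can be sketched as a small helper that builds the keyword arguments for `rebel.compile_from_torch()`. The helper `compile_kwargs` is hypothetical, and the tuple of names is simply the two NPUs listed above; the helper does not query the hardware.

```python
from typing import Dict, Optional

# NPU names listed as supported in this tutorial.
SUPPORTED_NPUS = ("RBLN-CA02", "RBLN-CA12")


def compile_kwargs(target_npu: Optional[str] = None) -> Dict[str, str]:
    """Build the optional `npu` keyword argument for compilation.

    Pass None on a host with an installed NPU to rely on auto-detection.
    """
    if target_npu is None:
        return {}
    if target_npu not in SUPPORTED_NPUS:
        raise ValueError(f"Unsupported NPU name: {target_npu!r}")
    return {"npu": target_npu}


if __name__ == "__main__":
    print(compile_kwargs())             # {}
    print(compile_kwargs("RBLN-CA12"))  # {'npu': 'RBLN-CA12'}
```

The result could then be unpacked into the compile call, for example `rebel.compile_from_torch(model, input_info, **compile_kwargs("RBLN-CA12"))`, where `input_info` is a placeholder for the input description list shown above.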

Save the Compiled Model¶

You can save the compiled model to local storage with the compiled_model.save() function.

# Save the compiled model to local storage
compiled_model.save("resnet50.rbln")
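
Because the compiled model is saved to a file, a script can skip compilation when that file already exists. The following is a minimal sketch using only the standard library. `needs_compile` is a hypothetical helper, and it checks only that a non-empty file is present, not that the file matches the current model or SDK version.

```python
from pathlib import Path


def needs_compile(artifact_path: str) -> bool:
    """Return True when no non-empty compiled artifact exists at the path."""
    path = Path(artifact_path)
    return not (path.is_file() and path.stat().st_size > 0)


if __name__ == "__main__":
    if needs_compile("resnet50.rbln"):
        print("resnet50.rbln not found: compile and save the model first.")
    else:
        print("resnet50.rbln found: it can be loaded for inference.")
```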

Step 2. How to Run Inference¶

You can load the model compiled in the previous step and run inference with the RBLN runtime module rebel.Runtime().

Prepare the Input Data¶

Prepare a pre-processed image to use as input to the ResNet50 model. Load the image with TorchVision's torchvision.io.image.read_image() function, then apply the same pre-processing that was used to train ResNet50 by calling ResNet50_Weights.DEFAULT.transforms().

import torch
from torchvision.io.image import read_image
from torchvision.models import resnet50, ResNet50_Weights
import urllib.request
import rebel  # RBLN runtime

# Prepare the input data
img_url = "https://rbln-public.s3.ap-northeast-2.amazonaws.com/images/tabby.jpg"
img_path = "./tabby.jpg"
with urllib.request.urlopen(img_url) as response, open(img_path, "wb") as f:
    f.write(response.read())
img = read_image(img_path)
weights = ResNet50_Weights.DEFAULT
preprocess = weights.transforms()
batch = preprocess(img).unsqueeze(0)

Inference¶

The RBLN runtime module rebel.Runtime() is used to load a compiled model. It can be created in either of the following two ways.

# From the path of a compiled model
module = rebel.Runtime("resnet50.rbln", tensor_type="pt")

# From a compiled model instance
module = rebel.Runtime(compiled_model, tensor_type="pt")

The tensor_type argument specifies the type of tensor used for input and output data. Set it to "pt" to use PyTorch tensors or to "np" to use NumPy arrays.

You can run inference with the run() function of the instantiated runtime module. The forward() function and the __call__ magic method also run inference, which keeps the module compatible with PyTorch.

# Run inference with the `run()` function
rebel_result = module.run(batch)

# Alternatively, use the `forward()` function
rebel_result = module.forward(batch)

# Or use the `__call__` magic method
rebel_result = module(batch)

With forward() or __call__, the loaded RBLN model can be used in the same way as a PyTorch model, so it integrates smoothly with existing PyTorch code.

Calling print(module) shows a summary of the loaded model, including its input and output shapes and the model size.
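
The relationship between `run()`, `forward()`, and `__call__` described above can be illustrated with a toy class. This is not `rebel.Runtime` and makes no claim about its internals. `ToyRuntime` and its doubling "model" are hypothetical stand-ins that only show why the three call styles are interchangeable.

```python
from typing import List


class ToyRuntime:
    """Hypothetical stand-in that exposes three equivalent entry points."""

    def run(self, batch: List[float]) -> List[float]:
        # Placeholder computation standing in for inference on the device.
        return [2.0 * x for x in batch]

    def forward(self, batch: List[float]) -> List[float]:
        # PyTorch-style name that delegates to run().
        return self.run(batch)

    def __call__(self, batch: List[float]) -> List[float]:
        # Lets the object be called like a torch.nn.Module instance.
        return self.forward(batch)


if __name__ == "__main__":
    module = ToyRuntime()
    batch = [1.0, 2.0, 3.0]
    assert module.run(batch) == module.forward(batch) == module(batch)
    print(module(batch))  # [2.0, 4.0, 6.0]
```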

Check the Results¶

The inference result rebel_result is a PyTorch tensor of size (1, 1000) that holds a score for each ImageNet category. Use torch.topk() to get the index and score of the highest-scoring category. The category names are available through ResNet50_Weights.DEFAULT.meta["categories"].

# Check the result
score, class_id = torch.topk(rebel_result, 1, dim=1)
category_name = weights.meta["categories"][class_id.item()]  # convert the 1-element tensor to an int index
print("Top1 category:", category_name)

The final result is as follows:

`Top1 category: tabby`
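
If the output values are raw scores (logits), as they are for the final linear layer of TorchVision's ResNet50 in PyTorch, softmax converts them into probabilities without changing which class ranks first. The sketch below shows the top-1 selection and the softmax step in plain Python on a made-up four-class score list. The scores and labels are illustrative values, not output from the model.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top1(values: List[float]) -> Tuple[int, float]:
    """Return (index, value) of the largest entry."""
    index = max(range(len(values)), key=lambda i: values[i])
    return index, values[index]


if __name__ == "__main__":
    # Made-up scores for four hypothetical classes.
    labels = ["tabby", "tiger cat", "Egyptian cat", "lynx"]
    scores = [4.0, 2.5, 1.0, -0.5]
    class_id, _ = top1(scores)
    probs = softmax(scores)
    assert top1(probs)[0] == class_id  # softmax preserves the ranking
    print("Top1 category:", labels[class_id])
```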

Summary¶

The complete code for compiling the ResNet50 model provided by TorchVision is as follows:

import torch
from torchvision.models import resnet50, ResNet50_Weights
import rebel  # RBLN compiler

# Prepare the TorchVision ResNet50 model
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights)
model.eval()

# Compile the model
compiled_model = rebel.compile_from_torch(
    model,
    [("input", [1, 3, 224, 224], torch.float32)],
    # If an NPU is installed on the host, the `npu` argument below can be omitted because the NPU is detected automatically.
    npu="RBLN-CA12",
)

# Save the compiled model to local storage
compiled_model.save("resnet50.rbln")

The complete code for running inference with the compiled ResNet50 model is as follows:

import torch
from torchvision.io.image import read_image
from torchvision.models import resnet50, ResNet50_Weights
import urllib.request
import rebel  # RBLN runtime

# Prepare the input data
img_url = "https://rbln-public.s3.ap-northeast-2.amazonaws.com/images/tabby.jpg"
img_path = "./tabby.jpg"
with urllib.request.urlopen(img_url) as response, open(img_path, "wb") as f:
    f.write(response.read())
img = read_image(img_path)
weights = ResNet50_Weights.DEFAULT
preprocess = weights.transforms()
batch = preprocess(img).unsqueeze(0)

# Load the compiled model
module = rebel.Runtime("resnet50.rbln", tensor_type="pt")

# Run inference
rebel_result = module(batch)

# Check the result
score, class_id = torch.topk(rebel_result, 1, dim=1)
category_name = weights.meta["categories"][class_id.item()]  # convert the 1-element tensor to an int index
print("Top1 category:", category_name)

`torch.compile()` API¶

In addition to the native API, the RBLN SDK supports PyTorch's torch.compile. With this integration, developers can use PyTorch's just-in-time (JIT) compilation to run optimized models within the RBLN SDK. Integrating RBLN's custom backend into a torch.compile workflow improves performance while remaining fully compatible with RBLN's native features.

Prepare the Model¶

Preparing a model for torch.compile is the same as with the native RBLN API. This example uses the ResNet50 model from the TorchVision library.

First, import the required libraries and instantiate the ResNet50 model with pretrained weights.

import torch
from torchvision.models import resnet50, ResNet50_Weights
import rebel  # RBLN compiler

if torch.__version__ >= "2.5.0":
    torch._dynamo.config.inline_inbuilt_nn_modules = False

# Load the pretrained ResNet50 model from TorchVision
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights)
model.eval()

Prepare the Input¶

Next, prepare the input data. This step is also the same as with the native API. Load the image with torchvision.io.image.read_image() and apply the default pre-processing transforms for the ResNet50 model.

import torch
from torchvision.io.image import read_image
from torchvision.models import ResNet50_Weights
import urllib.request

# Download a sample image
img_url = "https://rbln-public.s3.ap-northeast-2.amazonaws.com/images/tabby.jpg"
img_path = "./tabby.jpg"
with urllib.request.urlopen(img_url) as response, open(img_path, "wb") as f:
    f.write(response.read())

# Load and pre-process the image
img = read_image(img_path)
weights = ResNet50_Weights.DEFAULT
preprocess = weights.transforms()
batch = preprocess(img).unsqueeze(0)

Compile and Run the Model¶

Once the model and input are ready, you can compile and run the model with torch.compile(). Unlike the native RBLN API workflow, torch.compile() is a JIT compiler, so compilation happens at runtime during the first inference. You can still use compilation features such as caching through the options of the RBLN backend.

# Compile the model with the RBLN backend
model = torch.compile(model,
                      backend="rbln",  # set the target backend to 'rbln'
                      options={"cache_dir": "PATH/TO/rbln_cache_dir"},  # directory in which compiled results are cached
                      dynamic=False)  # disable dynamic shape support (not currently supported by the RBLN backend)

# Run the model (the first inference triggers JIT compilation)
rbln_result = model(batch)

# Print the result
class_idx = torch.argmax(rbln_result).item()
print("Top-1 class index:", class_idx)  # expected output: 281, which corresponds to "tabby, tabby cat"

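Understanding the `torch.compile()` Parameters¶

backend="rbln":

Description: Specifies the backend used to compile the model.
Purpose: Setting this to "rbln" selects the RBLN SDK's custom backend, so the compilation process is optimized for performance in the RBLN environment.

options={"cache_dir": "PATH/TO/rbln_cache_dir", "npu": "TARGET_NPU_DEVICE"}:

Description: Provides additional options to the compilation process.
Purpose:
- cache_dir: Specifies the directory in which compiled results are stored.
  - Usage: Creates RBLN artifacts at the specified path, similar to calling compiled_model.save("resnet50.rbln") in the native API.
  - Caching: If a compiled model already exists in the specified directory, the RBLN backend uses the cached version instead of recompiling the model. This reduces compilation time and overhead when the model is reused.
- npu: Specifies the target NPU device. For details on how to specify the device identifier, see the npu option in the native API documentation.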
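
The two options above can be assembled in code before calling `torch.compile()`. The following is a minimal sketch. `rbln_options` is a hypothetical helper, the only keys it uses are the `cache_dir` and `npu` keys described above, and creating the directory in advance is an assumption rather than a documented requirement.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional


def rbln_options(cache_dir: str, npu: Optional[str] = None) -> Dict[str, str]:
    """Build the `options` dict for torch.compile(..., backend="rbln")."""
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)  # make sure the cache location exists
    options = {"cache_dir": str(path)}
    if npu is not None:
        options["npu"] = npu  # for example "RBLN-CA12"
    return options


if __name__ == "__main__":
    # Use a temporary directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(rbln_options(str(Path(tmp) / "rbln_cache_dir"), npu="RBLN-CA12"))
```

The returned dict would then be passed as `torch.compile(model, backend="rbln", options=rbln_options("./rbln_cache_dir"), dynamic=False)`.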

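dynamic=False:

Description: Indicates whether the model supports dynamic input shapes.
Purpose:
- Setting dynamic to False is recommended because the RBLN backend does not currently support dynamic shapes.
- Behavior: With this option set to False, the model assumes a fixed input shape, and an input with a different shape triggers recompilation. Compilation is therefore optimized for a specific input shape, but the model may need to be recompiled when the input shape changes.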
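
The recompilation behavior described above can be pictured as a cache keyed by input shape. The toy below only illustrates that idea. It is not the RBLN backend or the internals of `torch.compile`, and `ShapeCache` and its counter are hypothetical.

```python
from typing import Dict, Tuple

Shape = Tuple[int, ...]


class ShapeCache:
    """Toy model of fixed-shape compilation: one compiled entry per shape."""

    def __init__(self) -> None:
        self.compiled: Dict[Shape, str] = {}
        self.compile_count = 0

    def get(self, shape: Shape) -> str:
        # A shape seen for the first time triggers a (simulated) compilation.
        if shape not in self.compiled:
            self.compile_count += 1
            self.compiled[shape] = f"graph_for_{'x'.join(map(str, shape))}"
        return self.compiled[shape]


if __name__ == "__main__":
    cache = ShapeCache()
    cache.get((1, 3, 224, 224))  # first call: compiles
    cache.get((1, 3, 224, 224))  # same shape: reuses the entry
    cache.get((2, 3, 224, 224))  # new batch size: compiles again
    print(cache.compile_count)   # 2
```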

Summary¶

The complete code for the TorchVision ResNet50 model is as follows:

import argparse
import urllib.request
import rebel  # noqa: F401  # required to use the "rbln" backend of TorchDynamo
import torch
import torchvision
from torchvision.io.image import read_image

if torch.__version__ >= "2.5.0":
    torch._dynamo.config.inline_inbuilt_nn_modules = False

def parsing_argument():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_name",
        type=str,
        default="resnet50",
        help="(str) Name of the TorchVision model.",
    )
    return parser.parse_args()

def main():
    args = parsing_argument()
    model_name = args.model_name

    # Instantiate the TorchVision model
    weights = torchvision.models.get_model_weights(model_name).DEFAULT
    model = getattr(torchvision.models, model_name)(weights=weights).eval()

    # Prepare the input image
    img_url = "https://rbln-public.s3.ap-northeast-2.amazonaws.com/images/tabby.jpg"
    img_path = "./tabby.jpg"
    with urllib.request.urlopen(img_url) as response, open(img_path, "wb") as f:
        f.write(response.read())
    img = read_image(img_path)
    preprocess = weights.transforms()
    batch = preprocess(img).unsqueeze(0)

    # Compile the model
    model = torch.compile(model, backend="rbln", options={"cache_dir": "./rbln_cache_dir/"}, dynamic=False)

    # The first inference triggers compilation
    model(batch)

    # After that, the model compiled for RBLN hardware can be used
    rbln_result = model(batch)

    # Print the result
    score, class_id = torch.topk(rbln_result, 1, dim=1)
    category_name = weights.meta["categories"][class_id.item()]  # convert the 1-element tensor to an int index
    print("Top-1 category:", category_name)

if __name__ == "__main__":
    main()
