
Object Detection

Overview

In this tutorial, we will learn how to run inference with a PyTorch YOLOv8 model using the RBLN SDK C/C++ Runtime API. The model is compiled with the RBLN SDK Python API, and the resulting *.rbln file is then loaded for inference with the RBLN SDK C/C++ Runtime API. This approach combines the ease of model preparation in Python with the performance benefits of C/C++ at inference time. The complete code for this tutorial can be found in the RBLN Model Zoo.

Setup & Installation

Before you begin, ensure that your system environment is properly configured and that all required packages are installed. For this tutorial, that includes rebel-compiler for model compilation, the RBLN SDK C/C++ Runtime library for inference, and CMake with a C++ toolchain for building the example application.

Note

Please note that rebel-compiler requires an RBLN Portal account.
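For reference, the Python-side prerequisites can typically be installed as follows. The exact rebel-compiler package index and version are provided through the RBLN Portal, so treat this as an illustrative sketch:

# Illustrative: the rebel-compiler index URL and version come from your RBLN Portal account
$ pip3 install -i https://pypi.rbln.ai/simple/ rebel-compiler
$ pip3 install ultralytics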

Compilation with RBLN Python API

While the RBLN Python API offers comprehensive functionality, handling both compilation and inference within the RBLN SDK, the RBLN SDK C/C++ Runtime API is designed and optimized for inference only. We therefore use the RBLN Python API for model compilation and the RBLN SDK C/C++ Runtime API for inference.

Prepare the model

Import the YOLOv8m model from the ultralytics library and run a forward pass with a dummy input so that the underlying torch.nn.Module is fully initialized before compilation.

from ultralytics import YOLO  
import rebel  
import torch  

model_name = "yolov8m"  
yolo = YOLO(f"{model_name}.pt")  
model = yolo.model.eval()  
model(torch.zeros(1, 3, 640, 640))  

Compile the model

Once a torch model (torch.nn.Module) is prepared, compile it with rebel.compile_from_torch(). Save the compiled model using compiled_model.save().

# Compile the model  
compiled_model = rebel.compile_from_torch(model, [ ("x", [1, 3, 640, 640], "float32") ])  

# Save the compiled model to local storage  
compiled_model.save(f"{model_name}.rbln")  

Complete compilation

The compilation code is included in compile.py. Execute the script as follows to generate the *.rbln file:

$ python compile.py --model-name=yolov8m  
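For reference, here is a minimal sketch of what compile.py might look like, combining the snippets above with the --model-name option (the actual script in the RBLN Model Zoo may differ):

import argparse

import torch
from ultralytics import YOLO

import rebel

parser = argparse.ArgumentParser()
parser.add_argument("--model-name", type=str, default="yolov8m")
args = parser.parse_args()

# Load the pretrained model and run it once with a dummy input
yolo = YOLO(f"{args.model_name}.pt")
model = yolo.model.eval()
model(torch.zeros(1, 3, 640, 640))

# Compile for the RBLN NPU and save the *.rbln artifact
compiled_model = rebel.compile_from_torch(
    model, [("x", [1, 3, 640, 640], "float32")]
)
compiled_model.save(f"{args.model_name}.rbln")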

Inference with RBLN SDK C/C++ Runtime API

Use the RBLN SDK C/C++ Runtime API to load the compiled model, run inference, and inspect the output results.

Prepare CMake build script

The example application uses OpenCV for image pre/post-processing and argparse for parsing command-line parameters. The CMake script below declares these external dependencies and links them to the executable.

# Define dependencies for external packages  
include(FetchContent)  
include(cmake/opencv.cmake)  
include(cmake/argparse.cmake)  

# Define the name of the executable  
add_executable(object_detection main.cc)  

# Update link info for package dependencies: OpenCV  
find_package(OpenCV CONFIG REQUIRED)  
target_link_libraries(object_detection ${OpenCV_LIBS})  

# Update link info for dependencies: RBLN  
find_package(rbln CONFIG REQUIRED)  
target_link_libraries(object_detection rbln::rbln_runtime)  

# Update including dependencies: argparse  
target_include_directories(object_detection PRIVATE ${argparse_INCLUDE_DIRS})  
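The helper scripts cmake/opencv.cmake and cmake/argparse.cmake are not shown here. As an illustration, a FetchContent-based cmake/argparse.cmake that satisfies the argparse_INCLUDE_DIRS reference above might look like this (the actual helper in the model zoo may differ):

# Illustrative sketch of cmake/argparse.cmake: fetch the header-only
# p-ranav/argparse library and expose its include path
include(FetchContent)
FetchContent_Declare(
  argparse
  GIT_REPOSITORY https://github.com/p-ranav/argparse.git
  GIT_TAG        v3.0
)
FetchContent_MakeAvailable(argparse)
set(argparse_INCLUDE_DIRS ${argparse_SOURCE_DIR}/include)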

Prepare the input

Preprocess the input image using the OpenCV API. The following snippet reads the image, letterboxes it to 640x640, and converts it to a normalized float32 blob in RGB channel order.

std::string input_path = "${SAMPLE_PATH}/people4.jpg";  
cv::Mat image = cv::imread(input_path);  
if (image.empty()) {  
    // cv::imread returns an empty Mat on failure instead of throwing  
    std::cerr << "Failed to read image: " << input_path << std::endl;  
    std::exit(1);  
}  
cv::Mat blob = cv::dnn::blobFromImage(GetSquareImage(image, 640), 1. / 255., cv::Size(), cv::Scalar(), true, false, CV_32F);  
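GetSquareImage is a small helper defined in the example's main.cc. A minimal sketch of what it might look like, assuming the image is letterboxed into the top-left corner of a square canvas with its longer side scaled to target_size (the actual helper may differ):

// Illustrative sketch: letterbox `img` into a target_size x target_size
// canvas, scaling the longer side to target_size (top-left aligned)
static cv::Mat GetSquareImage(const cv::Mat &img, int target_size) {
  int max_dim = std::max(img.cols, img.rows);
  float scale = static_cast<float>(target_size) / max_dim;
  cv::Rect roi(0, 0, static_cast<int>(img.cols * scale),
               static_cast<int>(img.rows * scale));
  cv::Mat square = cv::Mat::zeros(target_size, target_size, img.type());
  cv::resize(img, square(roi), roi.size());
  return square;
}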

Run inference (Synchronous Execution)

The following code snippet shows synchronous inference.

std::string model_path = "${SAMPLE_PATH}/yolov8m.rbln";  
RBLNModel *mod = rbln_create_model(model_path.c_str());  
RBLNRuntime *rt = rbln_create_runtime(mod, "default", 0, 0);  

// Set input data  
rbln_set_input(rt, 0, blob.data);  

// Run sync inference  
rbln_run(rt);  

// Get output results  
void *data = rbln_get_output(rt, 0);  

Run inference (Asynchronous Execution)

The following code snippet shows asynchronous inference.

std::string model_path = "${SAMPLE_PATH}/yolov8m.rbln";  
RBLNModel *mod = rbln_create_model(model_path.c_str());  
RBLNRuntime *rt = rbln_create_async_runtime(mod, "default", 0, 0);  

// Map input buffers for async execution
std::vector<void*> inputs = {blob.data};

// Map output buffers for async execution
auto n_out = rbln_get_num_outputs(rt);
std::vector<cv::Mat> outputs(n_out);
std::vector<char *> output_ptrs(n_out);
for (auto idx = 0; idx < n_out; idx++) {
    const RBLNTensorLayout *layout = rbln_get_output_layout(rt, idx);
    outputs[idx] = cv::Mat{layout->ndim, layout->shape, CV_32F};
    output_ptrs[idx] = reinterpret_cast<char *>(outputs[idx].data);
}

// Run async inference  
int rid = rbln_async_run(rt, inputs.data(), output_ptrs.data());

// Wait for inference to finish  
rbln_async_wait(rt, rid, 1000);  

Post Processing

The output is a float32 array of shape (1, 84, 8400): for each of the 8,400 candidate boxes, four box coordinates (cx, cy, w, h) followed by 80 class scores. The following code snippet applies non-maximum suppression (NMS) and draws the surviving bounding boxes:

// Postprocessing for NMS  
const RBLNTensorLayout *layout = rbln_get_output_layout(rt, 0);  
cv::Mat logits{layout->ndim, layout->shape, CV_32F};  
memcpy(logits.data, data, rbln_get_layout_nbytes(layout));  

std::vector<cv::Rect> nms_boxes;  
std::vector<float> nms_confidences;  
std::vector<size_t> nms_class_ids;  
for (size_t i = 0; i < layout->shape[2]; i++) {  
    auto cx = logits.at<float>(0, 0, i);  
    auto cy = logits.at<float>(0, 1, i);  
    auto w = logits.at<float>(0, 2, i);  
    auto h = logits.at<float>(0, 3, i);  
    auto x = cx - w / 2;  
    auto y = cy - h / 2;  
    cv::Rect rect{static_cast<int>(x), static_cast<int>(y), static_cast<int>(w), static_cast<int>(h)};  
    float confidence = std::numeric_limits<float>::lowest();  
    int cls_id = 0;  
    for (size_t j = 4; j < layout->shape[1]; j++) {  
        if (confidence < logits.at<float>(0, j, i)) {  
            confidence = logits.at<float>(0, j, i);  
            cls_id = j - 4;  
        }  
    }  
    nms_boxes.push_back(rect);  
    nms_confidences.push_back(confidence);  
    nms_class_ids.push_back(cls_id);  
}  
std::vector<int> nms_indices;  
cv::dnn::NMSBoxes(nms_boxes, nms_confidences, 0.25f, 0.45f, nms_indices);  
cv::Mat output_img = image.clone();  
for (size_t i = 0; i < nms_indices.size(); i++) {  
    auto idx = nms_indices[i];  
    auto class_id = nms_class_ids[idx];  
    auto scaled_box = ScaleBox(nms_boxes[idx], output_img.size(), 640);  
    cv::rectangle(output_img, scaled_box, cv::Scalar(255, 0, 0));  
    std::stringstream ss;  
    ss << COCO_CATEGORIES[class_id] << ": " << nms_confidences[idx];  
    cv::putText(output_img, ss.str(), scaled_box.tl() - cv::Point(0, 1), cv::FONT_HERSHEY_DUPLEX, 1, cv::Scalar(255, 0, 0));  
}  
cv::imwrite("result.jpg", output_img);  
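ScaleBox, also defined in main.cc, maps a box from the 640x640 letterboxed frame back to the original image. A minimal sketch consistent with the GetSquareImage sketch above (again, the actual helper may differ):

// Illustrative sketch: undo the letterbox scaling, mapping a box from the
// `size` x `size` frame back to the original image dimensions
static cv::Rect ScaleBox(const cv::Rect &box, const cv::Size &img_size, int size) {
  float scale = static_cast<float>(std::max(img_size.width, img_size.height)) / size;
  return cv::Rect(static_cast<int>(box.x * scale),
                  static_cast<int>(box.y * scale),
                  static_cast<int>(box.width * scale),
                  static_cast<int>(box.height * scale));
}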

Release resources

Release the runtime and model.

rbln_destroy_runtime(rt);  
rbln_destroy_model(mod);  

How to build using CMake

The code snippets above are included in the RBLN Model Zoo C++ examples. To configure the build and compile the executable binaries, run the following commands:

$ mkdir ${SAMPLE_PATH}/build  
$ cd ${SAMPLE_PATH}/build  
$ cmake ..  
$ make  

How to run the executables

After completing all steps, you can find the executables in the build directory and run them as follows:

# Synchronous execution  
$ ${SAMPLE_PATH}/build/object_detection -i ${SAMPLE_PATH}/people4.jpg -m ${SAMPLE_PATH}/yolov8m.rbln  

# Asynchronous execution  
$ ${SAMPLE_PATH}/build/object_detection_async -i ${SAMPLE_PATH}/people4.jpg -m ${SAMPLE_PATH}/yolov8m.rbln  

Example Output

The resulting result.jpg shows the input image annotated with the detected bounding boxes and their COCO class labels.
