Object Detection¶

Overview¶

In this tutorial, we will learn how to run inference with a PyTorch YOLOv8 model using the RBLN SDK C/C++ Runtime API. The model is compiled using the RBLN SDK Python API, and the resulting *.rbln file is used for inference using the RBLN SDK C/C++ Runtime API. This approach combines the ease of model preparation in Python with the performance benefits of C/C++ for inference. The entire code used in this tutorial can be found in RBLN Model Zoo.

Setup & Installation¶

Before you begin, ensure that your system environment is properly configured and that all required packages are installed. This includes:

System Requirements:
- Python: 3.9–3.12
- RBLN Driver
Packages Requirements:
- torch
- ultralytics
- RBLN Compiler
- cmake >= 3.26.0
- RBLN SDK C/C++ Runtime API

Installation Command:

pip install torch ultralytics cmake  
pip install --extra-index-url https://pypi.rbln.ai/simple/ "rebel-compiler>=0.10.1"

Note

Please note that rebel-compiler requires an RBLN Portal account.

Compilation with `RBLN Python API`¶

While the RBLN Python API offers comprehensive functionality, capable of handling both compilation and inference processes within the RBLN SDK, the RBLN SDK C/C++ Runtime API is specifically designed and optimized for inference operations only. We will use the RBLN Python API for model compilation and RBLN SDK C/C++ Runtime API for performing inference.

Prepare the model¶

Import the YOLOv8m model from the ultralytics library and perform a forward pass to prepare the model.

from ultralytics import YOLO  
import rebel  
import torch  

model_name = "yolov8m"  
yolo = YOLO(f"{model_name}.pt")  
model = yolo.model.eval()  
model(torch.zeros(1, 3, 640, 640))  

Compile the model¶

Once a torch model (torch.nn.Module) is prepared, compile it with rebel.compile_from_torch(). Save the compiled model using compiled_model.save().

# Compile the model  
compiled_model = rebel.compile_from_torch(model, [ ("x", [1, 3, 640, 640], "float32") ])  

# Save the compiled model to local storage  
compiled_model.save(f"{model_name}.rbln")  

Complete compilation¶

The compilation code is included in compile.py. Execute the script as follows to generate the rbln file:

$ python compile.py --model-name=yolov8m  

Inference with `RBLN SDK C/C++ Runtime API`¶

Use the model for inference with the RBLN SDK C/C++ Runtime API to load the compiled model, run inference, and check output results.

Prepare CMake build script¶

The example application uses OpenCV for image pre/post-processing and argparse for CLI parameter parsing. The CMake script below describes external package dependencies and linking.

# Define dependencies for external Package  
include(FetchContent)  
include(cmake/opencv.cmake)  
include(cmake/argparse.cmake)  

# Define the name of executable  
add_executable(object_detection main.cc)  

# Update link info for package dependencies: OpenCV  
find_package(OpenCV CONFIG REQUIRED)  
target_link_libraries(object_detection ${OpenCV_LIBS})  

# Update link info for dependencies: RBLN  
find_package(rbln CONFIG REQUIRED)  
target_link_libraries(object_detection rbln::rbln_runtime)  

# Update including dependencies: argparse  
target_include_directories(object_detection PRIVATE ${argparse_INCLUDE_DIRS})  

Prepare the input¶

Preprocess the input image using OpenCV APIs. The following code snippet reads and preprocesses the image.

std::string input_path = "${SAMPLE_PATH}/people4.jpg";  
cv::Mat image;  
try {  
    image = cv::imread(input_path);  
} catch (const cv::Exception &err) {  
    std::cerr << err.what() << std::endl;  
    std::exit(1);  
}  
cv::Mat blob = cv::dnn::blobFromImage(GetSquareImage(image, 640), 1./255., cv::Size(), cv::Scalar(), true, false, CV_32F);  

Run inference (Synchronous Execution)¶

The following code snippet shows synchronous inference.

std::string model_path = "${SAMPLE_PATH}/yolov8m.rbln";  
RBLNModel *mod = rbln_create_model(model_path.c_str());  
RBLNRuntime *rt = rbln_create_runtime(mod, "default", 0, 0);  

// Set input data  
rbln_set_input(rt, 0, blob.data);  

// Run sync inference  
rbln_run(rt);  

// Get output results  
void *data = rbln_get_output(rt, 0);  

Run inference (Asynchronous Execution)¶

The following code snippet shows asynchronous inference.

std::string model_path = "${SAMPLE_PATH}/yolov8m.rbln";  
RBLNModel *mod = rbln_create_model(model_path.c_str());  
RBLNRuntime *rt = rbln_create_async_runtime(mod, "default", 0, 0);  

// Input Buffer Memory Mapping for Async execution
std::vector<void*> inputs = {blob.data};

// Output Buffer Memory Mapping for Async execution
auto n_out = rbln_get_num_outputs(rt);
std::vector<cv::Mat> outputs(n_out);
std::vector<char *> output_ptrs(n_out);
for (auto idx = 0; idx < n_out; idx++) {
    const RBLNTensorLayout *layout = rbln_get_output_layout(rt, idx);
    outputs[idx] = cv::Mat{layout->ndim, layout->shape, CV_32F};
    output_ptrs[idx] = reinterpret_cast<char *>(outputs[idx].data);
}

// Run async inference  
int rid = rbln_async_run(rt, inputs.data(), output_ptrs.data());

// Wait inference done  
rbln_async_wait(rt, rid, 1000);  

Post Processing¶

Process the output data (float32 array of shape (1,84,8400)) to perform NMS and draw bounding boxes. The following code snippet outlines the processing steps:

// Postprocessing for NMS  
const RBLNTensorLayout *layout = rbln_get_output_layout(rt, 0);  
cv::Mat logits{layout->ndim, layout->shape, CV_32F};  
memcpy(logits.data, data, rbln_get_layout_nbytes(layout));  

std::vector<cv::Rect> nms_boxes;  
std::vector<float> nms_confidences;  
std::vector<size_t> nms_class_ids;  
for (size_t i = 0; i < layout->shape[2]; i++) {  
    auto cx = logits.at<float>(0, 0, i);  
    auto cy = logits.at<float>(0, 1, i);  
    auto w = logits.at<float>(0, 2, i);  
    auto h = logits.at<float>(0, 3, i);  
    auto x = cx - w / 2;  
    auto y = cy - h / 2;  
    cv::Rect rect{static_cast<int>(x), static_cast<int>(y), static_cast<int>(w), static_cast<int>(h)};  
    float confidence = std::numeric_limits<float>::min();  
    int cls_id;  
    for (size_t j = 4; j < layout->shape[1]; j++) {  
        if (confidence < logits.at<float>(0, j, i)) {  
            confidence = logits.at<float>(0, j, i);  
            cls_id = j - 4;  
        }  
    }  
    nms_boxes.push_back(rect);  
    nms_confidences.push_back(confidence);  
    nms_class_ids.push_back(cls_id);  
}  
std::vector<int> nms_indices;  
cv::dnn::NMSBoxes(nms_boxes, nms_confidences, 0.25f, 0.45f, nms_indices);  
cv::Mat output_img = image.clone();  
for (size_t i = 0; i < nms_indices.size(); i++) {  
    auto idx = nms_indices[i];  
    auto class_id = nms_class_ids[idx];  
    auto scaled_box = ScaleBox(nms_boxes[idx], output_img.size(), 640);  
    cv::rectangle(output_img, scaled_box, cv::Scalar(255, 0, 0));  
    std::stringstream ss;  
    ss << COCO_CATEGORIES[class_id] << ": " << nms_confidences[idx];  
    cv::putText(output_img, ss.str(), scaled_box.tl() - cv::Point(0, 1), cv::FONT_HERSHEY_DUPLEX, 1, cv::Scalar(255, 0, 0));  
}  
cv::imwrite("result.jpg", output_img);  

Release resources¶

Release the runtime and model.

rbln_destroy_runtime(rt);  
rbln_destroy_model(mod);  

How to build using CMake¶

The code snippets are included in the RBLN Model Zoo C++ examples. To compile the code and create the executable binary, run the following commands:

$ mkdir ${SAMPLE_PATH}/build  
$ cd ${SAMPLE_PATH}/build  
$ cmake ..  
$ make  

How to run Executable file¶

After completing all steps, you can find the executable in the build directory.

# Synchronous execution  
$ ${SAMPLE_PATH}/build/object_detection -i ${SAMPLE_PATH}/people4.jpg  -m ${SAMPLE_PATH}/yolov8m.rbln  

# Asynchronous execution  
$ ${SAMPLE_PATH}/build/object_detection_async -i ${SAMPLE_PATH}/people4.jpg  -m ${SAMPLE_PATH}/yolov8m.rbln  

Example Output:

References¶

RBLN Model Zoo