Troubleshooting
How to Generate a Core Dump File
If you encounter a problem while running PyTorch RBLN, please send the generated core dump file to client_support@rebellions.ai. To create a core dump file, first remove the ulimit restriction on core file size in the shell where you will run the model script.
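A minimal sketch of this step, assuming a Bash-compatible shell on Linux. It uses the standard `ulimit` shell builtin with the `-c` (core file size) option; it is not necessarily the exact command from the original guide.

```shell
# Remove the core file size limit for the current shell session.
# This can fail if the hard limit set by the administrator is lower.
ulimit -c unlimited
```

The change applies only to the current shell and the processes started from it, so run the model script from the same shell.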
Verify that the ulimit restrictions have been removed by running:
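One way to check, again assuming a Bash-compatible shell: calling `ulimit -c` without a value prints the current core file size limit.

```shell
# Print the current core file size limit.
# "unlimited" means the restriction has been removed; "0" means core dumps are disabled.
ulimit -c
```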
Re-run the problematic model script. When the error occurs, a core dump file will be created under /var/crash.
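If no file appears, the following sketch may help locate it. It assumes a Linux system, where the kernel's `core_pattern` setting controls how core dumps are named and where they are written. The `/var/crash` path is taken from the text above; the rest is a generic check, not part of PyTorch RBLN.

```shell
# Show how the kernel names and routes core dumps (Linux-specific).
cat /proc/sys/kernel/core_pattern

# List the most recent entries in the crash directory, if it exists.
ls -lt /var/crash 2>/dev/null | head -n 5
```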
Logging Operators Running on CPU
When PyTorch RBLN encounters a PyTorch operator or data type that it does not yet support, it executes the operation on the CPU instead so that the model still runs.
While this feature enhances model compatibility, these operations do not leverage the performance benefits of the NPU. Therefore, it is crucial to identify which operations are falling back to the CPU during the optimization process.
By default, the PyTorch RBLN log level is set to WARNING, so debugging (DEBUG) messages are not displayed. To identify all operators running on the CPU when optimizing NPU performance, you must explicitly set the environment variable to DEBUG as shown below.
Usage:
To set it back to the default value, set the environment variable as follows.
Example Output:
With this setting, running a model in Eager Mode prints a log message whenever an operation runs on the CPU instead of the Rebellions NPU. Each message contains the operator's name and, if traceable, the source code location.