Troubleshooting

This page provides solutions to common problems, tips for quick fixes, and guidance to help you get back on track.

1. Failed to import the vllm or vllm-rbln package

Symptoms

  • Running import vllm; print(vllm.__path__) prints None.
  • The vllm-rbln plugin is not registered in vllm, so UnspecifiedPlatform is initialized.

Causes

In versions prior to v0.8.1, the vllm-rbln package included a modified copy of vllm. As of v0.8.1, vllm-rbln is built on vLLM's plugin system instead. This change can cause installation conflicts when upgrading from an older version, which may prevent vllm from being installed properly.

Solution

If you were using vllm earlier than version 0.9.1 or vllm-rbln earlier than version 0.8.0, please uninstall both packages before reinstalling.

2. Model loading failed

Symptoms

  • [rank0]: AttributeError: 'ModelConfig' object has no attribute 'compiled_model_dir'
  • Cannot find .rbln files in model

Causes

For now, vllm-rbln supports only pre-compiled models, so these errors appear when the model passed to vLLM has not been compiled. Support for online compilation using torch.compile in vLLM will be added soon.

Solution

Before running inference with vLLM, compile the model using optimum-rbln, then pass the compiled model to vLLM.
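To rule out the "Cannot find .rbln files" case before starting the engine, a small check such as this sketch can list the compiled artifacts in a directory. The helper name find_rbln_files is hypothetical, and the recursive search is an assumption; the loader may only inspect specific locations.

```python
from pathlib import Path


def find_rbln_files(model_dir: str) -> list[str]:
    """Return the relative paths of all *.rbln files under model_dir."""
    root = Path(model_dir)
    if not root.is_dir():
        return []
    # Assumption: search recursively, since the exact layout that
    # vllm-rbln expects is not described on this page.
    return sorted(str(p.relative_to(root)) for p in root.rglob("*.rbln"))
```

An empty result means the directory does not contain a compiled model and needs to be produced with optimum-rbln first.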

3. TypeError: block_tables (shape=(a,)) has a shape different to required shape (b,).

Symptoms

  • The shape of block_tables is different from the shape that vLLM requires.

Causes

The max_model_len and block_size values you set for vLLM's engine do not match the values the model was compiled with.

Solution

Set max_model_len to the max_seq_len and block_size to the kvcache_partition_len defined in the compiled model. Both values are recorded in the rbln_config.json file inside the compiled model directory.
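The lookup can be automated with a sketch like the one below. The exact nesting of rbln_config.json is an assumption here, so the hypothetical helper find_key searches the whole document for the first occurrence of each key rather than relying on a fixed path.

```python
import json
from pathlib import Path


def find_key(node, key):
    """Depth-first search for the first non-null value of key in JSON data."""
    if isinstance(node, dict):
        if node.get(key) is not None:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def engine_lengths(compiled_model_dir: str) -> dict:
    """Map compiled-model values to the matching vLLM engine arguments."""
    path = Path(compiled_model_dir) / "rbln_config.json"
    config = json.loads(path.read_text())
    return {
        "max_model_len": find_key(config, "max_seq_len"),
        "block_size": find_key(config, "kvcache_partition_len"),
    }
```

The returned dictionary can be unpacked into the engine arguments when constructing the vLLM engine. A value of None means the key was not found, in which case the file should be inspected by hand.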

4. KeyError: self.model.decoder = self.model.decoders[padded_batch_size]

Symptoms

  • KeyError occurs in self.model.decoder = self.model.decoders[padded_batch_size]

Causes

The max_num_seqs value you set for vLLM's engine does not match the batch size the model was compiled with.

Solution

Set max_num_seqs to the batch_size defined in the compiled model. The batch_size value is recorded in the rbln_config.json file inside the compiled model directory.
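A fail-fast guard along these lines can surface the mismatch with a readable message before the engine starts, instead of a KeyError during decoding. This is only a sketch: check_max_num_seqs is a hypothetical helper, and compiled_batch_size is assumed to be the batch_size value you read from rbln_config.json.

```python
def check_max_num_seqs(max_num_seqs: int, compiled_batch_size: int) -> None:
    """Raise ValueError if the engine setting differs from the compiled model."""
    if max_num_seqs != compiled_batch_size:
        raise ValueError(
            f"max_num_seqs={max_num_seqs} does not match the compiled "
            f"batch_size={compiled_batch_size}; set max_num_seqs to "
            f"{compiled_batch_size} or recompile the model."
        )
```

Calling check_max_num_seqs(4, 4) returns silently, while check_max_num_seqs(8, 4) raises a ValueError that names both values.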