Troubleshooting¶
This page provides solutions to common problems, tips for quick fixes, and guidance to help you get back on track.
1. Failed to import the vllm or vllm-rbln package¶
Symptoms¶
- Running import vllm; print(vllm.__path__) prints None.
- The vllm-rbln plugin is not registered in vllm, so UnspecifiedPlatform is initialized.
Causes¶
In versions prior to v0.8.1, the vllm-rbln package included a modified copy of vllm. As of v0.8.1, vllm-rbln is built on the new plugin system instead. Upgrading across this change can cause installation conflicts that prevent vllm from being installed properly.
Solution¶
If you previously used a vllm version earlier than 0.9.1 or a vllm-rbln version earlier than 0.8.0, uninstall both packages before reinstalling.
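To check whether a conflict is still present, it can help to list the installed versions and the platform plugins that vLLM would discover. The following is a minimal sketch using only the standard library. describe_install is a hypothetical helper, and the entry-point group name vllm.platform_plugins is an assumption about how vLLM discovers platform plugins, so verify it against your vLLM version.

```python
from importlib import metadata


def describe_install(packages=("vllm", "vllm-rbln"),
                     group="vllm.platform_plugins"):
    """Report installed versions and registered platform plugins.

    `group` is an assumed entry-point group name; adjust it if your
    vLLM version uses a different one.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # None means the distribution is not installed at all.
            versions[name] = None

    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict keyed by group name.
        selected = eps.get(group, [])

    return {
        "versions": versions,
        "platform_plugins": sorted(ep.name for ep in selected),
    }


if __name__ == "__main__":
    print(describe_install())
```

If vllm is reported as None, or vllm-rbln is installed but no plugin is listed, the environment is probably still in the conflicted state described above.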
2. Model loading failed¶
Symptoms¶
- [rank0]: AttributeError: 'ModelConfig' object has no attribute 'compiled_model_dir'
- Cannot find .rbln files in model
Causes¶
For now, vllm-rbln supports only pre-compiled models. Support for online compilation using torch.compile in vLLM will be added soon.
Solution¶
Before running inference with vLLM, compile the model with optimum-rbln, then load the compiled model in vLLM.
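Before starting vLLM, a quick check that the model directory actually contains compiled artifacts can save a failed launch. This is a small sketch, not part of vllm-rbln. find_rbln_files is a hypothetical helper, and the recursive search is an assumption, since the exact location of the .rbln files inside the directory may differ between versions.

```python
from pathlib import Path


def find_rbln_files(model_dir):
    """Return the .rbln files found anywhere under model_dir, sorted by path.

    An empty list means the directory is missing or holds no compiled
    artifacts, so the model still needs to be compiled.
    """
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.rbln"))


if __name__ == "__main__":
    # "path/to/compiled_model" is a placeholder, not a real path.
    files = find_rbln_files("path/to/compiled_model")
    if files:
        print("Compiled artifacts:", [f.name for f in files])
    else:
        print("No .rbln files found; compile the model with optimum-rbln first.")
```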
3. TypeError: block_tables (shape=(a,)) has a shape different to required shape (b,).¶
Symptoms¶
- The shape of block_tables is different from the shape that vLLM requires.
Causes¶
The max_model_len and block_size values set for the vLLM engine do not match the values the model was compiled with.
Solution¶
Set max_model_len to max_seq_len and block_size to kvcache_partition_len, as defined in the compiled model.
You can find max_seq_len and kvcache_partition_len in the rbln_config.json file inside the compiled model directory.
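The mapping above can be read straight from the file rather than copied by hand. The sketch below assumes only that rbln_config.json is valid JSON containing the two keys somewhere in its structure. Because the exact layout of the file is not specified here, it searches nested objects for the first match. find_key and engine_lengths are hypothetical helper names.

```python
import json
from pathlib import Path


def find_key(node, key):
    """Depth-first search for the first non-null value stored under `key`."""
    if isinstance(node, dict):
        if node.get(key) is not None:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def engine_lengths(compiled_model_dir):
    """Map compiled-model settings to the matching vLLM engine arguments."""
    config_path = Path(compiled_model_dir) / "rbln_config.json"
    config = json.loads(config_path.read_text())
    return {
        "max_model_len": find_key(config, "max_seq_len"),
        "block_size": find_key(config, "kvcache_partition_len"),
    }
```

A value of None in the result means the key was not found, in which case inspect the file manually before starting the engine.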
4. KeyError: self.model.decoder = self.model.decoders[padded_batch_size]¶
Symptoms¶
- KeyError occurs in self.model.decoder = self.model.decoders[padded_batch_size]
Causes¶
The max_num_seqs value set for the vLLM engine does not match the batch size the model was compiled with.
Solution¶
Set max_num_seqs to the batch_size defined in the compiled model.
You can find batch_size in the rbln_config.json file inside the compiled model directory.
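A mismatch can be caught before the engine starts with a small guard. This is a sketch under one stated assumption: that batch_size is a top-level key of rbln_config.json. If your file nests it elsewhere, adjust the lookup. check_max_num_seqs is a hypothetical helper, not part of vllm-rbln.

```python
import json
from pathlib import Path


def check_max_num_seqs(compiled_model_dir, max_num_seqs):
    """Raise early if max_num_seqs differs from the compiled batch_size."""
    config_path = Path(compiled_model_dir) / "rbln_config.json"
    config = json.loads(config_path.read_text())

    # Assumption: batch_size is stored at the top level of the file.
    batch_size = config.get("batch_size")
    if batch_size is None:
        raise KeyError(f"batch_size not found at the top level of {config_path}")

    if max_num_seqs != batch_size:
        raise ValueError(
            f"max_num_seqs={max_num_seqs} does not match the compiled "
            f"batch_size={batch_size}; set max_num_seqs={batch_size}."
        )
    return batch_size
```

Calling this with the value you intend to pass to the engine turns the KeyError shown above into a message that names the value to use.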