CVE-2025-46722
vLLM has a Weakness in MultiModalHasher Image Hashing Implementation
In short
vLLM's image hashing does not include image dimensions when computing a hash, so two images of different sizes whose raw pixel bytes are identical can be mistaken for the same image. This causes incorrect cache hits and potential data leakage.
Technical detail
The MultiModalHasher in vllm/multimodal/hasher.py hashes PIL images using only PIL.Image.tobytes(), omitting metadata such as width, height, and mode. Images of different dimensions whose raw pixel bytes are identical therefore produce the same hash. The resulting incorrect cache lookups (CWE-1288: Improper Validation of Consistency within Input; CWE-1023: Incomplete Comparison with Missing Factors) can potentially lead to unauthorized access to cached model outputs or data leakage.
Summary generated and translated by AI from the official description.
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.7.0 up to but not including 0.9.0, the MultiModalHasher class in vllm/multimodal/hasher.py has a security and data integrity issue in its image hashing method. It serializes PIL.Image.Image objects using only obj.tobytes(), which returns the raw pixel data without metadata such as the image’s width, height, and mode. As a result, two images of different sizes (e.g., 30x100 and 100x30) with the same pixel byte sequence can generate the same hash value. This may lead to hash collisions, incorrect cache hits, and even data leakage or other security risks. The issue has been patched in version 0.9.0.
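The collision described above can be reproduced without vLLM or PIL. The following is a minimal sketch, not the vLLM code or the actual 0.9.0 patch: `RawImage`, `pixel_only_hash`, and `shape_aware_hash` are hypothetical names introduced for illustration, and SHA-256 is an arbitrary choice of hash function. `RawImage.pixels` stands in for the bytes that `tobytes()` would return.

```python
import hashlib
from typing import NamedTuple


class RawImage(NamedTuple):
    """Hypothetical stand-in for a PIL image: mode, size, raw pixel bytes."""
    mode: str
    width: int
    height: int
    pixels: bytes  # analogous to the output of Image.tobytes()


def pixel_only_hash(img: RawImage) -> str:
    # Mirrors the weakness: only the raw pixel bytes contribute to the hash.
    return hashlib.sha256(img.pixels).hexdigest()


def shape_aware_hash(img: RawImage) -> str:
    # Illustrative mitigation (an assumption, not the upstream fix):
    # hash a length-prefixed mode and fixed-width dimensions before the pixels.
    mode_bytes = img.mode.encode("ascii")
    h = hashlib.sha256()
    h.update(len(mode_bytes).to_bytes(1, "big"))
    h.update(mode_bytes)
    h.update(img.width.to_bytes(4, "big"))
    h.update(img.height.to_bytes(4, "big"))
    h.update(img.pixels)
    return h.hexdigest()


# 3000 bytes of 8-bit grayscale data, interpreted as 30x100 and as 100x30.
pixels = bytes(i % 256 for i in range(3000))
tall = RawImage("L", 30, 100, pixels)
wide = RawImage("L", 100, 30, pixels)

collides = pixel_only_hash(tall) == pixel_only_hash(wide)
distinct = shape_aware_hash(tall) != shape_aware_hash(wide)
print("pixel-only hashes collide:", collides)
print("shape-aware hashes differ:", distinct)
```

With a pixel-only hash, a cache keyed on the digest would treat the 30x100 and 100x30 images as the same entry. Including mode and dimensions in the hashed material separates them.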
CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:L/I:N/A:L
Affected products
vllm-project · vllm