
past_key_value(s) name inconsistency causing problems #36290

Open · 2 of 4 tasks
HDCharles opened this issue Feb 19, 2025 · 0 comments
HDCharles commented Feb 19, 2025

System Info

  • transformers version: 4.50.0.dev0
  • Platform: Linux-6.4.3-0_fbk14_zion_2601_gcd42476b84e9-x86_64-with-glibc2.34
  • Python version: 3.12.9
  • Huggingface_hub version: 0.28.1
  • Safetensors version: 0.5.2
  • Accelerate version: 1.4.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.6.0.dev20241112+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: no
  • Using GPU in script?: yes
  • GPU type: NVIDIA H100

Who can help?

@ArthurZucker probably others

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run the example at https://huggingface.co/docs/transformers/main/en/quantization/torchao (see the sketch below).
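For reference, the failing snippet is roughly the generation example from that docs page (model name and quantization options as documented at the time; treat this as a paraphrase, not an exact copy):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TorchAoConfig

model_name = "meta-llama/Meta-Llama-3-8B"
quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    quantization_config=quantization_config,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
input_ids = tokenizer("What are we having for dinner?", return_tensors="pt").to("cuda")

# cache_implementation="static" makes generate() compile the forward pass with
# torch.compile, which is where the device-placement skip goes wrong.
output = quantized_model.generate(**input_ids, max_new_tokens=10, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```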

Expected behavior

No error.


This error is related to #36289.

A bunch of models use past_key_value and past_key_values interchangeably. This causes problems because the kwarg names to skip during device placement are hardcoded in the _skip_keys_device_placement attribute, so whichever spelling is missing from that attribute is not skipped. That breaks any time torch.compile is used with an affected model.
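A minimal sketch of the failure mode (simplified and illustrative; the real placement logic lives in accelerate's hooks, and everything here except _skip_keys_device_placement is a made-up stand-in):

```python
# Simplified, self-contained illustration of the mismatch -- NOT the actual
# transformers/accelerate code. Only _skip_keys_device_placement is real;
# the class and helper below are stand-ins.
class DecoderLayer:
    # The skip list is hardcoded with the plural spelling...
    _skip_keys_device_placement = "past_key_values"

    # ...while the forward signature uses the singular spelling.
    def forward(self, hidden_states, past_key_value=None):
        pass

def kwargs_moved_to_device(module, kwargs):
    # Device placement moves every kwarg whose name is NOT in the skip list.
    skip = module._skip_keys_device_placement
    skip = {skip} if isinstance(skip, str) else set(skip)
    return [name for name in kwargs if name not in skip]

layer = DecoderLayer()
print(kwargs_moved_to_device(layer, {"hidden_states": 0, "past_key_value": 0}))
# ['hidden_states', 'past_key_value'] -- the cache kwarg is moved even though
# the attribute was meant to skip it, which is what trips up torch.compile.
```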
The above PR fixes the issue for Llama, but other models have the same problem, which is actually breaking CI for that PR:

  • src/transformers/models/moonshine/modeling_moonshine.py
  • src/transformers/models/mistral/modeling_mistral.py
  • src/transformers/models/emu3/modeling_emu3.py
  • ...etc.

This is also the cause of pytorch/ao#1705, which is where the problem first surfaced.

Is there a reason for these two names to be used instead of just one? If not, it seems like they should be consolidated entirely to avoid such issues; if so, then _skip_keys_device_placement needs to include both names across all models.
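For the latter option, the per-model change would presumably just be listing both spellings in the attribute (the class below is illustrative, not a proposed diff):

```python
from transformers import PreTrainedModel

# Hypothetical per-model fix: include both spellings so neither kwarg is
# moved during device placement.
class SomeModelPreTrainedModel(PreTrainedModel):
    _skip_keys_device_placement = ["past_key_values", "past_key_value"]
```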

@HDCharles HDCharles added the bug label Feb 19, 2025
@HDCharles HDCharles changed the title past_key_value name consistency past_key_value(s) name inconsistency causing problems Feb 19, 2025