System Info
transformers version: 4.50.0.dev0

Who can help?
@ArthurZucker, and probably others.
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Run the example from https://huggingface.co/docs/transformers/main/en/quantization/torchao
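A minimal sketch of the reproduction, following the example on the linked docs page (the model name and quantization settings are taken from that page and are incidental to the bug; the failure shows up once `cache_implementation="static"` triggers `torch.compile`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TorchAoConfig

model_name = "meta-llama/Meta-Llama-3-8B"  # model used in the linked docs example
quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

inputs = tokenizer("What are we having for dinner?", return_tensors="pt").to(model.device)
# `cache_implementation="static"` compiles the forward pass, which is where the error appears
output = model.generate(**inputs, max_new_tokens=10, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```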
Expected behavior
No error.

This error is related to #36289.
A bunch of models use `past_key_value` and `past_key_values` interchangeably. This causes problems because the kwarg names to skip during device placement are hardcoded via the `_skip_keys_device_placement` attribute, so any model with the mismatched name breaks whenever `torch.compile` is used.
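A minimal sketch of why the kwarg name matters (an illustration under stated assumptions, not the actual transformers/accelerate implementation): device-placement hooks compare kwarg names against `_skip_keys_device_placement`, so a singular/plural mismatch means the cache is treated like any other input and gets moved, which is what trips up the compiled forward.

```python
# Illustration only: a stand-in for the device-placement hook behavior.
def move_inputs_to_device(kwargs, device, skip_keys=("past_key_values",)):
    out = {}
    for name, value in kwargs.items():
        if name in skip_keys:
            # Skipped: e.g. a static cache that should stay where it is.
            out[name] = value
        else:
            # Everything else is moved to the execution device.
            out[name] = value.to(device) if hasattr(value, "to") else value
    return out

# A decoder layer whose signature is `def forward(..., past_key_value=None, ...)`
# (singular) receives its cache under the key "past_key_value", which does not
# match the skip key "past_key_values", so the cache is not skipped.
```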
The PR above fixes the issue for Llama, but other models, for example:
src/transformers/models/moonshine/modeling_moonshine.py
src/transformers/models/mistral/modeling_mistral.py
src/transformers/models/emu3/modeling_emu3.py
...etc., have the same issue, which is actually breaking CI for that PR.
This is also the cause of pytorch/ao#1705, which is where this was first surfaced.
Is there a reason for these two names to be used instead of just one? If not, it seems like they should be consolidated entirely to avoid such issues; if so, then `_skip_keys_device_placement` needs to include both names across all models (a sketch of what that could look like is below).
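For the second option, a sketch of what listing both spellings could look like (an assumption about the shape of the fix, not an accepted patch; the class name is a hypothetical placeholder):

```python
from transformers import PreTrainedModel


class ExamplePreTrainedModel(PreTrainedModel):  # hypothetical placeholder class
    # Skip the cache kwarg during device placement regardless of which
    # spelling a given model's forward signature uses.
    _skip_keys_device_placement = ["past_key_value", "past_key_values"]
```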