
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. #33498

Open

asmith26 opened this issue Sep 15, 2024 · 8 comments · Fixed by #33509 · May be fixed by #35753

asmith26 commented Sep 15, 2024

System Info

  • transformers version: 4.44.2
  • Platform: Linux-6.8.0-44-generic-x86_64-with-glibc2.39
  • Python version: 3.12.3
  • Huggingface_hub version: 0.24.7
  • Safetensors version: 0.4.5
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.1+cu121 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No

Who can help?

speech models: @ylacombe, @eustlb
pipelines: @Rocketknight1

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

import torch 
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-base.en",
    device="cpu",
    torch_dtype=torch.float32,
)

# https://github.com/openai/whisper/blob/main/tests/jfk.flac
pipe("./jfk.flac")

Expected behavior

This does return the expected output:

{'text': ' And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.'}

But it also prints the following warning, so it would be nice to fix or suppress it:

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.

Thanks!
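(Not part of the original report.) For anyone who only needs to silence the warning in the meantime, a minimal sketch is to lower transformers' log verbosity before building the pipeline. Note this is a blunt instrument: it hides all transformers warnings, not just this one.

from transformers.utils import logging

# Hide all transformers warnings (errors still show).
# Call this before constructing or running the pipeline.
logging.set_verbosity_error()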

@asmith26 asmith26 added the bug label Sep 15, 2024
@asmith26 (Author)
Related: openai/whisper#2335

@Rocketknight1 (Member)
@asmith26 thanks for the issue! I've reproduced it here, will open a PR to fix in a sec.

@ritwikmishra
I observed this when I was fine-tuning an LLM with the PPO trainer. To resolve the warning, I passed the attention mask as a named parameter to the generate function, following this.

outputs = model.generate(
  inputs['input_ids'], 
  attention_mask=attention_mask,
  pad_token_id=tokenizer.eos_token_id
)

But then I observed an error, "IndexError: too many indices for tensor of dimension 1", raised from this line of

lib/python3.9/site-packages/transformers/models/gemma/modeling_gemma.py
position_ids_expanded = position_ids[:, None, :].float()  # let us call this line_e

I turned off the attention mask and, using print statements before line_e, inspected what its ideal behavior is (the original warning still appeared, but I ignored it). I saw that the position ids are fed in one by one, so to resolve the error I just unsqueezed the attention mask:

outputs = model.generate(
  inputs['input_ids'], 
  attention_mask=attention_mask.unsqueeze(0),
  pad_token_id=tokenizer.eos_token_id
)

and it worked fine.
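As an aside (not from the thread): the IndexError suggests the hand-built attention mask was 1-D, i.e. missing the batch dimension. A minimal sketch that avoids the manual unsqueeze is to take the mask straight from the tokenizer, which already returns batched tensors; it assumes a prompt string plus the same tokenizer and model as above:

# Sketch (assumes `tokenizer`, `model`, and a `prompt` string are defined).
# The tokenizer returns input_ids and attention_mask already shaped
# (batch_size, seq_len), so no manual unsqueeze is needed.
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    pad_token_id=tokenizer.eos_token_id,
)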

asmith26 commented Nov 4, 2024

Thanks for your help with this @Rocketknight1. Just thought I'd mention I still seem to be getting the same warning (I'm currently running transformers == 4.47.0.dev0).

Thanks again!

@Rocketknight1 (Member)
@asmith26 I'm not getting that warning when I run the code sample above anymore. Did you change anything about it?

asmith26 commented Nov 5, 2024

Interesting, thanks for the info @Rocketknight1.

I've determined that if I add chunk_length_s=30 (i.e. outputs = pipe("./jfk.flac", chunk_length_s=30), following this tutorial), I get the The attention mask is not set and... warning again.

Happy to remove this argument for my need. Thanks again! :)
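For reference, a self-contained version of the call that still triggers the warning, combining the original report's pipeline with the chunk_length_s argument:

import torch
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-base.en",
    device="cpu",
    torch_dtype=torch.float32,
)

# Chunked long-form transcription: this variant still emits the warning.
outputs = pipe("./jfk.flac", chunk_length_s=30)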

@Rocketknight1 (Member)
That's still potentially an issue we should address, though! Even though you've found a fix, I'll reopen to make sure we don't lose track.

lolbus commented Jan 11, 2025

> [quoting @ritwikmishra's workaround from the comment above]

This is for LLMs, though. For ASR, I don't think we declare a tokenizer the same way you did, since the tokenizer is already well defined within the ASR model, such as Whisper.

@eustlb eustlb reopened this Jan 15, 2025
@huggingface huggingface deleted a comment from github-actions bot Jan 15, 2025
@eustlb eustlb linked a pull request Jan 17, 2025 that will close this issue