
In HuggingFace, every time I call a pipeline() object, I get a warning:

`"Setting `pad_token_id` to `eos_token_id`:{eos_token_id} for open-end generation."

How do I suppress this warning without suppressing all logging warnings? I want other warnings, but I don't want this one.

Rylan Schaeffer

2 Answers


The warning is emitted for any text-generation task run with HuggingFace Transformers. This is explained here, and you can see the code here. You can avoid the warning by manually setting the pad_token_id to the eos_token_id.

That is, when you call

model.generate(**encoded_input)

just change it to

model.generate(**encoded_input, pad_token_id=tokenizer.eos_token_id)

and that will get rid of the warning. However, I haven't found a way to set this directly from the pipeline interface. I'm guessing you could pass some arguments to the ArgumentHandler, but I haven't tried it.
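
For reference, here is a minimal, self-contained sketch of the generate-level fix described above; the gpt2 checkpoint and the 'test test' prompt are only placeholders for illustration:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('gpt2')  # placeholder model
model = AutoModelForCausalLM.from_pretrained('gpt2')

encoded_input = tokenizer('test test', return_tensors='pt')

# Passing pad_token_id explicitly keeps the "open-end generation" warning from being logged.
output = model.generate(**encoded_input, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))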

Jacobo Azcona

For a text-generation pipeline, you need to set the pad_token_id in the generator call to suppress the warning:

from transformers import pipeline

generator = pipeline('text-generation', model='gpt2')
# Passing pad_token_id explicitly suppresses the open-end generation warning.
sample = generator('test test', pad_token_id=generator.tokenizer.eos_token_id)
chicxulub
  • What do you mean by "to suppress the output"? @chicxulub – Yassin Sameh Jun 25 '23 at 11:51
  • I meant not having the generator print out the warning mentioned in the question: "Setting `pad_token_id` to `eos_token_id`:{eos_token_id} for open-end generation." I'll edit the answer to clarify that – chicxulub Jun 25 '23 at 21:49