I want to calculate the perplexity of a prefix of a text sequence, as well as the perplexity of the generated answer, but I've noticed an idiosyncrasy.
The perplexity computed for the prefix "My name is" differs depending on whether the model input is "My name is" or "My name is Dennis". Why?
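For reference, I'm computing perplexity as the exponential of the mean per-token negative log-likelihood over the tokens of interest, i.e. torch.exp(nll.mean()) in the code below:

perplexity = exp( (1/N) * sum_i NLL_i )

where NLL_i is the cross-entropy of the model's prediction at position i, and N is the number of tokens in the span being scored.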
Inputs = "My name is Dennis"
inputs = tokenizer("My name is Dennis", return_tensors="pt")
input_ids = inputs.input_ids
# Forward pass through the model
outputs = model(input_ids, labels=input_ids)
logits = outputs.logits
# Per-token negative log-likelihood (reduction='none' keeps one value per token)
logits_flat = logits.view(-1, logits.shape[-1])
input_ids_flat = inputs.input_ids.view(-1)
nll_loss = F.cross_entropy(logits_flat, input_ids_flat, reduction='none')
# Reshape the loss tensor to match the shape of the input_ids tensor
nll_loss = nll_loss.view(inputs.input_ids.shape)
# Perplexity over the prefix tokens only, plus the slice shape as a sanity check
prompt_length = len(tokenizer.encode("My name is"))
torch.exp(nll_loss[:, :prompt_length].mean()).item(), nll_loss[:, :prompt_length].shape
Out: (4.582338809967041, torch.Size([1, 4]))
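To sanity-check which tokens the prefix slice actually covers (the T5 tokenizer appends an </s> token, which could shift the window), this small check can be appended to the run above; it reuses input_ids and prompt_length from that snippet:

# Token strings for the full input, and the number of positions the prefix slice keeps
print(tokenizer.convert_ids_to_tokens(input_ids[0].tolist()))
print(prompt_length)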
Inputs = "My name is"
inputs = tokenizer("My name is", return_tensors="pt")
input_ids = inputs.input_ids
# Forward pass through the model
outputs = model(input_ids, labels=input_ids)
logits = outputs.logits
# Per-token negative log-likelihood (reduction='none' keeps one value per token)
logits_flat = logits.view(-1, logits.shape[-1])
input_ids_flat = inputs.input_ids.view(-1)
nll_loss = F.cross_entropy(logits_flat, input_ids_flat, reduction='none')
# Reshape the loss tensor to match the shape of the input_ids tensor
nll_loss = nll_loss.view(inputs.input_ids.shape)
# Perplexity over all tokens, plus the full NLL shape
torch.exp(nll_loss.mean()), nll_loss.shape
Out: (tensor(20.6034, grad_fn=<ExpBackward0>), torch.Size([1, 4]))
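To compare the two runs token by token, a small debugging loop (my own addition, not part of the measurement) can print the per-token NLL next to each token string:

# Per-token NLL, printed next to the token it scores
for tok, tok_nll in zip(tokenizer.convert_ids_to_tokens(input_ids[0].tolist()), nll_loss[0]):
    print(f"{tok}\t{tok_nll.item():.4f}")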
Imports and setup
import torch
import torch.nn.functional as F
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
Everything I've done is detailed above.