from nltk.translate.bleu_score import sentence_bleu
reference = [['this', 'is', 'ae', 'test']]
candidate = ['this', 'is', 'ad', 'test']
score = sentence_bleu(reference, candidate)
print(score)
I am using this code to calculate the BLEU score, and the score I am getting is 1.0547686614863434e-154. I wonder why I am getting such a small value when only one letter is different in the candidate list.
score = sentence_bleu(reference, candidate, weights=[1])
I tried adding weights=[1] as a parameter, and it gave me 0.75 as output. I can't understand why I have to add weights to get a reasonable result. Any help would be appreciated.
I thought it might be because the sentence is not long enough, so I added more words:
from nltk.translate.bleu_score import sentence_bleu
reference = [['this', 'is', 'ae', 'test','rest','pep','did']]
candidate = ['this', 'is', 'ad', 'test','rest','pep','did']
score = sentence_bleu(reference, candidate)
print(score)
Now I am getting 0.488923022434901, but I still think this value is too low.
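To see where these numbers could come from, here is a minimal sketch in plain Python (no NLTK) that counts the clipped n-gram matches by hand and takes their geometric mean with equal weights. The helper names ngrams, ngram_precision and geometric_mean and the floor parameter are made up for illustration and are not part of NLTK. The sketch assumes a single reference with the same length as the candidate, so the brevity penalty is 1 and is left out.

```python
import math
import sys
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(reference, candidate, n):
    """Return (matched, total) clipped n-gram counts against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched, sum(cand_counts.values())


def geometric_mean(fractions, floor=0.0):
    """Equal-weight geometric mean of matched/total fractions.

    `floor` is a hypothetical knob: the value substituted for a precision
    whose matched count is zero. With floor=0.0 the result is simply 0.
    """
    values = [matched / total if matched else floor for matched, total in fractions]
    if any(v == 0 for v in values):
        return 0.0
    # Work in log space so very small values do not underflow when multiplied.
    return math.exp(sum(math.log(v) for v in values) / len(values))


ref1 = ['this', 'is', 'ae', 'test']
cand1 = ['this', 'is', 'ad', 'test']
ref2 = ['this', 'is', 'ae', 'test', 'rest', 'pep', 'did']
cand2 = ['this', 'is', 'ad', 'test', 'rest', 'pep', 'did']

# Precisions for 1-grams up to 4-grams, the orders used by the default weights.
p1 = [ngram_precision(ref1, cand1, n) for n in range(1, 5)]
p2 = [ngram_precision(ref2, cand2, n) for n in range(1, 5)]

print(p1)                  # matched/total per n-gram order, short sentence
print(p2)                  # matched/total per n-gram order, longer sentence
print(geometric_mean(p2))  # equal-weight score for the longer sentence
# Assumption: zero counts replaced by the smallest positive float.
print(geometric_mean(p1, floor=sys.float_info.min))
```

For the short sentence the counts come out as 3/4, 1/3, 0/2 and 0/1: the single changed token breaks every 3-gram and 4-gram, so any score that multiplies all four precisions collapses, while weights=[1] uses only the unigram precision 3/4 = 0.75. For the longer sentence the counts are 6/7, 4/6, 2/5 and 1/4, whose geometric mean is about 0.4889, which matches the second result. The tiny first value is consistent with zero counts being replaced by the smallest positive float, but that is an assumption about how the installed NLTK version handles them, not something the sketch confirms.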