I am trying to compute the BLEU score between two strings using NLTK as follows:

from nltk import bleu_score
reference = ['The moon is very bright']
hypothesis = ['Dee']
print('bleu_score.corpus_bleu(reference, hypothesis): {0}'.
      format(bleu_score.corpus_bleu(reference, hypothesis)))

Running it causes the following error:

Traceback (most recent call last):
  File "C:\Users\Francky\Documents\GitHub\nlp\tests\SEbleu.py", line 28, in <module>
    format(bleu_score.corpus_bleu(reference, hypothesis)))
  File "C:\Anaconda\lib\site-packages\nltk\translate\bleu_score.py", line 146, in corpus_bleu
    p_i = modified_precision(references, hypothesis, i)
  File "C:\Anaconda\lib\site-packages\nltk\translate\bleu_score.py", line 287, in modified_precision
    return Fraction(numerator, denominator, _normalize=False)  
  File "C:\Anaconda\lib\site-packages\nltk\compat.py", line 700, in __new__
    cls = super(Fraction, cls).__new__(cls, numerator, denominator)
  File "C:\Anaconda\lib\fractions.py", line 162, in __new__
    raise ZeroDivisionError('Fraction(%s, 0)' % numerator)
ZeroDivisionError: Fraction(0, 0)

If I replace hypothesis = ['Dee'] with hypothesis = ['Deee'], the error message disappears. Why?

My system:

  • NLTK version: 3.2.1
  • Python 2.7.11 x64
Franck Dernoncourt
  • I'm guessing this is a bug that has since been fixed, because it works for me with version `3.5.1`. – Tadhg McDonald-Jensen Dec 17 '16 at 18:50
  • @TadhgMcDonald-Jensen Thank you. I guess NLTK's Anaconda package is outdated, as I had run `conda update nltk`. – Franck Dernoncourt Dec 17 '16 at 18:53
  • `pip install -U nltk` or `pip install -U https://github.com/alvations/nltk/archive/develop.zip`. Note that you are also using the function incorrectly; see the usage in the docstring within the code: https://github.com/alvations/nltk/blob/develop/nltk/translate/bleu_score.py#L82 – alvas Dec 18 '16 at 23:44
  • The input for the reference parameter is a "list of references", i.e. a list of lists of lists of strings, not just a list of strings. The innermost strings are tokens; a list of strings makes a sentence; a list of lists of strings makes the multiple references for one hypothesis; and a list of lists of lists of strings makes multiple instances of multiple references per hypothesis. I hope that's clear enough; look at the docstring in the code. – alvas Dec 18 '16 at 23:47
  • BTW, regarding the old implementation of BLEU and why 'dee' vs 'deee' behaves differently: it's because of how you're using the function. You are comparing characters and not words: since a string is a sequence of chars, it treats each char as a word. Hence 'deee' can generate 4-grams, which the old version of BLEU needs, but 'dee' can't =) – alvas Dec 19 '16 at 00:57
  • @alvas Thanks! That explains it :) You're welcome to convert your comment into an answer. Good to know regarding the proper use. I guess I'll stop using conda to manage the NLTK package so that I can get the latest version. – Franck Dernoncourt Dec 19 '16 at 01:00
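
To make the two points from the comments concrete, here is a minimal sketch (not NLTK's own code). The helpers `ngrams`, `to_corpus_bleu_inputs` and `score_pair` are hypothetical names introduced only for illustration. The first shows that a bare string is iterated character by character, so 'Dee' yields no 4-grams while 'Deee' yields one; the second wraps a single reference/hypothesis pair in the nesting described above. Splitting on whitespace is an assumption; a real tokenizer may be preferable.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def to_corpus_bleu_inputs(reference, hypothesis):
    """Wrap one reference/hypothesis string pair in nested token lists."""
    # corpus level -> references for one hypothesis -> tokens of one reference
    list_of_references = [[reference.split()]]
    # corpus level -> tokens of one hypothesis
    hypotheses = [hypothesis.split()]
    return list_of_references, hypotheses


def score_pair(reference, hypothesis):
    """Score one pair with NLTK (imported lazily; requires NLTK to be installed)."""
    from nltk.translate.bleu_score import corpus_bleu
    refs, hyps = to_corpus_bleu_inputs(reference, hypothesis)
    return corpus_bleu(refs, hyps)


# A bare string behaves like a list of single-character "words".
print(len(ngrams('Dee', 4)))   # 3 characters: no 4-gram can be formed
print(len(ngrams('Deee', 4)))  # 4 characters: exactly one 4-gram

refs, hyps = to_corpus_bleu_inputs('The moon is very bright', 'Dee')
print(refs)
print(hyps)
```

With the old implementation, an empty set of 4-grams gives a modified precision of 0/0, which matches the `Fraction(0, 0)` in the traceback. `score_pair` is only defined here, not called; how it behaves on very short hypotheses depends on the installed NLTK version.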
