I am trying to extract trigrams from a sentence and store them in a dictionary, with their frequency as the value. I wrote this:

trigrams = {}
sentence = ["What", "is", "happening", "right", "now"]

for word in sentence:
      if word != sentence[-1] or sentence[-2] and tuple((word, sentence[sentence.index(word) +1], sentence[sentence.index(word) +2])) not in trigrams:
             trigrams.update({tuple((word, sentence[sentence.index(word) +1], sentence[sentence.index(word) +2])):1})

The result should look like this: ("What","is","happening"):1, ("is","happening","right"):1, etc.

But I keep getting an IndexError on the update line.

spiderkitty

3 Answers

You can use lists instead, since your tuples' contents are all of the same data type (string).

It's probably easier to do:

trigrams = []
sentence = ["What", "is", "happening", "right", "now"]

for i in range(2,len(sentence)):
    trigrams.append([sentence[i-2],sentence[i-1],sentence[i]])
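
If the trigrams need to go into a dict with a count, as in the question, the same sliding window can be kept with tuples as keys (lists are not hashable, so they cannot be dict keys). This is only a sketch under that assumption; the name key is illustrative:

```python
sentence = ["What", "is", "happening", "right", "now"]

trigrams = {}
for i in range(2, len(sentence)):
    # Tuples are hashable, so they can serve as dictionary keys.
    key = (sentence[i - 2], sentence[i - 1], sentence[i])
    # dict.get returns 0 for an unseen trigram, so no membership test is needed.
    trigrams[key] = trigrams.get(key, 0) + 1

print(trigrams)
```

Because the loop starts at index 2 and only looks backwards, it never reads past the end of the list.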
Ben Stobbs
  • Yes, that actually looks a lot easier, but I need to test whether they are already in the dict. However, I found my mistake. Thank you for helping me! – spiderkitty Dec 18 '16 at 18:02

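I guess if word != sentence[-1] or sentence[-2] is not what you want: the part after or is just sentence[-2], a non-empty string that is always truthy, not a comparison with word. Do you mean if word != sentence[-1] and word != sentence[-2], meaning word equals neither sentence[-1] nor sentence[-2]?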
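
To make the difference concrete, here is a small sketch that evaluates both forms for the last word of the sentence. The variable names original and intended are just for illustration:

```python
sentence = ["What", "is", "happening", "right", "now"]
word = "now"

# Form from the question: the comparison is False for the last word,
# so "or" falls through and yields sentence[-2] itself, a truthy string.
original = word != sentence[-1] or sentence[-2]

# Suggested form: both operands are real comparisons against word.
intended = word != sentence[-1] and word != sentence[-2]

print(repr(original), intended)
```

So the original test does not reliably exclude the last two words. In the question's loop, "right" passes the first comparison, the body runs, and sentence[index + 2] is out of range.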

jmd_dk

If you would like to keep your code structure with the tuple and change your code minimally, you can do this (I am not claiming this is the best approach to your problem):

trigrams = {}
sentence = ["What", "is", "happening", "right", "now"]

for index, word in enumerate(sentence):
    print(index, word)  # to understand how the iteration proceeds
    if index < len(sentence)-2:
        if tuple((word, sentence[index+1], sentence[index+2])) not in trigrams:
            trigrams.update({tuple((word, sentence[index+1], sentence[index+2])):1})

You were getting an IndexError because you were accessing an element of sentence that doesn't exist while building the tuple: the check for whether you were near the end of the list (the last two elements) wasn't done correctly.

The code you were using:

if word != sentence[-1] or sentence[-2]

is not right: it ends up comparing strings rather than indexes, which is what matters here. Compare the indexes, not the values at those positions.
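
Working with positions also matters once a word repeats, because sentence.index(word) always returns the first occurrence. As a sketch of a position-based alternative that also counts frequencies, the following uses collections.Counter with zip; the helper name count_trigrams and the sample sentence are made up for illustration:

```python
from collections import Counter


def count_trigrams(words):
    """Count every run of three consecutive words in a list."""
    # zip stops at the shortest input, so the window never runs past the end.
    return Counter(zip(words, words[1:], words[2:]))


sentence = ["the", "cat", "and", "the", "dog", "and", "the", "cat", "and"]
counts = count_trigrams(sentence)

print(counts[("the", "cat", "and")])
# index() only ever finds the first "the", whatever position the loop is at.
print(sentence.index("the"))
```

A list with fewer than three words simply yields an empty Counter, so no explicit end-of-list check is needed.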

fedepad