1

I'm fairly new to Python, and am trying to put together a Markov chain generator. The part that's giving me problems adds each word in a list to a dictionary, associated with the word immediately following it.

def trainMarkovChain():
    """Trains the Markov chain on the list of words, returning a dictionary."""
    words = wordList()
    Markov_dict = dict()
    for i in words:
        if i in Markov_dict:
            Markov_dict[i].append(words.index(i+1))
        else:
            Markov_dict[i] = [words.index(i+1)]
    print Markov_dict

wordList() is a previous function that turns a text file into a list of words. Just what it sounds like. I'm getting an error saying that I can't concatenate strings and integers, referring to words.index(i+1), but if that's not how to refer to the next item then how is it done?

stafford
  • 1
    Use `enumerate()` to get both index as well as item. `list.index` won't work as expected if your list contains duplicate items. – Ashwini Chaudhary May 01 '14 at 10:47
  • 1
    possible duplicate of [Iterate a list as pair (current, next) in Python](http://stackoverflow.com/questions/5434891/iterate-a-list-as-pair-current-next-in-python) – Ashwini Chaudhary May 01 '14 at 10:50
  • `words.index(i) + 1` is what you want, but this fails if there are duplicate words. – Jasper May 01 '14 at 11:04
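
To illustrate the point made in the comments: `list.index` returns the position of the first occurrence only, so with repeated words every lookup lands on the first one. A minimal sketch, using a made-up sample sentence (the names `via_index` and `via_enumerate` are only for illustration):

```python
# Hypothetical sample data; "the" appears twice.
words = "the cat saw the dog".split()

# list.index always finds the first match, so both occurrences of
# "the" resolve to position 0 and the follower "dog" is never seen.
via_index = [words[words.index(w) + 1] for w in words[:-1]]

# enumerate supplies the real position of each word.
via_enumerate = [words[i + 1] for i, w in enumerate(words[:-1])]

print(via_index)
print(via_enumerate)
```

The two lists differ only in the last entry, which is where the second "the" is looked up.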

4 Answers

2

The following code, simplified a bit, should produce what you require. I'll elaborate more if something needs explaining.

words = 'Trains the Markov chain on the list of words, returning a dictionary'.split()
chain = {}
for i, word in enumerate(words):
    # ensure there's a record
    next_words = chain.setdefault(word, [])
    # break on the last word
    if i + 1 == len(words):
        break
    # append the next word
    next_words.append(words[i + 1])

print(words)
print(chain)

assert len(chain) == 11
assert chain['the'] == ['Markov', 'list']
assert chain['dictionary'] == []
famousgarkin
2

You can also do it as:

for a,b in zip(words, words[1:]):

This assigns `a` to each element of the list in turn and `b` to the element that follows it.
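
As a rough sketch of how that loop could fill the dictionary from the question (the function name `build_chain` and the sample sentence are assumptions for illustration, not part of the original answer):

```python
def build_chain(words):
    """Map each word to the list of words that follow it."""
    chain = {}
    # zip stops at the shorter input, so the last word never shows up as `a`
    for a, b in zip(words, words[1:]):
        chain.setdefault(a, []).append(b)
    return chain


# Hypothetical sample data
chain = build_chain("the cat saw the dog".split())
print(chain)
```

Note that with this approach a word that only ever appears last ("dog" here) gets no key at all, so look it up with `chain.get(word, [])` if that matters.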

sshashank124
  • Good approach, but `zip(words, words[1:])` doesn't zip in the last word, as `words[1:]` is one element shorter. – famousgarkin May 01 '14 at 11:43
  • @famousgarkin, But isn't that what the OP wants since they are checking the next element and so they should stop at the second last so that it doesn't raise an error. – sshashank124 May 01 '14 at 12:12
  • Yep, may not matter, just pointing out in case someone wonders. And +1 for simplicity. – famousgarkin May 01 '14 at 16:11
0
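def markov_chain(words):
    """Map each word to the list of words that immediately follow it."""
    markov = {}
    for index, word in enumerate(words):
        # the last word has no successor, so skip it
        if index < len(words) - 1:
            # append rather than assign, so repeated words keep every successor
            markov.setdefault(word, []).append(words[index + 1])
    return markov

The code above takes a list as input and returns the corresponding Markov chain as a dictionary that maps each word to a list of the words following it.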

0

You can use loops to get that, but it's actually a waste to have to put the rest of your code in a loop when you only need the next element.

There are two nice options to avoid this:

Option 1 - if you know the next index, just index into the list:

my_list[my_index]

Most of the time you won't know the index, though, and you may still want to avoid the for loop.


Option 2 - use iterators

& check this tutorial

my_iterator = iter(my_list)
next(my_iterator)    # no loop required
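
A slightly fuller sketch of the iterator option, assuming a small made-up list. Each `next` call returns the following element, and passing a default as the second argument avoids a `StopIteration` error once the list runs out:

```python
my_list = ["the", "cat", "saw"]  # hypothetical sample data
my_iterator = iter(my_list)

first = next(my_iterator)   # first element
second = next(my_iterator)  # the element after it
third = next(my_iterator)

# The iterator is now exhausted; the default is returned instead of raising.
after_end = next(my_iterator, None)

print(first, second, third, after_end)
```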
Javi