Jupyter string tokenization for python

Question

I'm trying to implement simple_tokenize using the dictionary produced by my previous code, but I get an error message. Any assistance with the following code would be much appreciated. I'm using Python 2.7 in Jupyter.

import csv
reader = csv.reader(open('data.csv'))

dictionary = {}
for row in reader:
    key = row[0]
    dictionary[key] = row[1:]
print dictionary

The above works well, but the problem is with the following:

import re

words = dictionary
split_regex = r'\W+'

def simple_tokenize(string):

    for i in rows:
        word = words.split
    #pass

print word

I get this error:

NameError                                 Traceback (most recent call last)
<ipython-input-2-0d0e05fb1556> in <module>()
      1 import re
      2 
----> 3 words = dictionary
      4 split_regex = r'\W+'
      5 

NameError: name 'dictionary' is not defined

score 0 · Answer 1 · edited May 23 '17 at 12:24

Variables are not saved between Jupyter sessions unless you explicitly save them yourself. So if you ran the first code block, quit your Jupyter session, started a new session, and ran the second code block, dictionary is not carried over from the first session and is therefore undefined, as the error indicates.
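One way to do that explicit saving is the standard-library pickle module. The following is only a sketch: the file name dictionary.pkl, the temporary-directory location, and the sample contents are made up for illustration and are not from the question.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the dictionary built from data.csv.
dictionary = {'id1': ['first row text', 'more text'], 'id2': ['second row']}

# Hypothetical file location; any writable path would do.
path = os.path.join(tempfile.gettempdir(), 'dictionary.pkl')

# At the end of the first session: write the dictionary to disk.
with open(path, 'wb') as f:
    pickle.dump(dictionary, f)

# At the start of a later session: read it back before using it.
with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored == dictionary)
```

In a real notebook the dump and the load would live in different sessions; they are shown together here only so the sketch runs on its own.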

If you ran the code blocks some other way (e.g., not across Jupyter sessions), you should say so, but the tags and traceback suggest this is what happened.
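Running both blocks in the same session avoids the NameError, though simple_tokenize as posted still loops over an undefined rows and never calls split. Below is a minimal sketch of what the function may have been meant to do, assuming the goal is to lowercase a string and split it on split_regex. The lowercasing, the tokens name, and the sample lines standing in for data.csv are assumptions, not part of the question.

```python
import csv
import re

# Invented sample lines standing in for the contents of data.csv.
sample_lines = ['id1,Hello world!,second field', 'id2,"A, b; c",']

# Build the dictionary in the same session that will use it.
dictionary = {}
for row in csv.reader(sample_lines):
    dictionary[row[0]] = row[1:]

split_regex = r'\W+'

def simple_tokenize(string):
    """Lowercase the string and split it on runs of non-word characters."""
    # Drop the empty strings re.split yields at leading/trailing separators.
    return [token for token in re.split(split_regex, string.lower()) if token]

# Tokenize every field of every row, keyed as in the original dictionary.
tokens = {}
for key, fields in dictionary.items():
    tokens[key] = [simple_tokenize(field) for field in fields]

print(tokens)
```

The function takes its input as a parameter and returns the result, rather than reading a global and printing outside the function as the posted code does.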
