13

I am new to Python and working on a MapReduce problem with mincemeat. I get the following error when running the mincemeat script.

$ python mincemeat.py -p changeme localhost
error: uncaptured python exception, closing channel <__main__.Client connected at 0x923fdcc> 
(<type 'exceptions.NameError'>:global name 're' is not defined
 [/usr/lib/python2.7/asyncore.py|read|79]
 [/usr/lib/python2.7/asyncore.py|handle_read_event|438] 
 [/usr/lib/python2.7/asynchat.py|handle_read|140]
 [mincemeat.py|found_terminator|96]
 [mincemeat.py|process_command|194]
 [mincemeat.py|call_mapfn|170]
 [raw1.py|mapfn|43])

My code is in raw1.py, which appears in the stack trace above as [raw1.py|mapfn|43].

import re
import mincemeat

# ...

allStopWords = {'about':1, 'above':1, 'after':1, 'again':1}

def mapfn(fname, fcont):
    # ...
    for item in tList[1].split():
        word = re.sub(r'[^\w]', ' ', item).lower().strip()        # ERROR
        if (word not in allStopWords) and (len(word) > 1):
            # ....

I have already imported re in raw1.py. The error doesn't appear if I import re in mincemeat.py.

senshin
Satyajit Singh

3 Answers

13

You need to put the import statement inside mapfn itself. mapfn is executed in a different Python process, so it doesn't have access to the original context (including imports) in which it was defined.
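A minimal sketch of that fix, not the asker's full code: the elided parts of mapfn stay elided, so this version simply splits fcont and emits (word, 1) pairs, which is an assumption made for illustration. The local stop_words set is also hypothetical; it is defined inside the function on the assumption that module-level names such as allStopWords are unavailable to the worker for the same reason re is.

```python
def mapfn(fname, fcont):
    # Import inside the function body so the name is bound wherever
    # the function actually runs, not only in raw1.py.
    import re

    # Hypothetical local copy of the stop-word table; a module-level
    # dict may be just as unreachable from the worker as the import.
    stop_words = {'about', 'above', 'after', 'again'}

    for item in fcont.split():
        # Same cleanup as the original line 43: non-word chars -> space.
        word = re.sub(r'[^\w]', ' ', item).lower().strip()
        if word not in stop_words and len(word) > 1:
            yield word, 1
```

Calling list(mapfn("doc1", "About the hat!")) locally is a quick way to check the function before handing it to mincemeat.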

Michael Fairley
  • Thanks for mincemeat! It is a great tool. Since this question may come up often, could you mention it in the GitHub wiki for mincemeat? – RAbraham Oct 15 '12 at 14:21
4

"Global" variables in Python are actually scoped to the module/file they're bound in; you do need to import them in every file that uses them.

A module name is just a variable like anything else.

Wooble
  • 1
    You can see that Satyajit _does_ import `re` in the same file where it's used. Due to the way mincemeat works, though, mapfn ends up executing in a context where it doesn't have access to the original imports. – Michael Fairley Oct 06 '12 at 02:48
  • @MichaelFairley: mincemeat.py is a separate file with no import. – Wooble Oct 07 '12 at 00:04
  • 1
    mincemeat.py is a library that is being used that has no dependency on `re`. However, even though `mapfn` is defined in raw1.py, it ends up getting executed inside of a different python process in the context of mincemeat.py. Rather than modifying the library itself, the `import` can (and should) be added to `mapfn`. – Michael Fairley Oct 07 '12 at 05:34
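The behaviour described in these comments can be reproduced without mincemeat. The sketch below is written for Python 3 and assumes, purely for illustration, that only the function's code object is shipped and then rebuilt with a fresh set of globals; rebuild_elsewhere and both sample functions are hypothetical names, and this is not taken from mincemeat's source.

```python
import builtins
import marshal
import re
import types


def uses_global_import(text):
    # Relies on the module-level "import re" above.
    return re.sub(r'[^\w]', ' ', text)


def uses_local_import(text):
    # Binds "re" locally, so it does not depend on module globals.
    import re
    return re.sub(r'[^\w]', ' ', text)


def rebuild_elsewhere(func):
    """Round-trip only the code object and attach new, nearly empty globals.

    This stands in for running the function in another process: the
    original module's namespace (including its imports) is left behind.
    """
    code = marshal.loads(marshal.dumps(func.__code__))
    fresh_globals = {'__builtins__': builtins}
    return types.FunctionType(code, fresh_globals, func.__name__)
```

Under these assumptions, calling rebuild_elsewhere(uses_global_import)("a-b") raises NameError because re is looked up in the new globals, while the rebuilt uses_local_import still works.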
0

It sounds like you've already answered this question. If you use re in mincemeat.py, you'll need to import re there as well.

damzam