I am working on a homework assignment about map-reduce. In one terminal I run:

ioannis@ioannis-desktop:~$ python hw3.py

Then, in another terminal:

ioannis@ioannis-desktop:~$ ls
a2.py                  la.py~                     stopwords.py
active_output          LTP Crafting Quality Code  stopwords.pyc
Desktop                mincemeat.py               Templates
Documents              mincemeat.pyc              test.py
Downloads              Music                      test.py~
Dropbox                NetBeansProjects           test.pyc
examples.desktop       NotFor                     Ubuntu One
Firefox_wallpaper.png  Pictures                   Videos
hw3.py                 Public                     vmware
hw3.py~                __pycache__                Web Intelligence and Big Data
ioannis@ioannis-desktop:~$ python mincemeat.py -p changeme localhost
error: uncaptured python exception, closing channel <__main__.Client connected localhost:11235 at 0x27748c0> 
(<type 'exceptions.NameError'>:global name 'allStopWords' is not defined 
 [/usr/lib/python2.7/asyncore.py|read|83] 
 [/usr/lib/python2.7/asyncore.py|handle_read_event|449] 
 [/usr/lib/python2.7/asynchat.py|handle_read|140]
 [mincemeat.py|found_terminator|96] 
 [mincemeat.py|process_command|194]
 [mincemeat.py|call_mapfn|170]
 [hw3.py|mapfn|35])
ioannis@ioannis-desktop:~$ 

Here is hw3.py:

import mincemeat
import glob
from stopwords import allStopWords
text_files = glob.glob('/home/ioannis/Web Intelligence and Big Data/Week 3: Load - I/hw3data/hw3data/*')

def file_contents(file_name):
    f = open(file_name)
    try:     
        return f.read()
    except:
        print "exception!!!!!!"
    finally:
        f.close()

source = dict((file_name, file_contents(file_name))
    for file_name in text_files)

def mapfn(key, value):
    for line in value.splitlines():
            ........................
            ........................
            if word in allStopWords:
                continue        
            print(word)
        print(words_title)
        print("\n\n")

def reducefn(k, vs):
    result = sum(vs)
    return result

s = mincemeat.Server()
s.datasource = source
s.mapfn = mapfn
s.reducefn = reducefn

results = s.run_server(password="changeme")
print results

Why doesn't it work? As you can see, both hw3.py and stopwords.py are in the home directory!

senshin

1 Answer

One potential gotcha when using mincemeat.py: Your mapfn and reducefn functions don't have access to their enclosing environment, including imported modules. If you need to use an imported module in one of these functions, be sure to include import whatever in the functions themselves.

https://github.com/michaelfairley/mincemeatpy#imports

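In other words: move the from stopwords import allStopWords statement to the top of your mapfn function, so the import runs inside the function on the client rather than only at module level on the server.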
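To see why the module-level import is not enough, here is a minimal sketch that does not use mincemeat at all. It assumes the client receives only the function's code object and rebuilds it with its own globals, which is consistent with the traceback in the question (the NameError is raised in mapfn when called from mincemeat.py). The helper ship, the name STOP_WORDS, and the use of keyword.kwlist as a stand-in for stopwords.allStopWords are all hypothetical, chosen only so the example runs with the standard library.

```python
import marshal
import types

# Hypothetical stand-in for a module-level "from stopwords import allStopWords".
STOP_WORDS = ["the", "a", "of"]


def mapfn_global(key, value):
    # Relies on a name that exists only in this module's globals.
    kept = []
    for word in value.split():
        if word not in STOP_WORDS:
            kept.append(word)
    return kept


def mapfn_local(key, value):
    # The import runs inside the function, so it travels with the code.
    # keyword.kwlist stands in for stopwords.allStopWords here.
    from keyword import kwlist
    kept = []
    for word in value.split():
        if word not in kwlist:
            kept.append(word)
    return kept


def ship(fn):
    """Rebuild fn from its serialized code object with fresh globals,
    roughly what a remote worker is assumed to do."""
    code = marshal.loads(marshal.dumps(fn.__code__))
    fresh_globals = {"__builtins__": __builtins__}
    return types.FunctionType(code, fresh_globals, fn.__name__)


if __name__ == "__main__":
    try:
        ship(mapfn_global)("doc", "the banana")
    except NameError as exc:
        # The rebuilt function cannot see STOP_WORDS.
        print("global version failed: %s" % exc)

    # The rebuilt function performs its own import and works.
    print(ship(mapfn_local)("doc", "if banana or apple"))
```

The same reasoning applies to any other module-level name that mapfn or reducefn uses: anything the function needs has to be imported or defined inside the function body.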

bruno desthuilliers