Here is the code:

import glob
import mincemeat
import re

text_files = glob.glob('finalcount/1/*')
def file_contents(file_name):
    f = open(file_name)
    try:
        return f.read()
    finally:
        f.close()

source = dict((file_name, file_contents(file_name))
          for file_name in text_files)

def mapfn(key, value):
    for line in value.splitlines():
        list1 = [ ]
        for temp in re.split('[\t]+',line):
            list1.append(temp)
        x = int(list1[1].strip());
        yield [list1[0],x]

def reducefn(key, value):
    return key, sum(value)

s = mincemeat.Server()
s.datasource = source
s.mapfn = mapfn
s.reducefn = reducefn

results = s.run_server(password="wola")
print results

This code is supposed to compute word counts across multiple files, but it keeps raising an error:

error: uncaptured python exception, closing channel <__main__.Client connected at 0x25c1990> 
(<type 'exceptions.ValueError'>:invalid literal for int() with base 10: '' 
 [C:\Python27\lib\asyncore.py|read|83] 
 [C:\Python27\lib\asyncore.py|handle_read_event|444] 
 [C:\Python27\lib\asynchat.py|handle_read|140] 
 [mincemeat.py|found_terminator|97] 
 [mincemeat.py|process_command|195] 
 [mincemeat.py|call_mapfn|171] 
 [projcount.py|mapfn|21])

The input files I am working with look like this. Each line holds a word and a count, and I want to sum the counts for each word across the different files.

fawn    24
gai 1
nunnery 11
sowell  3
sonja   29
woods   591
clotted 1
spiders 84
hanging 522

After replacing re.split with line.split(), I got this error instead:

error: uncaptured python exception, closing channel <__main__.Client connected at 0x2531990> 
(<type 'exceptions.IndexError'>:list index out of range 
 [C:\Python27\lib\asyncore.py|read|83] 
 [C:\Python27\lib\asyncore.py|handle_read_event|444] 
 [C:\Python27\lib\asynchat.py|handle_read|140] 
 [mincemeat.py|found_terminator|97]
 [mincemeat.py|process_command|195] 
 [mincemeat.py|call_mapfn|171] 
 [projcount.py|mapfn|21]) 
senshin
amian
  • You aren't converting anything to a `float` in your script, only `int`. Make sure that `list1[1].strip()` is actually a number. – Blender Jul 11 '13 at 02:54
  • Sorry, I accidentally pasted a different error. I have now replaced it with the original error. – amian Jul 11 '13 at 02:57
  • Are you sure all of your lines are actually tab-separated and not just separated by whitespace? If it's the latter, `line.split()` will take care of both. – Blender Jul 11 '13 at 02:59
  • I tried doing line.split(). I got the above error – amian Jul 11 '13 at 03:04
  • Is this the *exact* file that you're working on? – Blender Jul 11 '13 at 03:05
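
One reading consistent with both tracebacks in the question is that some input lines are blank or lack a count: `re.split('[\t]+', 'word\t')` returns `['word', '']`, so `int('')` raises `ValueError`, while `''.split()` returns `[]`, so indexing `[1]` raises `IndexError`. The thread never confirms this, so treat the following as a minimal sketch under that assumption rather than the accepted fix. It is a variant of the question's `mapfn` that skips lines it cannot parse; the sample text and the file name `sample.txt` are made up for illustration.

```python
def mapfn(key, value):
    """Yield (word, count) pairs, skipping blank or malformed lines."""
    for line in value.splitlines():
        # split() with no arguments splits on any run of whitespace
        # (tabs or spaces) and returns [] for a blank line.
        fields = line.split()
        if len(fields) != 2:
            continue  # blank line, or a line without exactly two fields
        try:
            count = int(fields[1])
        except ValueError:
            continue  # second field is not an integer
        yield fields[0], count


# Hypothetical sample: mixes tabs, spaces, a blank line and a bad line.
sample = "fawn\t24\ngai 1\n\nnunnery 11\nbroken\nwoods   591\n"
pairs = list(mapfn("sample.txt", sample))
print(pairs)
```

Silently skipping bad lines hides data problems, so while debugging it may be more useful to print the offending `key` and `repr(line)` instead of using `continue`.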

1 Answer

I got this error on a different occasion. I figured out that the problem occurs when you are using Python 3.3; I removed 3.3, installed 2.7.5 (http://python.org/download/), and it works fine now. :)

Meher
  • Different installations of Python should not be causing his problem. – RyPeck Sep 25 '13 at 12:34
  • He's downgraded version from 3.X to 2.X, and it _can_ be a solution as not all libraries work fine with both versions of Python – n1ckolas Sep 25 '13 at 12:36