
I've downloaded mincemeat.py, along with its example, from https://github.com/michaelfairley/mincemeatpy/zipball/v0.1.2

example.py is as follows:

#!/usr/bin/env python
import mincemeat

data = ["Humpty Dumpty sat on a wall",
        "Humpty Dumpty had a great fall",
        "All the King's horses and all the King's men",
        "Couldn't put Humpty together again",
       ]

datasource = dict(enumerate(data))

def mapfn(k, v):
    for w in v.split():
        yield w, 1

def reducefn(k, vs):
    result = sum(vs)
    return result

s = mincemeat.Server()
s.datasource = datasource
s.mapfn = mapfn
s.reducefn = reducefn

results = s.run_server(password="changeme")
print results

It is a word-counting program.

I connected two computers over a LAN. I used one computer as the server and ran example.py on it. On the second computer, acting as the client, I ran mincemeat.py with the following command:

python mincemeat.py -p changeme server-IP

It works fine.

Now I have connected three computers to a LAN through a router. One machine acts as the server, where I want to run example.py, and the remaining two machines should run as clients.

I want to distribute the work to my two client machines. What is the process for distributing the map and reduce tasks defined in example.py to the two client computers, each of which has its own IP address?

senshin
priyank

1 Answer


The default example contains fewer than 50 words, so by the time you switch windows to start the second client, the first client has already finished processing the text. Run the same job on a large text file instead, which leaves time to add a second client. The code below should work. I used the plain-text version of the novel Ulysses (~1.5 MB) from Project Gutenberg for this example.

On my machine (Intel Xeon @ 3.10 GHz), this took less than 30 seconds with 2 clients. So use a bigger file or a list of files, or be quick to start the second client.

#!/usr/bin/env python
import mincemeat

def file_contents(file_name):
    f = open(file_name)
    try:
        return f.read()
    finally:
        f.close()

novel_name = 'Ulysses.txt'

# The data source can be any dictionary-like object
datasource = {novel_name:file_contents(novel_name)}

def mapfn(k, v):
    for w in v.split():
        yield w, 1

def reducefn(k, vs):
    result = sum(vs)
    return result

s = mincemeat.Server()
s.datasource = datasource
s.mapfn = mapfn
s.reducefn = reducefn

results = s.run_server(password="changeme")
print results
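
One caveat with a single-file datasource: as far as I can tell, mincemeat hands out one map task per key in the datasource, so a dictionary with a single entry gives the map phase only one task to share among the clients. A rough sketch of splitting the text into several entries follows. The helper name chunk_lines and the lines_per_chunk parameter are made up for illustration and are not part of mincemeat, and the sample text stands in for the novel so the snippet runs on its own.

```python
def chunk_lines(text, lines_per_chunk=1000):
    """Hypothetical helper: split text into a dict of chunk index -> chunk text."""
    lines = text.splitlines()
    chunks = {}
    for start in range(0, len(lines), lines_per_chunk):
        # Each dict entry holds up to lines_per_chunk lines of the original text.
        chunks[start // lines_per_chunk] = "\n".join(
            lines[start:start + lines_per_chunk])
    return chunks

# Small in-memory stand-in for the novel, so the sketch is self-contained.
sample_text = "\n".join("line %d of the text" % i for i in range(10))

# Ten lines at four lines per chunk gives three datasource entries.
datasource = chunk_lines(sample_text, lines_per_chunk=4)
```

In the server script, the equivalent would be datasource = chunk_lines(file_contents(novel_name)) in place of the single-entry dictionary. The helper runs only on the server, so mapfn and reducefn stay unchanged.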

For a directory of files, use the following example. Put all the text files in a folder named textfiles.

#!/usr/bin/env python
import mincemeat
import glob

all_files = glob.glob('textfiles/*.txt')

def file_contents(file_name):
    f = open(file_name)
    try:
        return f.read()
    finally:
        f.close()

# The data source can be any dictionary-like object
datasource = dict((file_name, file_contents(file_name))
                  for file_name in all_files)

def mapfn(k, v):
    for w in v.split():
        yield w, 1

def reducefn(k, vs):
    result = sum(vs)
    return result

s = mincemeat.Server()
s.datasource = datasource
s.mapfn = mapfn
s.reducefn = reducefn

results = s.run_server(password="changeme")
print results
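
Before involving several machines, it can help to check mapfn and reducefn in a single process. The sketch below does not use mincemeat at all: run_locally is a hypothetical helper that only mimics the map, group-by-key, and reduce steps, assuming the framework groups mapped values by key before calling reducefn.

```python
from collections import defaultdict

def mapfn(k, v):
    for w in v.split():
        yield w, 1

def reducefn(k, vs):
    return sum(vs)

def run_locally(datasource, mapfn, reducefn):
    """Hypothetical helper: map every entry, group by key, then reduce."""
    grouped = defaultdict(list)
    for key, value in datasource.items():
        for out_key, out_value in mapfn(key, value):
            grouped[out_key].append(out_value)
    # One reduce call per distinct key produced by the map step.
    return dict((k, reducefn(k, vs)) for k, vs in grouped.items())

data = ["Humpty Dumpty sat on a wall",
        "Humpty Dumpty had a great fall"]
results = run_locally(dict(enumerate(data)), mapfn, reducefn)
```

If the counts in results look right here, the same mapfn and reducefn can be assigned to the mincemeat server as in the scripts above.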
Sundeep