In Python, I am using the mincemeat map-reduce framework.

From my map function I would like to yield (k, v) pairs in a loop, which sends the output to the reduce function. Here is sample output from my map function:

auth3 {'practical': 1, 'volume': 1, 'physics': 1} 
auth34 {'practical': 1, 'volume': 1, 'chemistry': 1}
....

There would be many such entries; these are just a few examples.

Here, auth3 and auth34 are the keys and the respective values are dictionaries.

Inside the reduce function, when I try to print the keys and values, I get a "too many values to unpack" error. My reduce function looks like this:

def reducefn(k, v):     
    for k,val in (k,v):
        print k, v

Please let me know how to resolve this error.

senshin
user2339307

3 Answers

First, build your dictionary with Python's built-in dict:

>>> dic1 = dict(auth3 = {'practical': 1, 'volume': 1, 'physics': 1}, 
        auth34 = {'practical': 1, 'volume': 1, 'chemistry': 1} )
>>> dic1
{'auth3': {'practical': 1, 'volume': 1, 'physics': 1}, 
        'auth34': {'practical': 1, 'volume': 1, 'chemistry': 1}}

Then your reduce function could look like this:

def reducefn(dictofdicts):     
    for key, value in dictofdicts.iteritems() :
        print key, value

In the end,

>>> reducefn(dic1)
auth3 {'practical': 1, 'volume': 1, 'physics': 1}
auth34 {'practical': 1, 'volume': 1, 'chemistry': 1}
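
If the framework calls the reduce function with a key and a list of every value emitted for that key (an assumption about mincemeat's calling convention that is worth checking against its README), a reducer that merges the per-author dictionaries might be sketched as follows. This is Python 3 syntax, it does not import mincemeat, and the sample call is made by hand purely for illustration.

```python
from collections import Counter

def reducefn(k, vs):
    # Assumption: vs is a list of dicts, one per (k, dict) pair
    # yielded by the map function for the key k.
    total = Counter()
    for counts in vs:
        total.update(counts)  # Counter.update adds counts for matching words
    return dict(total)

# Hand-made call standing in for what the framework would pass in.
print(reducefn('auth3', [{'practical': 1, 'volume': 1, 'physics': 1},
                         {'practical': 1, 'chemistry': 1}]))
```

Using a Counter keeps the merge to one line per dictionary; a plain dict with `get(word, 0) + n` would work just as well.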
kiriloff
def reducefn(*dicts): #collects multiple arguments and stores in dicts
    for dic in dicts: #go over each dictionary passed in
        for k,v in dic.items(): #go over key,value pairs in the dic
            print(k,v) 

reducefn({'practical': 1, 'volume': 1, 'physics': 1} ,{'practical': 1, 'volume': 1, 'chemistry': 1})

Produces

>>> 
physics 1
practical 1
volume 1
chemistry 1
practical 1
volume 1

Now, regarding your implementation:

def reducefn(k, v):

The function signature above takes two arguments, which are accessed inside the function as k and v respectively. So an invocation of reducefn({"key1":"value"},{"key2":"value"}) results in k being assigned {"key1":"value"} and v being assigned {"key2":"value"}.

When you try to invoke it like so: reducefn(dic1,dic2,dic3,...) you are passing in more arguments than the signature of reducefn allows.

for k,val in (k,v):

Now, assuming you passed in two dictionaries to reducefn, both k and v would be dictionaries. The for loop above would be equivalent to:

>>> a = {"Name":"A"}
>>> b = {"Name":"B"}
>>> for (d1,d2) in (a,b):
    print(d1,d2)

Which gives the following error:

ValueError: need more than 1 value to unpack

This occurs because you're essentially doing this when the for loop is invoked:

d1,d2=a

You can see the same error when we try that assignment in a REPL:

>>> d1,d2=a
Traceback (most recent call last):
  File "<pyshell#24>", line 1, in <module>
    d1,d2=a
ValueError: need more than 1 value to unpack
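
The example above uses one-key dictionaries, so the unpacking comes up short. With the three-key dictionaries from the question the same assignment fails in the other direction, which is where "too many values to unpack" comes from. A small sketch to see both cases (Python 3 syntax; `try_unpack` is a hypothetical helper, and the exact wording of the messages varies between Python versions):

```python
def try_unpack(d):
    """Try to unpack the keys of dict d into exactly two names."""
    try:
        d1, d2 = d  # iterating over a dict yields its keys
    except ValueError as exc:
        return str(exc)  # the error message, e.g. "too many values ..."
    return (d1, d2)

print(try_unpack({"Name": "A"}))                                 # one key: too few
print(try_unpack({'practical': 1, 'volume': 1, 'physics': 1}))   # three keys: too many
print(try_unpack({'practical': 1, 'volume': 1}))                 # two keys: succeeds
```

Only a dictionary with exactly two keys unpacks cleanly into two names, and even then the names receive keys, not key/value pairs.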

We could do this:

>>> for (d1,d2) in [(a,b)]:
    print(d1,d2)


{'Name': 'A'} {'Name': 'B'}

This works because each element of the list is the tuple (a,b), which is unpacked so that d1 is a and d2 is b. Written as a plain assignment, the unpacking looks like this:

d1,d2 = (a,b)

However, wrapping the arguments this way in our loop, for k,val in [(k,v)]:, would be pointless: k and val would simply refer to the same objects that k and v did originally. What we actually need is to go over the key,value pairs inside each dictionary. And since we need to cope with n dictionaries rather than exactly two, we need to rethink the function definition.

Hence:

def reducefn(*dicts):

When you invoke the function like this:

reducefn({'physics': 1},{'volume': 1, 'chemistry': 1},{'chemistry': 1})

*dicts collects the arguments, in such a way that dicts ends up as:

({'physics': 1}, {'volume': 1, 'chemistry': 1}, {'chemistry': 1})

As you can see, the three dictionaries passed into the function were collected into a tuple. Now we iterate over the tuple:

for dic in dicts:

On each iteration, dic is one of the dictionaries we passed in, so we go ahead and print the key,value pairs inside it:

for k,v in dic.items(): 
    print(k,v) 
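
Putting the pieces together, here is a variant that returns the pairs instead of printing them inside the loop, which makes the behaviour easier to check. It is only a sketch in Python 3 syntax, and `collect_pairs` is a hypothetical name, not part of mincemeat.

```python
def collect_pairs(*dicts):
    # dicts is a tuple holding every positional argument passed in
    pairs = []
    for dic in dicts:
        for k, v in dic.items():  # (key, value) pairs of one dictionary
            pairs.append((k, v))
    return pairs

pairs = collect_pairs({'physics': 1}, {'volume': 1, 'chemistry': 1}, {'chemistry': 1})
for k, v in pairs:
    print(k, v)
```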
HennyH

Use zip

def reducefn(k, v):
    for k,val in zip(k,v):
        print k, v


>>> reducefn({'practical': 1, 'volume': 1, 'physics': 1} ,{'practical': 1, 'volume': 1,     'chemistry': 1})

practical {'practical': 1, 'volume': 1, 'chemistry': 1}
volume {'practical': 1, 'volume': 1, 'chemistry': 1}
physics {'practical': 1, 'volume': 1, 'chemistry': 1}
>>> 

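This treats the arguments of reducefn(k,v) as a pair of parallel sequences, ((k1,k2,k3,...), (v1,v2,v3,...)).

Zipping them gives ((k1,v1), (k2,v2), (k3,v3), ...). Note that when both arguments are dictionaries, as in the call above, zip pairs the keys of the first dictionary with the keys of the second, and the loop prints v (the whole second dictionary) rather than val. That is why the same dictionary appears on every line of the output.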
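
To check what zip does with two dictionaries, and how to get genuine key/value pairs instead, a short sketch may help. It assumes Python 3.7 or later, where dictionaries keep insertion order; the variable names are only for illustration.

```python
k = {'practical': 1, 'volume': 1, 'physics': 1}
v = {'practical': 1, 'volume': 1, 'chemistry': 1}

# Iterating a dict yields its keys, so zip pairs keys with keys.
key_pairs = list(zip(k, v))

# To pair each key with its own value, use items() on a single dict.
items = list(k.items())

print(key_pairs)
print(items)
```

If the keys and values really do arrive as two separate sequences, zip(keys, values) is the right tool; for dictionaries, items() is usually what is wanted.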

jamylak
Bhavish Agarwal