
I thought I understood map vs applymap pretty well, but am having a problem (see here for additional background, if interested).

A simple example:

import pandas as pd

df  = pd.DataFrame( [[1,2],[1,1]] )
dct = { 1:'python', 2:'gator' }

df[0].map( lambda x: x+90 )
df.applymap( lambda x: x+90 )

That works as expected -- both operate on an elementwise basis, map on a series, applymap on a dataframe (explained very well here btw).

If I use a dictionary rather than a lambda, map still works fine:

df[0].map( dct )

0    python
1    python

but not applymap:

df.applymap( dct )
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-100-7872ff604851> in <module>()
----> 1 df.applymap( dct )

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in applymap(self, func)
   3856                 x = lib.map_infer(_values_from_object(x), f)
   3857             return lib.map_infer(_values_from_object(x), func)
-> 3858         return self.apply(infer)
   3859 
   3860     #----------------------------------------------------------------------

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   3687                     if reduce is None:
   3688                         reduce = True
-> 3689                     return self._apply_standard(f, axis, reduce=reduce)
   3690             else:
   3691                 return self._apply_broadcast(f, axis)

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in _apply_standard(self, func, axis, ignore_failures, reduce)
   3777             try:
   3778                 for i, v in enumerate(series_gen):
-> 3779                     results[i] = func(v)
   3780                     keys.append(v.name)
   3781             except Exception as e:

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in infer(x)
   3855                 f = com.i8_boxer(x)
   3856                 x = lib.map_infer(_values_from_object(x), f)
-> 3857             return lib.map_infer(_values_from_object(x), func)
   3858         return self.apply(infer)
   3859 

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\lib.pyd in pandas.lib.map_infer (pandas\lib.c:56990)()

TypeError: ("'dict' object is not callable", u'occurred at index 0')

So, my question is why don't map and applymap work in an analogous manner here? Is it a bug with applymap, or am I doing something wrong?

Edit to add: I have discovered that I can work around this fairly easily with this:

df.applymap( lambda x: dct[x] )

        0       1
0  python   gator
1  python  python

Or better yet via this answer which requires no lambda.

df.applymap( dct.get )

So that is pretty much exactly equivalent, right? Must be something with how applymap parses the syntax and I guess the explicit form of a function/method works better than a dictionary. Anyway, I guess now there is no practical problem remaining here but am still interested in what is going on here if anyone wants to answer.

JohnE
    df.applymap() doesn't apply .map() to each Series of the DataFrame; it applies .apply() to each Series. See Series.apply() here: [link](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.apply.html). .apply() needs a function as its argument and cannot take a dictionary the way .map() can. – Data_addict May 27 '15 at 16:13
    Sorry, I really don't understand what you are saying here. I guess that applymap and map are not equivalent, which I don't dispute, but I don't have any better understanding as to the why or how. To quote from the link above (to a very popular SO answer): "applymap works element-wise on a DataFrame, and map works element-wise on a Series." I am hoping for some elaboration on that point. – JohnE May 27 '15 at 17:42

1 Answer


It is true that .applymap() and .map() both work element-wise. However, .applymap() doesn't take each column and call .map() on it; it calls .apply() on each column instead.

So when you call df.applymap(dct), what happens is df[0].apply(dct), not df[0].map(dct).

Here is the difference between these two Series methods:

.map() accepts a Series, a dict, or a function (any callable, so methods like dict.get work too) as its first argument, whereas .apply() only accepts a function (or any callable) as its first argument.

.map() contains if statements that check whether the first argument is a dict, a Series, or a function, and it acts accordingly. When you pass a function to .map(), it does the same thing as .apply().

But .apply() doesn't have the if statements that would let it deal properly with dictionaries and Series. It only knows how to work with a callable.
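
The dispatch described above can be sketched in plain Python. This is only a rough illustration, not the pandas source: map_like and apply_like are hypothetical names, lists stand in for a Series, and the Series branch of .map() is left out.

```python
# Hypothetical, simplified stand-ins for Series.map and Series.apply.
# These are NOT the pandas implementations; they only mimic the dispatch idea.

def apply_like(values, func):
    # No type checks: func is simply called on every element.
    return [func(v) for v in values]

def map_like(values, arg):
    # Check the argument type first, as described for .map().
    if isinstance(arg, dict):
        # Look each element up in the dict (missing keys give None here).
        return [arg.get(v) for v in values]
    # Anything else is treated as a callable, exactly like apply_like.
    return apply_like(values, arg)

dct = {1: "python", 2: "gator"}
column = [1, 1]  # stands in for df[0]

mapped = map_like(column, dct)                 # dict handled explicitly
shifted = map_like(column, lambda x: x + 90)   # callable path

try:
    apply_like(column, dct)                    # the dict gets called like a function
    apply_error = None
except TypeError as exc:
    apply_error = str(exc)

print(mapped)
print(shifted)
print(apply_error)
```

The point of the sketch is that the TypeError does not come from any syntax parsing: the dict is simply called as if it were a function, because nothing checks its type first.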

When you call .apply() or .map() with a function, both end up calling lib.map_infer(), which appears to behave like Python's built-in map() function (but I have not been able to find its source code, so I'm not completely sure).

Calling map(dct, df[0]) gives the same error as df.applymap(dct), and df[0].apply(dct) gives that error too.
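
The same behaviour can be reproduced without pandas, using only the built-in map(). This is a small sketch that assumes Python 3, where map() is lazy and the error only appears once the result is consumed (here with list()); a plain list stands in for df[0].

```python
dct = {1: "python", 2: "gator"}
values = [1, 1]  # stands in for df[0]

try:
    # map() tries to call dct(1), and a dict is not callable.
    list(map(dct, values))
    builtin_error = None
except TypeError as exc:
    builtin_error = str(exc)

# Passing a callable works: dct.get and a lambda are both callables.
via_get = list(map(dct.get, values))
via_lambda = list(map(lambda x: dct[x], values))

print(builtin_error)
print(via_get)
print(via_lambda)
```

Note that dct.get and lambda x: dct[x] are not exactly equivalent: for a key that is missing from the dict, dct.get returns None, while dct[x] raises a KeyError.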

Now, you may ask: why use .apply() instead of .map(), if .map() does the same thing when called with a function and can also take a dict or a Series?

Because .apply() can return a DataFrame if the function you pass to it returns a Series.

import pandas

ser = pandas.Series([1,2,3,4,5], index=range(5))

ser_map = ser.map(lambda x : pandas.Series([x]*5, index=range(5)))
type(ser_map)
pandas.core.series.Series

ser_app = ser.apply(lambda x : pandas.Series([x]*5, index=range(5)))
type(ser_app)
pandas.core.frame.DataFrame
Data_addict
  • Thanks! The example showing how map produces a series and apply produces a dataframe also explains some results I'd gotten in the past and not understood. My understanding of all this is still a bit less than 100%, but this helps. – JohnE May 29 '15 at 03:20