What is the difference between Pandas Series.apply() and Series.map()?

Question

Series.map(): Map values of Series using input correspondence (which can be a dict, Series, or function)

Series.apply(): Invoke function on values of Series. Can be ufunc (a NumPy function that applies to the entire Series) or a Python function that only works on single values

apply() seems to do almost everything map() does: it vectorizes scalar functions and applies vectorized operations as they are. Meanwhile, map() allows some control over null value handling. Apart from the historical analogy to Python's apply() and map() functions, is there a reason to prefer one over the other in general use? Why wouldn't these functions just be combined?

afaik Series.map(func) cannot pass additional arguments to func. When you use Series.apply(func), you can do sr.apply(func, convert_dtype=True, arg2='foo', arg3=True), and whatever keyword arguments that Series.apply() doesn't recognize will be passed to func, in this case arg2='foo' and arg3=True. — lineil, Jul 28 '16 at 18:59
@xg.plt.py the context of that other question is dataframes rather than series objects (and so the similarity is more profound in this case) — benjimin, Apr 01 '19 at 23:58

score 6 · Accepted Answer · edited Nov 25 '20 at 13:29

The difference is subtle:

pandas.Series.map substitutes each value of the Series with whatever the argument you pass to map associates it with.

pandas.Series.apply applies a function (potentially with extra arguments) to the values of the Series.

In practice, the difference lies in what you can pass to each method.

Both map and apply can receive a function:

import pandas as pd

s = pd.Series([1, 2, 3, 4])

def square(x):
    return x**2

s.map(square)

0     1
1     4
2     9
3    16
dtype: int64

s.apply(square)

0     1
1     4
2     9
3    16
dtype: int64

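However, map does not forward extra arguments to the function, so a function passed to map must work with a single argument. apply forwards them, while the extra 3 in the map call is taken as map's second parameter (na_action) and raises a ValueError: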

def power(x, p):
    return x**p

s.apply(power, p=3)

0     1
1     8
2    27
3    64
dtype: int64


s.map(power, 3)
---------------------------------------------------------------------------
ValueError
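
If you want to stay with map but the function needs an extra argument, one possible workaround is to bind the argument beforehand. This is a minimal sketch rather than anything from the pandas documentation: it reuses the power function from above, and the names cubed_partial, cubed_lambda and cubed_apply are made up for illustration.

```python
from functools import partial

import pandas as pd

s = pd.Series([1, 2, 3, 4])

def power(x, p):
    return x ** p

# Bind p=3 up front so that map only ever sees a one-argument callable.
cubed_partial = s.map(partial(power, p=3))

# A lambda that closes over the extra argument works the same way.
cubed_lambda = s.map(lambda x: power(x, 3))

# For comparison: apply forwards the keyword argument itself.
cubed_apply = s.apply(power, p=3)

print(cubed_partial.equals(cubed_apply))
```

All three calls should produce the same cubed values, so the choice between them is mostly a matter of readability.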

map can also receive a dictionary, or a pd.Series (in which case the index of that Series is used as the key). Values with no matching key become NaN. apply accepts neither and raises a TypeError:

dic = {1: 5, 2: 4}

s.map(dic)

0    5.0
1    4.0
2    NaN
3    NaN
dtype: float64

s.apply(dic)
---------------------------------------------------------------------------
TypeError  


s.map(s)

0    2.0
1    3.0
2    4.0
3    NaN
dtype: float64
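
Because s is mapped onto itself above, it can be hard to see which part acts as the key. A small sketch with a separate lookup Series may make it clearer; the lookup Series and the name named are hypothetical and only serve this illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# Hypothetical lookup Series: its index (1, 2) plays the role of dict keys,
# and its values ("one", "two") are what gets substituted in.
lookup = pd.Series(["one", "two"], index=[1, 2])

# Each value of s is looked up in lookup's index; 3 and 4 have no match
# there, so they come back as missing values.
named = s.map(lookup)
print(named)
```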


s.apply(s)

---------------------------------------------------------------------------
TypeError
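
The question also mentions that map gives some control over null value handling. A minimal sketch, assuming this refers to the na_action parameter of Series.map; the animals Series and the variable names are invented for the example.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value.
animals = pd.Series(["cat", "dog", np.nan], dtype=object)

# By default the function is called on the missing value as well,
# so NaN gets formatted into the string like any other value.
default = animals.map("I am a {}".format)

# With na_action="ignore", missing values are passed through untouched
# and the function is never called on them.
ignored = animals.map("I am a {}".format, na_action="ignore")

print(default)
print(ignored)
```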
