I understand the usage of map for a pd.Series and apply for a pd.DataFrame, but what is the difference between using map and apply on a pd.Series? It seems to me that they essentially do the same thing:

>>> df['title'].map(  lambda value: str(value) + 'x')
>>> df['title'].apply(lambda value: str(value) + 'x')

It seems both just pass each value to a function or mapping. Is there an actual difference between the two, and if so, what would be an example showing it? Or are they interchangeable when applied to a pd.Series?


For reference, from the docs:

In the docs' examples, map uses a dict and apply uses a func, but really, they seem the same? Both can take a function.

David542

1 Answer

The "See also" section of the Series.map documentation describes Series.apply as being "for applying more complex functions on a Series".

Series.map is for a one-to-one relation, one that can be represented by a dictionary or by a function taking one parameter and returning one value.
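As a minimal sketch of that one-to-one usage (the Series contents and the `legs` dictionary are made up for illustration), map accepts either a dict or a one-argument function:

```python
import pandas as pd

# Hypothetical sample data, not from the question.
s = pd.Series(["cat", "dog", "bird"])
legs = {"cat": 4, "dog": 4, "bird": 2}

# Each value is looked up in the dict.
by_dict = s.map(legs)

# Each value is passed to the function, which returns one value.
by_func = s.map(lambda value: value.upper())

print(by_dict.tolist())
print(by_func.tolist())
```

In both cases the result is a Series with the same index as `s`.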

Series.apply can also use functions that return more than a single value (in fact, a whole Series). In that case, the result of Series.apply will be a DataFrame.
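A small sketch of that case, assuming made-up titles of the form "letter-number"; `split_title` is a hypothetical helper, not something from the question:

```python
import pandas as pd

# Hypothetical sample data.
s = pd.Series(["a-1", "b-2"])

def split_title(value):
    # Return a whole Series per element instead of a single value.
    left, right = value.split("-")
    return pd.Series({"letter": left, "number": int(right)})

# Because the function returns a Series, apply expands the result
# into a DataFrame with one column per key.
result = s.apply(split_title)

print(type(result).__name__)
print(list(result.columns))
```

Each element of `s` becomes one row, and the labels of the returned Series become the column names.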

Said differently, you can always use apply where you use map. If you pass a dict (say d) to map, you can pass a trivial lambda to apply: lambda x: d[x]. This assumes every value is a key of d: map yields NaN for a missing key, whereas d[x] raises a KeyError. But if you use apply to transform a Series into a DataFrame, then map cannot be used.
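To illustrate the dict equivalence and its one caveat, here is a short sketch with made-up data (`d`, `s`, and `s_missing` are illustrative names):

```python
import pandas as pd

d = {"x": 10, "y": 20}
s = pd.Series(["x", "y", "x"])

# With every value present in d, both calls give the same values.
mapped = s.map(d)
applied = s.apply(lambda x: d[x])
print(mapped.tolist() == applied.tolist())

# With a value missing from d, the two behave differently.
s_missing = pd.Series(["x", "z"])
mapped_missing = s_missing.map(d)  # missing key becomes NaN

try:
    s_missing.apply(lambda x: d[x])
    apply_raised = False
except KeyError:
    apply_raised = True  # plain dict lookup fails on "z"

print(mapped_missing.isna().tolist(), apply_raised)
```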

As a result, map is likely to be more optimized than apply for one-to-one transformations, and should be used instead of apply wherever possible.

marc_s
Serge Ballesta