
Let's compare the methods that work in all cases.

Initial DataFrame:

arr = np.random.randint(10, 50, size=(1000, 1000))
df = pd.DataFrame(arr)

Apply:

%%timeit
df.apply(lambda x: x**3)
329 ms ± 117 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Transform:

%%timeit
df.transform(lambda x: x**3)
352 ms ± 48 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Applymap (very bad):

%%timeit
df.applymap(lambda x: x**3)
1.07 s ± 59.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

applymap is the least flexible and the slowest, so why does it exist?

  • 1
    Does [this](https://stackoverflow.com/questions/19798153/difference-between-map-applymap-and-apply-methods-in-pandas) answer your question? – Michael Wheeler Oct 17 '21 at 13:49
  • 2
    `applymap` is certainly not "the least flexible". Its flexibility is what makes it impossible to vectorize. – rici Oct 17 '21 at 13:56

1 Answer


A note first: transform and apply process the DataFrame column-wise (the function receives a whole Series), whereas df.applymap processes it element-wise (the function receives one value at a time). So for functions that only accept a single value, applymap can be the only one of the three that works.
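To make the column-wise versus element-wise difference concrete, here is a minimal sketch that records what each method hands to the function. The frame `small` and the two recorder functions are made up for illustration. It also assumes that on pandas 2.1 or later the element-wise method is available as DataFrame.map (applymap was renamed), so it picks whichever one exists.

```python
import pandas as pd

# Hypothetical small frame, used only for illustration.
small = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# applymap was renamed to DataFrame.map in pandas 2.1; use whichever exists.
elementwise = small.map if hasattr(small, "map") else small.applymap

apply_inputs = []    # True for every argument that was a Series
element_inputs = []  # True for every argument that was a Series


def record_column(col):
    # apply calls this with a whole column at a time.
    apply_inputs.append(isinstance(col, pd.Series))
    return col


def record_value(value):
    # The element-wise method calls this with one cell at a time.
    element_inputs.append(isinstance(value, pd.Series))
    return value


small.apply(record_column)
elementwise(record_value)

print("apply received Series:", all(apply_inputs))
print("element-wise received Series:", any(element_inputs))
```

With a function that only understands scalars, the first style breaks and the second works.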


Implementation

applymap is roughly equivalent to apply(lambda x: x.map(func)).

It is not exactly that, because pandas uses private helper methods internally.
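A quick way to convince yourself of the equivalence is to compare both spellings on a tiny frame. This is only a sketch: `small` and `func` are illustrative names, and the `hasattr` check is there because applymap was renamed to DataFrame.map in pandas 2.1.

```python
import pandas as pd

# Hypothetical small frame and function, used only for illustration.
small = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})


def func(value):
    return value ** 2


# applymap was renamed to DataFrame.map in pandas 2.1; use whichever exists.
elementwise = small.map if hasattr(small, "map") else small.applymap

via_elementwise = elementwise(func)
via_apply_map = small.apply(lambda col: col.map(func))

# Both spellings should produce the same values in the same layout.
print(via_elementwise.equals(via_apply_map))
```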

The timings for applymap and for apply with map are similar:

>>> timeit(lambda: df.applymap(lambda x: x ** 2), number=10)
3.8224549000005936
>>> timeit(lambda: df.apply(lambda x: x.map(lambda y: y ** 2)), number=10)
4.243166700000074
>>> 

Faster solutions:

That said, for the operation in this question there is no doubt about which is fastest, the vectorized operator:

>>> timeit(lambda: df ** 2, number=10)
0.016250700000455254
>>> 
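If you want to reproduce this comparison yourself, something like the following sketch could be used. It checks that the operator and the element-wise call give the same values before timing them. The frame size (200 x 200), the seed, and the repeat count are arbitrary choices made here to keep the run short; absolute numbers will differ from the ones above.

```python
from timeit import timeit

import numpy as np
import pandas as pd

# Smaller frame than in the question so the sketch runs quickly (assumption).
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.integers(10, 50, size=(200, 200)))

# applymap was renamed to DataFrame.map in pandas 2.1; use whichever exists.
elementwise = frame.map if hasattr(frame, "map") else frame.applymap

by_operator = frame ** 2
by_element = elementwise(lambda v: v ** 2)

# Compare values only; the resulting dtypes may differ between platforms.
same_values = np.array_equal(by_operator.to_numpy(), by_element.to_numpy())

t_operator = timeit(lambda: frame ** 2, number=5)
t_element = timeit(lambda: elementwise(lambda v: v ** 2), number=5)

print(f"same values: {same_values}")
print(f"operator:     {t_operator:.4f} s")
print(f"element-wise: {t_element:.4f} s")
```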

For the general case, where you need to apply an arbitrary function element-wise, np.vectorize is another option:

df[:] = np.vectorize(lambda x: x ** 2)(df)

Timings (without assignment):

>>> timeit(lambda: np.vectorize(lambda x: x ** 2)(df), number=10)
2.313548300000548
>>> 
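Note that np.vectorize returns a plain NumPy array, so the index and column labels are lost unless you put them back. A minimal sketch of doing that without assigning into the original frame, using a made-up frame `small` (the NumPy documentation describes np.vectorize as a convenience that is essentially a for loop, so it is not a true vectorization):

```python
import numpy as np
import pandas as pd

# Hypothetical labelled frame, used only for illustration.
small = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

# Wrap a scalar function so it can be called on a whole array.
square = np.vectorize(lambda v: v ** 2)

# The call returns an ndarray; rebuild a DataFrame with the original labels.
result = pd.DataFrame(
    square(small.to_numpy()),
    index=small.index,
    columns=small.columns,
)
print(result)
```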

An example where only applymap works, but apply and transform do not:

Floor-dividing each value by 100 and converting the result to the character with that Unicode code point:

>>> df.applymap(lambda x: chr(x // 100))
    0   1   2   3   4   5   6   7   8   9   10  11  12  13   ... 986 987 988 989 990 991 992 993 994 995 996 997 998 999
0     ـ   ⹓   禌   㪞   ག   Ǻ   䤵   ʏ   콜   ̓   鉨   ␓   ೵   ᓂ  ...   ̓   ૮   ␓   Ï   Ǻ   搀   䤵   ᮠ   Й   Й   ޘ   Ï   ૮   薌
1     ೵   ʏ   婞   Й   뺜   薌   ـ   ᾤ   ೵   ᾤ   ĝ   ೵   ⣵   ೵  ...   婞   ␓   关         ⣵   콜   ĝ   ૮   关      䤵   ᠂   ૮
2     婞   婞   ૮   䆜   ᾤ   뺜   Ǻ   㐳   ᇙ   Й   Ǻ   콜   ⹓   ĝ  ...   ೵   Ï   禌   ૮   ૮   ག   ƀ   콜   ⹓   湡   ޘ   Ï   㪞   禌
3     ⣵   ᇙ   ␓   뺜   Ï         Ï   鉨   㪞   콜   ꀮ   뺜   禌  ...   薌   ⣵   湡   婞   婞   㐳   Ǻ   ᇙ   d   d   ـ   ೵   鉨   ᓂ
4     ೵   द   ޘ   ⹓   ƀ   薌   ԗ   ག   ԗ   ૮   ԗ   ᮠ   ꀮ   䆜  ...   ޘ   ᇙ   㐳   ʏ   ᮠ   ⹓   搀   d   婞   ԗ   禌   Й   콜   ʏ
..   ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ...  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..
995   Ǻ   ĝ   껦   禌   ĝ   搀   ƀ   薌   ᇙ   ĝ   뺜   ᾤ   ૮   ᇙ  ...   䤵   द   ૮   द   ꀮ   婞   䤵   ĝ   ␓   薌   ␓   ƀ   ̓   鉨
996   婞   ૮   Й   薌   ـ   ག   湡   䤵   ␓   뺜   ␓   ޘ   婞   ꀮ  ...   뺜   ⣵   䤵      뺜   ೵   ೵      ⹓   ĝ   薌      ̓   d
997   䤵   Ï   ƀ   Й      ƀ   䤵   ƀ   ᓂ   ⣵      ᮠ   䤵   ૮  ...   䤵   Й   Ï   द   鉨   㪞   ⹓   ␓   关   Ǻ      ĝ   ᇙ   द
998   ʏ   ⹓   d   द   d   㪞      ꀮ   d   薌   薌   ᠂   ƀ   ̓  ...   ᮠ   ᓂ   ĝ   ག   䤵   㐳   ʏ   ⣵   㐳   ʏ   ᇙ   搀   ᓂ   ̓
999   ޘ      䤵   Ǻ   껦   ̓   ԗ   ĝ   ƀ   ꀮ   㐳   湡   搀   ⹓  ...   ޘ   搀   䤵   湡   ૮   鉨   ޘ   Ï   㐳   ƀ      禌   㪞   ـ

[1000 rows x 1000 columns]
>>> df.apply(lambda x: chr(x // 100))
TypeError: cannot convert the series to <class 'int'>
>>> 
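The failure comes from chr receiving a whole column instead of a single integer. A smaller sketch of the same behavior, using a made-up frame `codes` of code points (the exact error message can vary between Python and pandas versions, so only the exception type is relied on here):

```python
import pandas as pd

# Hypothetical frame of Unicode code points, used only for illustration.
codes = pd.DataFrame({"a": [72, 105], "b": [33, 63]})

# applymap was renamed to DataFrame.map in pandas 2.1; use whichever exists.
elementwise = codes.map if hasattr(codes, "map") else codes.applymap

# Element-wise: chr is called once per cell, with a single integer each time.
letters = elementwise(chr)
print(letters)

# Column-wise: chr is called with a whole Series, which it cannot convert.
try:
    codes.apply(chr)
    apply_error = None
except TypeError as exc:
    apply_error = exc

print("apply raised TypeError:", apply_error is not None)
```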

np.vectorize would work here too:

>>> df[:] = np.vectorize(lambda x: chr(x // 100))(df)
>>> df
    0   1   2   3   4   5   6   7   8   9   10  11  12  13   ... 986 987 988 989 990 991 992 993 994 995 996 997 998 999
0     ـ   ⹓   禌   㪞   ག   Ǻ   䤵   ʏ   콜   ̓   鉨   ␓   ೵   ᓂ  ...   ̓   ૮   ␓   Ï   Ǻ   搀   䤵   ᮠ   Й   Й   ޘ   Ï   ૮   薌
1     ೵   ʏ   婞   Й   뺜   薌   ـ   ᾤ   ೵   ᾤ   ĝ   ೵   ⣵   ೵  ...   婞   ␓   关         ⣵   콜   ĝ   ૮   关      䤵   ᠂   ૮
2     婞   婞   ૮   䆜   ᾤ   뺜   Ǻ   㐳   ᇙ   Й   Ǻ   콜   ⹓   ĝ  ...   ೵   Ï   禌   ૮   ૮   ག   ƀ   콜   ⹓   湡   ޘ   Ï   㪞   禌
3     ⣵   ᇙ   ␓   뺜   Ï         Ï   鉨   㪞   콜   ꀮ   뺜   禌  ...   薌   ⣵   湡   婞   婞   㐳   Ǻ   ᇙ   d   d   ـ   ೵   鉨   ᓂ
4     ೵   द   ޘ   ⹓   ƀ   薌   ԗ   ག   ԗ   ૮   ԗ   ᮠ   ꀮ   䆜  ...   ޘ   ᇙ   㐳   ʏ   ᮠ   ⹓   搀   d   婞   ԗ   禌   Й   콜   ʏ
..   ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ...  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..
995   Ǻ   ĝ   껦   禌   ĝ   搀   ƀ   薌   ᇙ   ĝ   뺜   ᾤ   ૮   ᇙ  ...   䤵   द   ૮   द   ꀮ   婞   䤵   ĝ   ␓   薌   ␓   ƀ   ̓   鉨
996   婞   ૮   Й   薌   ـ   ག   湡   䤵   ␓   뺜   ␓   ޘ   婞   ꀮ  ...   뺜   ⣵   䤵      뺜   ೵   ೵      ⹓   ĝ   薌      ̓   d
997   䤵   Ï   ƀ   Й      ƀ   䤵   ƀ   ᓂ   ⣵      ᮠ   䤵   ૮  ...   䤵   Й   Ï   द   鉨   㪞   ⹓   ␓   关   Ǻ      ĝ   ᇙ   द
998   ʏ   ⹓   d   द   d   㪞      ꀮ   d   薌   薌   ᠂   ƀ   ̓  ...   ᮠ   ᓂ   ĝ   ག   䤵   㐳   ʏ   ⣵   㐳   ʏ   ᇙ   搀   ᓂ   ̓
999   ޘ      䤵   Ǻ   껦   ̓   ԗ   ĝ   ƀ   ꀮ   㐳   湡   搀   ⹓  ...   ޘ   搀   䤵   湡   ૮   鉨   ޘ   Ï   㐳   ƀ      禌   㪞   ـ

[1000 rows x 1000 columns]
>>> 
U13-Forward