df.iloc[0:1,:].apply(func, axis=1, args=(x,y,z)) executes func() 2 times

Question

I have a dataframe df that has thousands of rows.

For each row I want to apply function func.

As a test, I wanted to run func on only the first row of df, so I placed a print statement in func(). The print statement ran 2 times even though I sliced df down to one row (the displayed slice has an extra line at the top, but that is the column labels, not a data row).

When I do the following

df[0:1].apply(func, axis=1, args=(x, y, z))

or

df.iloc[0:1,:].apply(func, axis=1, args=(x, y, z))

the print statement runs 2 times, which means func() was executed twice.

Any idea why this is happening?

Possible duplicate of [Why does pandas apply calculate twice](http://stackoverflow.com/questions/21635915/why-does-pandas-apply-calculate-twice) — root, Apr 14 '16 at 18:18

score 0 · Accepted Answer · answered Apr 14 '16 at 17:57

The documentation for DataFrame.apply says:

In the current implementation apply calls func twice on the first column/row to decide whether it can take a fast or slow code path.
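The quoted behavior can be checked by counting calls instead of relying on print. The following is a minimal sketch, not the asker's actual code: `func`, the `calls` list, the sample data, and the values passed for x, y, z are all assumptions made for illustration. It also shows extra arguments being passed through `args=(...)`, since bare positional arguments after `axis=1` are a syntax error in Python. As far as I know, pandas 1.1 and later evaluate the first row only once, so the count printed here may be 1 or 2 depending on the installed version.

```python
import pandas as pd

# Small sample frame (illustrative data, not the asker's df).
df = pd.DataFrame({"a": [9, 4, 1], "b": [5, 7, 3], "c": [4, 2, 7]})

calls = []  # records the index label of every row func receives


def func(row, x, y, z):
    # Hypothetical stand-in for the question's func; x, y, z are extra arguments.
    calls.append(row.name)
    return row["a"] * x + row["b"] * y + row["c"] * z


# Extra arguments are forwarded to func through args=(...).
result = df.iloc[0:1, :].apply(func, axis=1, args=(1, 10, 100))

print(result)
# Number of times func ran for the single row: version dependent (1 or 2).
print(len(calls))
```

If func has side effects (printing, appending to a list, writing a file), a repeated first call repeats those side effects, but the returned result still contains one value per row.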

answered Apr 14 '16 at 17:57

stellasia


Ah, okay, so func is called 2 times only for the first column/row (in my case, the first row). – codingknob Apr 14 '16 at 17:59

MaxU - stand with Ukraine · Answer 2 · 2016-04-14T18:13:56.150

0

Pay attention to the different slicing techniques:

In [134]: df
Out[134]:
   a  b  c
0  9  5  4
1  4  7  2
2  1  3  7
3  6  3  2
4  4  5  2

In [135]: df.iloc[0:1]
Out[135]:
   a  b  c
0  9  5  4

In [136]: df.loc[0:1]
Out[136]:
   a  b  c
0  9  5  4
1  4  7  2
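The difference between the two slices above is that iloc slices by position and excludes the stop value, while loc slices by label and includes it. A small sketch that checks the row counts, using the same sample values as the frame shown above (the variable names are only illustrative):

```python
import pandas as pd

# Same values as the df displayed above, with the default 0..4 integer index.
df = pd.DataFrame({
    "a": [9, 4, 1, 6, 4],
    "b": [5, 7, 3, 3, 5],
    "c": [4, 2, 7, 2, 2],
})

by_position = df.iloc[0:1]  # positions 0 up to, but not including, 1
by_label = df.loc[0:1]      # labels 0 through 1, both endpoints included
plain_slice = df[0:1]       # an integer slice inside [] selects rows by position

print(len(by_position), len(by_label), len(plain_slice))  # prints: 1 2 1
```

So with a default integer index, df[0:1] and df.iloc[0:1, :] both select a single row, and only df.loc[0:1] selects two.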

With printing:

Print one row as a Series:

In [139]: df[0:1].apply(lambda r: print(r), axis=1)
a    9
b    5
c    4
Name: 0, dtype: int32
Out[139]:
0    None
dtype: object

Or using iloc:

In [144]: df.iloc[0:1, :].apply(lambda r: print(r), axis=1)
a    9
b    5
c    4
Name: 0, dtype: int32
Out[144]:
0    None
dtype: object

Print two rows (two Series):

In [140]: df.loc[0:1].apply(lambda r: print(r), axis=1)
a    9
b    5
c    4
Name: 0, dtype: int32
a    4
b    7
c    2
Name: 1, dtype: int32
Out[140]:
0    None
1    None
dtype: object

OP:

"the print statement was run 2 times even though I am slicing df to one row"

Actually, you were slicing it into two rows.

edited Apr 14 '16 at 18:13

answered Apr 14 '16 at 18:00

MaxU - stand with Ukraine


I'm not running df.loc[0:1]. I'm running df[0:1], which is the same as df.loc[0:1,:] – codingknob Apr 14 '16 at 18:01
Oops. I made a mistake in my OP. It's df.iloc[0:1,:], not df.loc[0:1,:] – codingknob Apr 14 '16 at 18:09
@codingknob, both `df[0:1].apply()` and `df.iloc[0:1, :].apply()` should work properly, as I described in my answer – MaxU - stand with Ukraine Apr 14 '16 at 18:12
Strange. For me, the result of df[0:1] and df.iloc[0:1,:] is the first row of df as a DataFrame object, not a Series. – codingknob Apr 14 '16 at 18:18
@codingknob, that's why you should always provide a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) when asking a question – MaxU - stand with Ukraine Apr 14 '16 at 18:20
