I am currently performing a row-wise calculation on a pd.DataFrame using df.apply(foo, axis=1), where foo is effectively as follows:
def foo(row):
    n = row['A']
    d = row['B']
    if n <= 0:
        return 0
    if d <= 0:
        return 100
    return n / d * 100
This seems to be begging to be simplified into an np.where.
I have other cases with only one if statement (i.e. if n <= 0), which I have already simplified into
np.where(df['A'] <= 0, 0, df['A'] / df['B'])
However, I can't see how to do the same with the double-if case, at least not elegantly. I could do
np.where(df['A'] <= 0, 0, np.where(df['B'] <= 0, 100, df['A'] / df['B'] * 100))
But this would seem to run through the entire dataframe twice, once for each np.where call.
Is there a better way of doing things? Or, alternatively, is the vectorization that np.where brings so beneficial that running through the table twice with np.where is still better than running through it only once with df.apply?
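For reference, a minimal self-contained sketch for comparing the approaches. The sample data and the names rowwise, ratio, nested and selected are made up for illustration. np.select is included only as one possible way to avoid the nesting, not as a claim that it is faster, since it also evaluates every condition over the whole column.

```python
import numpy as np
import pandas as pd


def foo(row):
    n = row['A']
    d = row['B']
    if n <= 0:
        return 0
    if d <= 0:
        return 100
    return n / d * 100


# Made-up sample data covering each branch: n <= 0, d <= 0, and the normal case.
df = pd.DataFrame({'A': [-1.0, 0.0, 5.0, 5.0, 3.0],
                   'B': [2.0, 0.0, 0.0, -4.0, 4.0]})

# Row-wise version: one Python-level call to foo per row.
rowwise = df.apply(foo, axis=1)

# The ratio is computed for every row up front, so it can hold inf/NaN
# where B == 0; those entries are never selected below.
ratio = df['A'] / df['B'] * 100

# Nested np.where: the outer condition takes precedence, like the first if.
nested = np.where(df['A'] <= 0, 0, np.where(df['B'] <= 0, 100, ratio))

# np.select picks the first condition that matches, mirroring the if order.
selected = np.select([df['A'] <= 0, df['B'] <= 0], [0, 100], default=ratio)

print(np.allclose(nested, rowwise), np.allclose(selected, rowwise))
```

Timing each variant with timeit on a realistically sized frame would show whether the extra pass matters in practice.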