If I have a function
def do_irreversible_thing(a, b):
    print(a, b)
And a dataframe, say
df = pd.DataFrame([(0, 1), (2, 3), (4, 5)], columns=['a', 'b'])
What's the best way to run the function exactly once for each row in a pandas dataframe? As pointed out in other questions, with something like df.apply, pandas will call the function twice for the first row. Even using numpy
np.vectorize(do_irreversible_thing)(df.a, df.b)
causes the function to be called twice on the first row, as will df.T.apply() or df.apply(..., axis=1).
Is there a faster or cleaner way to call the function with every row than this explicit loop?
for idx, a, b in df.itertuples():
    do_irreversible_thing(a, b)
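For reference, here is a minimal, self-contained sketch of that loop which records each call, so it is easy to check that every row is processed exactly once. The `calls` list is a hypothetical recorder added only for illustration, and the function body here appends to it instead of printing. Passing `index=False, name=None` to `itertuples` is one possible variation: it drops the index and yields plain tuples, so the row can be star-unpacked without a throwaway `idx` variable.

```python
import pandas as pd

calls = []  # hypothetical recorder, added only to count invocations


def do_irreversible_thing(a, b):
    # Stand-in for the real side effect: remember each (a, b) pair seen.
    calls.append((a, b))


df = pd.DataFrame([(0, 1), (2, 3), (4, 5)], columns=['a', 'b'])

# index=False omits the index from each row; name=None yields plain tuples
# instead of namedtuples, so each row unpacks directly into (a, b).
for row in df.itertuples(index=False, name=None):
    do_irreversible_thing(*row)

print(len(calls))  # number of calls made, one per row
```

This still loops in Python, so it is a sketch of the "cleaner" half of the question rather than a claim about speed.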