Pandas Apply processing more rows than existing dataframe

Question

I have a very simple Pandas dataframe with shape (140, 10), but when I use:

df.apply(lambda row: somefunction(row, otherparameter), axis=1)

it processes the first row of the dataframe twice. To confirm this, I put a print inside somefunction that prints the row.

The only unusual thing I do inside somefunction is insert a record into a database (but the print comes before that instruction, so I believe the insert is unrelated). There is a reason I don't want to use the to_sql function, but it remains an option.

When I check the dataframe shape after the apply line, it is unchanged.

I would like to know the possible causes of this.

[This question has already been answered](https://stackoverflow.com/questions/31877909/pandas-function-dataframe-apply-runs-top-row-twice), but I would like to know any workaround to this behaviour — Carlos, Jan 30 '20 at 15:14

score 0 · Accepted Answer · answered Jan 30 '20 at 19:14

It will most likely run slower than an ordinary apply, but you can try iterrows(). Something like:

for ind, row in df.iterrows():
    somefunction(row, otherparameter)

The first result from iterrows is the index of the current row. If you don't need it, replace ind with _.
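To check that the loop calls the function exactly once per row, here is a minimal sketch. The dataframe contents, the `calls` list, and the body of `somefunction` are hypothetical stand-ins; the real function would do the database insert instead of appending to a list.

```python
import pandas as pd

# Hypothetical stand-in for the real function: it records each call
# instead of inserting into a database.
calls = []

def somefunction(row, otherparameter):
    calls.append((row["id"], otherparameter))

# Small made-up dataframe; the one in the question has shape (140, 10).
df = pd.DataFrame({"id": [1, 2, 3], "value": [10.0, 20.0, 30.0]})
otherparameter = "demo"

# iterrows yields (index, row) pairs; the index is ignored here.
for _, row in df.iterrows():
    somefunction(row, otherparameter)

# One recorded call per row, with no repeated first row.
print(len(calls) == len(df))
```

Note that iterrows builds each row as a Series, so in a dataframe with mixed integer and float columns the integer values arrive as floats.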

Valdi_Bo

I will mark this as the accepted answer, I'm a bit concerned about performance here but your solution works – Carlos Jan 31 '20 at 16:46
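If performance becomes a problem, itertuples() is generally faster than iterrows() because it yields lightweight namedtuples rather than building a Series for each row. A minimal sketch, assuming somefunction can read fields by attribute (the dataframe, the `seen` list, and the function body are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for the real function; fields are read by
# attribute because itertuples yields namedtuples, not Series.
seen = []

def somefunction(row, otherparameter):
    seen.append((row.id, row.value, otherparameter))

# Small made-up dataframe for illustration.
df = pd.DataFrame({"id": [1, 2, 3], "value": [10.0, 20.0, 30.0]})

# index=False drops the index from each tuple.
for row in df.itertuples(index=False):
    somefunction(row, "demo")

print(len(seen) == len(df))
```

Column names that are not valid Python identifiers get replaced with positional names, so this fits best when the columns have simple names.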
