I have a very simple pandas DataFrame with shape (140, 10), but when I use:

df.apply(lambda row: somefunction(row, otherparameter), axis=1)

somefunction is called twice for the first row of the DataFrame. To confirm this, I put a print inside somefunction that prints the row.

The only unusual thing I do inside somefunction is insert a record into a database, but the print comes before that instruction, so I believe the insert is unrelated. I have reasons for not using the to_sql function, though it remains an option.

When I check the DataFrame's shape after the apply line, it is unchanged.

I would like to know the possible causes of this.

Carlos
  • [This question has already been answered](https://stackoverflow.com/questions/31877909/pandas-function-dataframe-apply-runs-top-row-twice), but I would like to know any workaround to this behaviour – Carlos Jan 30 '20 at 15:14

1 Answer

It will most likely be slower than a plain apply, but you can try iterrows(), which visits each row exactly once. Something like:

for ind, row in df.iterrows():
    somefunction(row, otherparameter)

The first result from iterrows is the index of the current row. If you don't need it, replace ind with _.
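
To check that each row is visited only once, here is a minimal self-contained sketch. The DataFrame contents, the value of otherparameter, and the body of somefunction are hypothetical stand-ins: the function only records the row label where the real one would insert into the database. As far as I know, the double call on the first row came from pandas releases before 1.1, where apply ran the function twice on the first row to choose a code path, so treat that as background rather than a guarantee for your version.

```python
import pandas as pd

# Hypothetical stand-ins for the names used in the question.
calls = []

def somefunction(row, otherparameter):
    # Record the row label here; the real function would insert into the database.
    calls.append((row.name, otherparameter))

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
otherparameter = "x"

# iterrows yields (index label, row as a Series) once per row.
for _, row in df.iterrows():
    somefunction(row, otherparameter)

print(len(calls), "calls for", len(df), "rows")
```

If performance becomes a problem, df.itertuples() is generally faster than iterrows(), but it yields namedtuples instead of Series, so somefunction would need to read fields by attribute.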

Valdi_Bo
  • I will mark this as the accepted answer, I'm a bit concerned about performance here but your solution works – Carlos Jan 31 '20 at 16:46