Python Dataframe iterrows speed

Question

I'm sorting through stock transactions and learning Python at the same time. I've read that iterrows isn't always the best choice, but I struggle to see how to apply the alternatives to my particular situation.

Here's what I have. It works and is faster than what I used to do, but I think it's still slow. What's the fastest way to do this?

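import re
import pandas as pd

data_list = [
        ['Dividend'],
        ['Reinvestment'],
        ['Sell'],
        ['Withholding']
]
df = pd.DataFrame(data_list, columns=['buy/sell'])

buysell_list = [
    ['Dividend','Div'],
    ['Reinvestment','Div']
]
sort = pd.DataFrame(buysell_list, columns=['0', '1'])

# For each (pattern, label) rule, relabel every row whose text contains the pattern (case-insensitive).
for _, row in sort.iterrows():
    # Columns are named '0' and '1' (strings), so access them by label rather than by position.
    df.loc[df['buy/sell'].str.contains(row['0'], flags=re.IGNORECASE), 'buy/sell'] = row['1']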

The result should look like this:
      buy/sell
0          Div
1          Div
2         Sell
3  Withholding

Thanks for any tips!

Since `sort` has only 2 rows, the poor performance of `.iterrows()` won't matter. Or is `sort` actually larger? If so, it would be good to know a bit more about its content. — Timus, Mar 29 '22 at 19:08
It will be larger, but not by much: say, a maximum of 20 rows. The df itself can be larger, though. Will that affect timing? I should add that although my code works well, I don't like it because I find it hard to read. I can also see myself using this in other areas, and since I'm trying to learn Python I want to develop good habits and use the most appropriate, cleanest method. — noone, Mar 29 '22 at 22:06
You could try `df['buy/sell'] = df['buy/sell'].replace(dict(zip(sort['0'], sort['1'])))`. I think there was an answer that suggested something similar, but it now seems to be deleted. — Timus, Mar 29 '22 at 22:25
Timus, Thanks for the suggestion, it's easier to read, and from my limited testing, it also seems to be faster than what I was using. — noone, Mar 31 '22 at 16:58
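
For reference, here is a minimal sketch of the `replace` suggestion from the comments, run against the sample data in the question. One caveat: a dict passed to `replace` matches whole cell values exactly and case-sensitively, whereas the original loop matches case-insensitive substrings. The helper `relabel` below is a hypothetical name (not a pandas function) that keeps the original matching behaviour while looping over the rule columns with `zip` instead of `iterrows`. Since the loop runs once per rule (about 20 at most), the per-rule `str.contains` pass over `df` is likely to dominate the runtime, not the iteration over `sort`.

```python
import re
import pandas as pd

df = pd.DataFrame({'buy/sell': ['Dividend', 'Reinvestment', 'Sell', 'Withholding']})
sort = pd.DataFrame([['Dividend', 'Div'], ['Reinvestment', 'Div']], columns=['0', '1'])

# Variant 1: exact, case-sensitive match on the whole cell value.
mapping = dict(zip(sort['0'], sort['1']))
exact = df['buy/sell'].replace(mapping)
print(exact.tolist())  # ['Div', 'Div', 'Sell', 'Withholding']


# Variant 2 (hypothetical helper): case-insensitive substring match,
# applied rule by rule in order, like the loop in the question.
def relabel(series, rules):
    result = series.copy()
    for pattern, label in zip(rules['0'], rules['1']):
        # na=False treats missing values as "no match".
        hit = result.str.contains(pattern, flags=re.IGNORECASE, na=False)
        result = result.mask(hit, label)
    return result


print(relabel(df['buy/sell'], sort).tolist())  # ['Div', 'Div', 'Sell', 'Withholding']
```

If the real data contains values such as "dividend paid" or "DIVIDEND", only the second variant relabels them. The first variant leaves them unchanged because they are not exact matches.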

0 Answers