I've got a DataFrame where I need to process a value on each row by passing it to an external function that I don't control, and I'd like to do it as fast as possible (an external API limits me to 20 requests per second):
import pandas as pd
d1 = {1: ['Test', 'Test1', 'Test2'], 2: ['file1', 'file2', 'file3'], 3: [pd.NA, pd.NA, pd.NA]}
df = pd.DataFrame(data=d1)
       1      2     3
0   Test  file1  <NA>
1  Test1  file2  <NA>
2  Test2  file3  <NA>
What would be the best way to send the values from column 2 to process_file(), save each returned value in column 3, and do this in parallel so it runs as fast as possible (keeping the rate limit in mind)? My first port of call would usually be asyncio, but since process_file() is not asyncio-enabled, I'm stuck.
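To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, using a thread pool instead of asyncio. It is only a sketch under assumptions: RateLimiter and run_rate_limited are names I made up for illustration, and I am assuming process_file() is a blocking call that is safe to call from several threads. The limiter spaces call starts roughly 1/20 s apart, so it should stay near the limit rather than strictly guarantee it.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimiter:
    """Hypothetical helper: hands out start slots spaced 1/max_per_second apart."""

    def __init__(self, max_per_second):
        self._interval = 1.0 / max_per_second
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self._lock:
            now = time.monotonic()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def run_rate_limited(func, values, max_per_second=20, max_workers=10):
    """Call func on every value in worker threads, returning results in input order."""
    limiter = RateLimiter(max_per_second)

    def call(value):
        limiter.wait()  # block until this call's slot comes up
        return func(value)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(call, values))
```

With the frame above, I would expect to use it as df[3] = run_rate_limited(process_file, df[2].tolist()), since the results come back in the same order as the rows.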
Anyone?