I wrote some Python code to assign decile ranks, like so:

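import math

ttl = df['total_mrk_vol'].sum()
result = []
for i in df['total_mrk_vol']:
    # share of total volume held by rows with volume <= i, scaled to 1..10
    x = math.ceil(10*df['total_mrk_vol'].loc[df['total_mrk_vol'] <= i].sum()/ttl)
    result.append(x)
# assign once, after the loop has filled the whole list
df['total_decile_rank'] = result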

On larger datasets this takes a long time to complete.

Is there a way to make this faster or more efficient?

  • How about using `threading` or `asyncio` modules? – gtj520 Oct 31 '22 at 15:10
  • 1st option: try using `for row in df.iterrows()` instead of `for i in df['total_mrk_vol']` -- it is much faster. 2nd option: do not iterate over the dataframe; use `.apply()` instead, [see this answer](https://stackoverflow.com/a/55557758/18406890) – artemonsh Oct 31 '22 at 15:12
  • [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – wwii Oct 31 '22 at 15:36
  • Does that do what you want it to? – wwii Oct 31 '22 at 15:38
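
The loop in the question rescans the whole column for every row, so the work grows quadratically with the number of rows. One possible vectorized sketch is below: sum each distinct value once, take a cumulative sum over the sorted values, and map the result back to the rows. The helper name `decile_rank` and the sample data are made up for illustration, and the sketch assumes the column has no NaN values and a non-zero total.

```python
import numpy as np
import pandas as pd


def decile_rank(values: pd.Series) -> pd.Series:
    """For each v, compute ceil(10 * sum(values <= v) / total).

    Hypothetical helper; assumes no NaN values and a non-zero total.
    """
    total = values.sum()
    # Sum each distinct value once. groupby sorts the keys ascending,
    # so the cumulative sum is the sum of all entries <= that value.
    cum_by_value = values.groupby(values).sum().cumsum()
    # Look up each row's value to bring the sums back to row order.
    scaled = 10 * values.map(cum_by_value) / total
    return np.ceil(scaled).astype(int)


# Made-up sample data, only to show the call.
df = pd.DataFrame({"total_mrk_vol": [4, 1, 3, 1, 11]})
df["total_decile_rank"] = decile_rank(df["total_mrk_vol"])
print(df["total_decile_rank"].tolist())  # [5, 1, 3, 1, 10]
```

Tied values share one cumulative sum, which matches the `<=` comparison in the original loop. With floating-point volumes the last digit of a sum can differ from the loop version because the additions happen in a different order, so a value sitting exactly on a decile boundary may land one rank off.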

0 Answers