I am trying to do something like an SQL window function with pandas in Python 3.6. I adapted the following code, which I found here, but it raises this error:
"ValueError: cannot reindex from a duplicate axis"
import pandas as pd

df = pd.DataFrame({'id': ['daeb21718d5a', 'daeb21718d5a', 'daeb21718d5a'],
                   'product_id': [123, 456, 789],
                   'probability': [0.076838, 0.053384, 0.843900]})

# Rank rows within each id, highest probability first
df['rank'] = df.sort_values(['probability'], ascending=False) \
               .groupby(['id']) \
               .cumcount() + 1
Weirdly, if I add .reset_index(drop=True) before grouping, the error goes away.
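
A minimal sketch of one possible cause, offered as a guess rather than a diagnosis: the sample frame above has a unique 0..2 index, so the error would more plausibly come from a real DataFrame whose index contains repeated labels. The duplicate index [0, 0, 1] below is a hypothetical stand-in for that situation. The sketch also swaps in groupby(...).rank(method="first", ascending=False) for the sort_values plus cumcount approach, because its result keeps the original row order and so can be assigned back without reindexing.

```python
import pandas as pd

# Same data as in the question, but with a hypothetical duplicate index
# to mimic a frame built by filtering or concatenating other frames.
df = pd.DataFrame(
    {
        "id": ["daeb21718d5a"] * 3,
        "product_id": [123, 456, 789],
        "probability": [0.076838, 0.053384, 0.843900],
    },
    index=[0, 0, 1],  # assumption: repeated labels, not in the original sample
)

# Quick check for repeated index labels.
has_duplicate_labels = not df.index.is_unique

# Rank within each id, highest probability first. The result has the same
# index, in the same order, as df, so the assignment needs no realignment.
df["rank"] = (
    df.groupby("id")["probability"]
    .rank(method="first", ascending=False)
    .astype(int)
)

ranks = df["rank"].tolist()
```

If the index labels carry no meaning, calling df = df.reset_index(drop=True) once, before sorting or ranking, is another option. Resetting the index after sort_values may silence the error but can attach ranks to the wrong rows, since the new labels then follow the sorted order rather than the original one.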