I have been stuck on this problem for 3 hours. I have a DataFrame with two columns: p = product, order = number of sales.
I don't have the release date of each product, so I assume that the release date is the first date with some sales.
Here is my dataframe:
p order
A 0
A 0
A 1
A 1
A 2
B 0
B 0
B 1
B 1
This is what I would like: a d_s_r (days since release) column that starts counting at the first row with sales and then increments by one on each row.
p order d_s_r
A 0 0
A 0 0
A 1 1
A 1 2
A 2 3
B 0 0
B 0 0
B 1 1
B 1 2
What would you recommend? I tried:

    # Loop over each product, keeping only the rows that have sales
    for i, row in data[data.order > 0].groupby('p'):
        list_rows = row.index.tolist()
        # Number those rows 1, 2, 3, ... one cell at a time
        for m, k in enumerate(list_rows):
            data.loc[k, 'd_s_r'] = m + 1

This seems to work, but it takes too much time.
I'm sure there is an easy way, but I can't find it. Thanks in advance.
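For reference, a minimal sketch of what a vectorized equivalent of the loop above might look like, assuming the frame is called data and has the columns p and order shown earlier. It uses groupby(...).cumcount() to number the rows with sales inside each product. Like the loop, it only numbers rows where order > 0, so a row with no sales after the release would stay at 0.

```python
import pandas as pd

# Sample frame from the question: p = product, order = number of sales
data = pd.DataFrame({
    'p': list('AAAAABBBB'),
    'order': [0, 0, 1, 1, 2, 0, 0, 1, 1],
})

# Rows that have at least one sale
has_sale = data['order'] > 0

# Rows without sales keep a counter of 0
data['d_s_r'] = 0

# Number the rows with sales 1, 2, 3, ... within each product
data.loc[has_sale, 'd_s_r'] = data[has_sale].groupby('p').cumcount() + 1

print(data)
```

This avoids the per-cell data.loc assignment, which is probably what makes the loop slow on a large frame.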
Edit :
Here's my df :
    df = pd.DataFrame(
        [['A', 0, 0], ['A', 0, 0], ['A', 12, 1], ['A', 23, 5], ['A', 25, 7],
         ['B', 0, 0], ['B', 2, 0], ['B', 8, 5], ['B', 15, 12], ['B', 0, 3],
         ['B', 0, 3], ['B', 5, 4]],
        columns=['prod', 'order', 'order_2'])

With

    df['d_s_r'] = df.groupby('prod')['order'].transform(lambda x: x.cumsum().factorize()[0])

I get:
prod order order_2 d_s_r
0 A 0 0 0
1 A 0 0 0
2 A 12 1 1
3 A 23 5 2
4 A 25 7 3
5 B 0 0 0
6 B 2 0 1
7 B 8 5 2
8 B 15 12 3
9 B 0 3 3
10 B 0 3 3
11 B 5 4 4
whereas I would like:
prod order order_2 d_s_r
0 A 0 0 0
1 A 0 0 0
2 A 12 1 1
3 A 23 5 2
4 A 25 7 3
5 B 0 0 0
6 B 2 0 1
7 B 8 5 2
8 B 15 12 3
9 B 0 3 4
10 B 0 3 5
11 B 5 4 6
I generally have 0's at the beginning of each group of groupby('prod'), but a product could also start directly with actual values. I can also have a day with 0 orders after the release (which stops the counter from advancing in the output above), but I still want the counter to keep counting the days since the product's release date.
I actually managed to get my results by adding a dummy column containing only 1 and doing df[df.order > 0].groupby('p').cumsum(), but I don't think it's a very satisfying solution.
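To make the rule I described concrete (counter at 0 before the first sale, then +1 on every row, including days with 0 orders), here is a rough sketch of one way it might be written. The names has_sale and released are only illustrative, and I am assuming the rows are already sorted by date within each product.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 0, 0], ['A', 0, 0], ['A', 12, 1], ['A', 23, 5], ['A', 25, 7],
     ['B', 0, 0], ['B', 2, 0], ['B', 8, 5], ['B', 15, 12], ['B', 0, 3],
     ['B', 0, 3], ['B', 5, 4]],
    columns=['prod', 'order', 'order_2'],
)

# 1 on rows with at least one sale, 0 otherwise
has_sale = df['order'].gt(0).astype(int)

# Running maximum per product: 0 before the first sale, 1 from then on,
# even on later rows where order is 0
released = has_sale.groupby(df['prod']).cummax()

# Running sum per product: counts the rows since (and including) the release
df['d_s_r'] = released.groupby(df['prod']).cumsum()

print(df)
```

On the sample frame this should reproduce the d_s_r column I am looking for, with product B continuing to 4, 5, 6 through the zero-order rows.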