Suppose I have the following DataFrame:
id  createdId  updatedId  ownerId  value
1   50         50         10       105
2   51         50         10       240
3   52         50         10       420
4   53         53         10       470
5   40         40         11       320
6   41         40         11       18
7   55         55         12       50
8   57         55         12       412
9   59         55         12       398
I am trying to sum the column 'value' into a new column 'output', but ONLY across rows where ownerId is the same AND updatedId is less than or equal to createdId.
In my example, the output should be the DataFrame below:
id  createdId  updatedId  ownerId  value  output
1   50         50         10       105    105
2   51         50         10       240    345     # Add to the previous
3   52         50         10       420    765     # Add to the previous
4   53         53         10       470    1235    # Add to the previous
5   40         40         11       320    320     # Reset because Owner is different
6   41         40         11       18     338
7   55         55         12       50     50
8   57         55         12       412    462
9   59         55         12       398    860
I tried to do:
df['output'] = df[['value']].sum(axis=1).where(df['createdId'] > df['updatedId'], 0)
But this does not include the owner check, and it does not seem to sum anything.
I am new to pandas. Could you please show me how you would do this?
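For reference, here is a minimal sketch of one way to get the first expected output above (the version before EDIT 1), assuming the rows are already in the order the running total should follow. It also shows why the attempt appears to sum nothing: sum(axis=1) over a single column adds across columns within each row, so it returns 'value' unchanged. The names row_wise and eligible are illustrative only, and using a per-owner cumulative sum is an assumption about what is wanted, not a confirmed answer.

```python
import pandas as pd

df = pd.DataFrame({
    "id":        [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "createdId": [50, 51, 52, 53, 40, 41, 55, 57, 59],
    "updatedId": [50, 50, 50, 53, 40, 40, 55, 55, 55],
    "ownerId":   [10, 10, 10, 10, 11, 11, 12, 12, 12],
    "value":     [105, 240, 420, 470, 320, 18, 50, 412, 398],
})

# What the attempt computes before .where(): a row-wise sum over ONE column,
# which is just the 'value' column again (nothing is accumulated down the rows).
row_wise = df[["value"]].sum(axis=1)

# Rows that fail the condition contribute 0 to the running total
# (in this sample every row satisfies updatedId <= createdId).
eligible = df["value"].where(df["updatedId"] <= df["createdId"], 0)

# Running total that restarts whenever ownerId changes.
df["output"] = eligible.groupby(df["ownerId"]).cumsum()

print(df)
```

Grouping by ownerId is what provides the "reset because Owner is different" behaviour; cumsum then accumulates down the rows inside each group.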
EDIT 1:
I am trying to sum every 'value' that falls in the range [updatedId, createdId] into a new column 'output', and only when ownerId is the same.
Expected output:
id  createdId  updatedId  ownerId  value  output
1   50         50         10       105    105
2   51         50         10       240    345     # Add to the previous
3   52         50         10       420    765     # Add to the previous
4   53         53         10       470    470     # Reset because no other value between 53 and 53
5   40         40         11       320    320     # Reset because Owner is different
6   41         40         11       18     338
7   55         55         12       50     50
8   57         55         12       412    462
9   59         55         12       398    860
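One possible reading of EDIT 1 is: for each row, add up the 'value' of every row with the same ownerId whose createdId lies in that row's inclusive range [updatedId, createdId]. The sketch below implements that reading with a self-merge on ownerId followed by a filter. It assumes 'id' is unique, and the names pairs and in_range and the "_other" suffix are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "id":        [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "createdId": [50, 51, 52, 53, 40, 41, 55, 57, 59],
    "updatedId": [50, 50, 50, 53, 40, 40, 55, 55, 55],
    "ownerId":   [10, 10, 10, 10, 11, 11, 12, 12, 12],
    "value":     [105, 240, 420, 470, 320, 18, 50, 412, 398],
})

# Pair every row with every row of the same owner (including itself).
# Columns coming from the right-hand side get the "_other" suffix.
pairs = df.merge(
    df[["ownerId", "createdId", "value"]],
    on="ownerId",
    suffixes=("", "_other"),
)

# Keep only pairs where the other row's createdId is inside this row's
# inclusive range [updatedId, createdId].
in_range = pairs["createdId_other"].between(pairs["updatedId"], pairs["createdId"])

# Sum the matching values per original row, then put them back in df's row order.
sums = pairs[in_range].groupby("id")["value_other"].sum()
df["output"] = sums.reindex(df["id"], fill_value=0).to_numpy()

print(df)
```

The self-merge grows quadratically with the number of rows per owner, so this is only a sketch for small groups. On this sample, a cumulative sum grouped by both ownerId and updatedId happens to give the same numbers, but the two are not equivalent in general.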