I'm relatively new to using Pandas, I have a dataframe that looks like the below, my goal is to replicate the "desired_output" column:
+---------+--------+-------------+-------+------------+----------------+
| Main ID | Sub ID | Senior Flag | order | Dollar Amt | desired_output |
+---------+--------+-------------+-------+------------+----------------+
| 1 | A | Y | 1 | 10 | 8 |
| 1 | B | N | 2 | 5 | 3 |
| 1 | C | N | 3 | 3 | 0 |
| 2 | A-1 | Y | 1 | 15 | 13 |
| 2 | A-2 | Y | 2 | 10 | 13 |
| 2 | B | N | 3 | 7 | 6 |
| 2 | C | N | 4 | 6 | 0 |
+---------+--------+-------------+-------+------------+----------------+
The logic I would like to create in Python for the desired_output column is below:
1) Group by the Main IDs
2) For all of the Sub IDs check if the Senior Flag is Y, if it is only take the sum of the Sub IDs where the Senior Flag is N.
3) For the Sub IDs where the Senior Flag is N, the sum should be of anything where the order is greater than it's current order (e.g. for Main ID 2, The B Sub ID would take the value for C SubID since it's order is greater than B)
I'm sorry I know typically users would want a sample of the code, but the dataframe is much larger than this so for simplicity sake I'm only using the above table.
I appreciate all the Help!!