I want to create a new column that sums up the value column based on groupings of multiple columns. In this example I want to get the sum per ISIN, date and portfolio.
df = pd.DataFrame({"ISIN": ["IS123", "IS123", "UN123", "UN123", "FA123"],
"date": ["16", "16", "18", "18", "22"],
"portfolio": ["A", "A", "B", "A", "D"],
"value": [400, 300, 200, 600, 500]})
Here is the desired output. As you can see, only the first two rows "satisfies" the condition and both rows get the sum of 700. The others will keep their respective value.
df = pd.DataFrame({"ISIN": ["IS123", "IS123", "UN123", "UN123", "FA123"],
"date": ["16", "16", "18", "18", "22"],
"portfolio": ["A", "A", "B", "A", "D"],
"value": [400, 300, 200, 600, 500],
"Sum per ISIN, date and portfolio": [700, 700, 200, 600, 500]})
Here is what I have tried, but I am only able to make it work with grouping on one column, for example just ISIN.
df["Sum per ISIN, date and portfolio"] = df["value"].groupby(df["ISIN", "date", "portfolio"]).transform("sum")