I have a dataset below where I need to subtract the premium to refund (Premium - Refund). From 2 duplicate (Product_Code) rows. I will get the premium from the row that has "New" Status and then the Refund will come from "Canceled" status. Please see my dataset below.
Product_Code <- c("1A","1B","1D","1A","1C","1D","1F","1G","1B","1H")
Status <- c("New", "New","New","Canceled","New","Canceled","New","New",
"Canceled", "New")
Premium <- c(1200,1500,2000,0,1000,0,1400,1600,0,1300)
Refund <- c(0,0,0,800,0,1500,0,0,900,0)
DataSet <- data.frame(Product_Code, Status, Premium, Refund).
> DataSet
Product_Code Status Premium Refund
1 1A New 1200 0
2 1B New 1500 0
3 1D New 2000 0
4 1A Canceled 0 800
5 1C New 1000 0
6 1D Canceled 0 1500
7 1F New 1400 0
8 1G New 1600 0
9 1B Canceled 0 900
10 1H New 1300 0
My desired outcome will be is to create a New Dataset. Where if the Product_Code has been duplicated (There is New and Canceled Status) the Premium from "New" Status will be subtracted by the Refund from the "Canceled" Status. And then collapsing to just one Product_Code removing the Canceled row. The new Premium will be Premium (from New Status) - Refund (from Canceled status). Please see desired output below.
> DataSet
Product_Code Status Premium Refund
1 1A New 400 0
2 1B New 600 0
3 1D New 500 0
4 1C New 1000 0
5 1F New 1400 0
6 1G New 1600 0
7 1H New 1300 0