I have a dataframe with sales. I need to aggregate the df by 2 columns ProductID
and Day
and sum the values of each aggregated group from a different column Amount
so that it now shows the total. I wish to keep the other columns that can be grouped as well (same values across rows), in this case just Product
. The final column Store
won't be kept, since values can vary within grouped rows. However, I need to add a column UniqueStores
, which counts the amount of unique stores for each group of same ProductID and Day. For example, the first group with ID=1 and Day= Monday would have 1 unique store "N", so value would be 1.
I tried drafting the table here in text but I couldn't format it correctly, so here as an image of how it looks before aggregating:
I've tried aggregating with both group_by + summarise and df[,sum,by] but they don't keep the variables that aren't given as indexes. Is there a workaround without having to manually insert every column that shall remain?
Thanks in advance and I hope I made myself clear.
Input values:
df <- data.frame("ProductID" = c(1,1,1,1,2,2,2,2), "Day"=c("Monday","Monday", "Tuesday", "Tuesday","Wednesday", "Wednesday", "Friday", "Friday"), "Amount"=c(5,5,3,7,6,9,5,2), "Product"=c("Food","Food","Food","Food","Toys","Toys","Toys","Toys"), "Store"=c("N","N","W","N", "S","W", "S","S"))