1

I searched a lot for this answer, but I couldn't find anything that would help me, so here it goes:

I need to change the value on a certain group of rows based on the value from another group. My dataframe is organized like this:

df <- data.frame(group = c("Apple Juice", "Apple Juice", "Apple Juice", "Bottle", "Bottle", "Bottle"), month = c(1,2,3,1,2,3), cost = c(10,10,10,4,4,4))
group month cost
Apple Juice 1 10
Apple Juice 2 10
Apple Juice 3 10
Bottle 1 4
Bottle 2 4
Bottle 3 4

In which the cost of the bottle is a part of the apple juice's cost, but I need to take this value off for the second part of the job, while keeping the same structure:

group month cost
Apple Juice 1 6
Apple Juice 2 6
Apple Juice 3 6
Bottle 1 4
Bottle 2 4
Bottle 3 4

Usually I do this with dplyr and it used to work:

df <- df %>%
  group_by(group , month) %>%
  dplyr::summarise(cost = -cost[group == 'Bottle'], group = 'Apple Juice') %>%
  ungroup()

But for some reason when I'm running this code I'm getting this result:

group month cost
Apple Juice 1 -4
Apple Juice 2 -4
Apple Juice 3 -4

How can I fix this?

Drahal
  • 37
  • 3

2 Answers2

1

We could only group by 'month' and get the difference

library(dplyr)
df %>%
   group_by(month) %>%
   mutate(cost = replace(cost, group != 'Bottle', diff(cost[2:1]))) %>% 
   ungroup

-output

# A tibble: 6 × 3
  group       month  cost
  <chr>       <dbl> <dbl>
1 Apple Juice     1     6
2 Apple Juice     2     6
3 Apple Juice     3     6
4 Bottle          1     4
5 Bottle          2     4
6 Bottle          3     4
akrun
  • 874,273
  • 37
  • 540
  • 662
  • 1
    This worked wonderfully to the dataframe I had, thank you very much! I'm having some trouble trying to adapt to other dataframes with more groups and columns, but the other code I was using is working for those cases. – Drahal Sep 13 '22 at 18:42
  • 1
    @Drahal I was using `diff` as I find only two groups. If you are using a default group as 'Bottle', then instead of `diff` can subset the `cost` where `group == 'Bottle'` and use `-` (depends on the number of groups) – akrun Sep 13 '22 at 18:43
1

The solution from akrun is a little more elegant but throwing my hat in the ring.

  df%>%
  pivot_wider(id_col = month,
              names_from = group,
              values_from = cost)%>%
  mutate(`Apple Juice` = `Apple Juice` - Bottle)%>%
  gather(key = 'group',
         value = 'cost',
         -c("month")) -> df_transformed
Araph7
  • 71
  • 4