I have a data set that tracks daily revenue by id, category and date:

id       cat        date     daily_rev
111       A        3/09/19     $10
111       A        3/10/19     $15
111       A        3/11/19     $40
222       A        3/09/19     $100
222       A        3/10/19     $150
222       A        3/11/19     $50
333       B        3/09/19     $45
333       B        3/10/19     $10
333       B        3/11/19     $30

I want to manipulate the data to sum across all dates by category:

cat     tot_daily_rev
 A          $365
 B          $85

When I use this code:

X <- data %>% group_by(cat) %>% mutate(tot_daily_rev = sum(daily_rev))

I get a data frame that has a tot_daily_rev column that is a sum of every row in the data set:

id       cat        date     daily_rev     tot_daily_rev
111       A        3/09/19     $10              $450
111       A        3/10/19     $15              $450
111       A        3/11/19     $40              $450
222       A        3/09/19     $100             $450
222       A        3/10/19     $150             $450
222       A        3/11/19     $50              $450
333       B        3/09/19     $45              $450
333       B        3/10/19     $10              $450
333       B        3/11/19     $30              $450

I've already looked at the post "How to sum a variable by group?", but it does not solve my issue.

--

Update

The post "Why does summarize or mutate not work with group_by when I load `plyr` after `dplyr`?" addresses the same issue! I was completely unaware that the problem came from two packages defining functions with the same name, so I didn't think to search for why `summarize` and `mutate` were not behaving as I expected.
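
For anyone who lands here with the same symptom, here is a minimal sketch of the check and the workaround. It assumes `daily_rev` is stored as a number (without the "$"), and the data frame is rebuilt by hand from the table above, so adjust it to your own data.

```r
library(dplyr)

# Data rebuilt from the question; daily_rev is assumed to be numeric (no "$").
data <- data.frame(
  id = rep(c(111, 222, 333), each = 3),
  cat = rep(c("A", "A", "B"), each = 3),
  date = rep(c("3/09/19", "3/10/19", "3/11/19"), times = 3),
  daily_rev = c(10, 15, 40, 100, 150, 50, 45, 10, 30)
)

# Which package supplies the unqualified name `mutate` right now?
# If plyr was loaded after dplyr, this reports "plyr" instead of "dplyr".
environmentName(environment(mutate))

# Qualifying each call with dplyr:: sidesteps the masking,
# whatever order the packages were loaded in.
X <- data %>%
  dplyr::group_by(cat) %>%
  dplyr::summarize(tot_daily_rev = sum(daily_rev))

print(X)
```

With the calls qualified this way, `X` has one row per category, with the totals 365 for A and 85 for B.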

Mercy
  • Use `summarize()` instead of `mutate()`. – jdobres Mar 12 '19 at 17:08
  • Is there any chance that you've also loaded the `plyr` package? If you have, its version of `mutate` does not respect the groups created by `group_by`. Specify `dplyr::mutate` or `dplyr::summarize` in your code to be sure the right version is being used. – jdobres Mar 12 '19 at 17:10
  • thanks @jdobres, i forgot to mention that i've tried `summarize()`. when i use `summarize()`, it returns a 1 x 1 data frame with the value $450. – Mercy Mar 12 '19 at 17:11
  • Again, be sure that you're calling dplyr's version of `mutate` with `dplyr::mutate`. – jdobres Mar 12 '19 at 17:12
  • @jdobres thank you! i didn't know that about the difference between plyr::mutate and dplyr::mutate. completely solved my issue!! – Mercy Mar 12 '19 at 17:23

1 Answer

It's not `mutate` when you're using `group_by`. After you've used `group_by`, the data is a grouped object, so you have to use `summarize`:

X <- data %>% 
  group_by(cat) %>%
  summarize(tot_daily_rev = sum(daily_rev))
Matt W.
  • This is incorrect, `summarize` is used if you want one row per group. `mutate` is used if you want to keep the original number of rows. In both cases, operations will still be done "by group". (It does look like OP wants `summarize`, not `mutate`, but there is a time and place for `mutate`, and it still works just fine after `group_by`). – Gregor Thomas Mar 12 '19 at 17:30
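
The distinction drawn in the comment above can be sketched as follows. This is only an illustration: the data frame is rebuilt from the question's table, `daily_rev` is assumed to be numeric, and the names `per_row` and `per_group` are made up for the example.

```r
library(dplyr)

# Data rebuilt from the question; daily_rev is assumed to be numeric (no "$").
data <- data.frame(
  id = rep(c(111, 222, 333), each = 3),
  cat = rep(c("A", "A", "B"), each = 3),
  date = rep(c("3/09/19", "3/10/19", "3/11/19"), times = 3),
  daily_rev = c(10, 15, 40, 100, 150, 50, 45, 10, 30)
)

# mutate after group_by: keeps all 9 rows and attaches the group total to each.
per_row <- data %>%
  dplyr::group_by(cat) %>%
  dplyr::mutate(tot_daily_rev = sum(daily_rev))

# summarize after group_by: collapses to one row per group.
per_group <- data %>%
  dplyr::group_by(cat) %>%
  dplyr::summarize(tot_daily_rev = sum(daily_rev))

nrow(per_row)    # same number of rows as the input
nrow(per_group)  # one row per category
```

Both calls compute the sum by group; they differ only in how many rows come back, which is why `summarize` matches the two-row output the question asks for.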