
If I have

year       veg  number
2017 aubergine       3
2017    tomato      13
2017  eggplant       4
2018 aubergine       1
2018    tomato      17
2018  eggplant       3

how can I sum the data for aubergine and eggplant for each year to get

year       veg  number
2017 aubergine       7
2017    tomato      13
2018 aubergine       4
2018    tomato      17

?

loris

2 Answers


You could overwrite eggplant with aubergine and then aggregate number by veg and year.

x  <- read.table(header=TRUE, text="year       veg  number
2017 aubergine       3
2017    tomato      13
2017  eggplant       4
2018 aubergine       1
2018    tomato      17
2018  eggplant       3")

x$vegb  <- x$veg
x$vegb[x$vegb == "eggplant"]  <- "aubergine"

aggregate(number ~ vegb + year, data=x, FUN=sum)
#       vegb year number
#1 aubergine 2017      7
#2    tomato 2017     13
#3 aubergine 2018      4
#4    tomato 2018     17
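
If more than one spelling has to be merged, the same idea can be written with a lookup vector. This is only a sketch: the name `synonyms` is made up for illustration, and `veg` is read as character (via `stringsAsFactors = FALSE`) so that values can be reassigned without worrying about factor levels.

```r
# Same data as above, with veg kept as character
x <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "year veg number
2017 aubergine 3
2017 tomato 13
2017 eggplant 4
2018 aubergine 1
2018 tomato 17
2018 eggplant 3")

# Hypothetical lookup: names are the variants, values are the canonical name
synonyms <- c(eggplant = "aubergine")

# Recode only the rows whose veg appears in the lookup
hit <- x$veg %in% names(synonyms)
x$veg[hit] <- unname(synonyms[x$veg[hit]])

# Sum number within each veg/year combination
result <- aggregate(number ~ veg + year, data = x, FUN = sum)
print(result)
```

Further variants can be handled by adding entries to `synonyms`; the aggregation step stays unchanged.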
GKi
  • Creating a new column is a nice idea and one I'll try to remember, although using `transform`, as in the second example of the other answer (https://stackoverflow.com/a/57772285/1409644), seems to me a bit more elegant in this case. – loris Sep 03 '19 at 13:29

One way would be to replace "eggplant" with "aubergine", then group by year and veg and take the sum of number.

library(dplyr)

# df holds the data from the question
df %>%
  mutate(veg = replace(veg, veg == "eggplant", "aubergine")) %>%
  group_by(year, veg) %>%
  summarise(number = sum(number))

#   year veg       number
#  <int> <fct>      <int>
#1  2017 aubergine      7
#2  2017 tomato        13
#3  2018 aubergine      4
#4  2018 tomato        17

In base R, the same can be done with transform and aggregate:

aggregate(number ~ year + veg,
          data = transform(df, veg = replace(veg, veg == "eggplant", "aubergine")),
          FUN = sum)
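
For a self-contained run, the base R version could look like the sketch below. The construction of `df` is an assumption (the answer does not show it), and `recoded` and `totals` are illustrative names. Converting `veg` with `as.character()` is an optional safeguard: assigning a value into a factor only works when that value is already one of its levels.

```r
# Assumed construction of the question's data; stringsAsFactors = TRUE
# reproduces the factor column seen in the dplyr output above
df <- data.frame(
  year   = rep(c(2017L, 2018L), each = 3),
  veg    = rep(c("aubergine", "tomato", "eggplant"), times = 2),
  number = c(3L, 13L, 4L, 1L, 17L, 3L),
  stringsAsFactors = TRUE
)

# as.character() avoids assigning into a factor, which would give NA
# (with a warning) if the replacement were not already a level
recoded <- transform(
  df,
  veg = replace(as.character(veg), veg == "eggplant", "aubergine")
)

# Sum number within each year/veg combination
totals <- aggregate(number ~ year + veg, data = recoded, FUN = sum)
print(totals)
```

With `year` listed first in the formula, the rows come out grouped by vegetable, with the years varying fastest.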
Ronak Shah
  • For my humble needs, the approach with `transform` and `aggregate` seems the most appropriate. The solution with `dplyr` is interesting, although I think it is unfortunate that the documentation for `mutate` at https://dplyr.tidyverse.org/reference/mutate.html uses concepts like `tibble` without providing a link. – loris Sep 03 '19 at 13:41