-1

I have the dataframe below.

year<-c(2016,2016,2017,2017,2016,2016,2017,2017)
city<-c("NY","NY","NY","NY","WS","WS","WS","WS")
spec<-c("df","df","df","df","vb","vb","vb","vb")
num<-c(45,67,89,90,45,67,89,90)
df<-data.frame(year,city,spec,num)

I would like to know if it is possible to sum the num based on year,city and spec columns in order to bring it from this form:

year city spec num
1 2016   NY   df  45
2 2016   NY   df  67
3 2017   NY   df  89
4 2017   NY   df  90
5 2016   WS   vb  45
6 2016   WS   vb  67
7 2017   WS   vb  89
8 2017   WS   vb  90

to this:

year city spec num
1 2016   NY   df 112
2 2017   NY   df 179
3 2016   WS   vb 112
4 2017   WS   vb 179
firmo23
  • 7,490
  • 2
  • 38
  • 114

2 Answers2

1

One way is to use sqldf package:

sqldf("Select year, city, spec, sum(num) from df 
      group by year, city, spec order by city")

  year city spec sum(num)
1 2016   NY   df      112
2 2017   NY   df      179
3 2016   WS   vb      112
4 2017   WS   vb      179

Using dplyr

df %>% 
  group_by(year, city, spec) %>% 
  summarise(SumNum = sum(num)) %>% 
  arrange(city)
Jason Mathews
  • 765
  • 3
  • 13
1

Possible duplicate, but here is an answer:

library(tidyverse)

df %>%
  group_by(year,city,spec) %>%
  summarise(sum = sum(num))

...results in ...

# A tibble: 4 x 4
# Groups:   year, city [4]
   year city  spec    sum
  <dbl> <fct> <fct> <dbl>
1  2016 NY    df      112
2  2016 WS    vb      112
3  2017 NY    df      179
4  2017 WS    vb      179
Marian Minar
  • 1,344
  • 10
  • 25