How to do this question in R without using loop?

Question

Let's say I have a dataframe like this:

df=data.frame("A"=factor(c(1,2,1,4,2)), "date"=factor(c("1999","2000","1999","2001","2001")), "value"=c(10,20,30,40,50))

I need to sum the values in the column "value" if they have the same "A" and "date". So what I need is a dataframe like this:

dfnew=data.frame("A"=factor(c(1,2,1,4,2)), "date"=factor(c("1999","2000","1999","2001","2001")), "value"=c(10,20,30,40,50), "sum"=c(40,20,40,40,50))

I can do it with a loop, but it is very slow since my dataset is big. Is there any way to do it faster?

Do a group by sum `df %>% group_by(A, date) %>% mutate(sum = sum(value))` or iin `base R` `df$sum <-with(df, ave(value, A, date, FUN = sum))` or wiith `data.table` for faster `library(data.table); setDT(df)[, sum := sum(value), .(A, date)]` — akrun, Nov 27 '19 at 20:53

score 2 · Answer 1 · answered Nov 27 '19 at 20:55

Using dplyr, you can simply do:

library(dplyr)
df %>% group_by(date,A) %>% mutate(Sum = sum(value))

# A tibble: 5 x 4
# Groups:   date, A [4]
  A     date  value   Sum
  <fct> <fct> <dbl> <dbl>
1 1     1999     10    40
2 2     2000     20    20
3 1     1999     30    40
4 4     2001     40    40
5 2     2001     50    50

score 2 · Answer 2 · answered Nov 27 '19 at 20:57

2

We can use data.table

library(data.table)
setDT(df)[, sum := sum(value), .(date, A)]

Or with base R

df$sum <- with(df, ave(value, A, date, FUN = sum))

answered Nov 27 '19 at 20:57

akrun

874,273
37
540
662

score 2 · Accepted Answer · answered Nov 27 '19 at 20:57

2

A base R solution:

aggregate(value ~ A + date, data=df, sum)
  A date value
1 1 1999    40
2 2 2000    20
3 2 2001    50
4 4 2001    40

answered Nov 27 '19 at 20:57

G5W

36,531
10
47
80

How to do this question in R without using loop?

3 Answers3