I have a data.table 'df' with 3 columns: id, meal, and time. Each id has many rows.

library(data.table)
id = c(1,1,2,2,3,3)
meal = c(1,1,0,0,1,0)
time = c(10,9,12,13,7,15)
df <- data.table(id, meal, time)

> df
   id meal time
1:  1    1   10
2:  1    1    9
3:  2    0   12
4:  2    0   13
5:  3    1    7
6:  3    0   15

Now, I want to calculate the sum of time over the rows where meal == 1, for each id. My current code is:

df[meal == 1, sum(time), by = "id"]

However, this code evaluates meal == 1 first, so when an id has no rows with meal == 1, that id is omitted from the result instead of being returned with a sum of 0.

id 2 is omitted here.

   id V1
1:  1 19
2:  3  7

What can I do?

Harold
  • Including a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) in your question will increase your chances of getting an answer. – Samuel Oct 13 '17 at 22:36
  • You can do a merge after calculating sums. https://stackoverflow.com/questions/18866796/empty-factors-in-by-data-table – Roman Luštrik Oct 13 '17 at 22:39
  • you can use `ifelse` – Bulat Oct 13 '17 at 22:40

2 Answers

You can try something like this:

df[, is.meal.one := as.integer(meal == 1)]
df[, sum(time * is.meal.one), by = "id"]
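
If you would rather not add a helper column to df, the same idea can be written in one step. This is only a sketch using the data from the question; the column name total is my own choice.

```r
library(data.table)

df <- data.table(
  id   = c(1, 1, 2, 2, 3, 3),
  meal = c(1, 1, 0, 0, 1, 0),
  time = c(10, 9, 12, 13, 7, 15)
)

# (meal == 1) is TRUE/FALSE, which R coerces to 1/0 in arithmetic,
# so rows with meal != 1 contribute 0 to the sum and every id is kept.
res <- df[, .(total = sum(time * (meal == 1))), by = id]
print(res)
```

Because the filter happens inside j rather than in i, no group is dropped before the sum is taken.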
Bulat

How about this?

library(data.table)

dt <- data.table(
  id = c(1, 1, 2, 2, 3, 3),
  meal = c(1, 5, 3, 2, 10, 1),
  time = c(2, 10, 15, 5, 5, 2)
)

   id meal time
1:  1    1    2
2:  1    5   10
3:  2    3   15
4:  2    2    5
5:  3   10    5
6:  3    1    2

dt[, list(Meal1Time = sum(time[which(meal == 1)])), by=id]
   id Meal1Time
1:  1         2
2:  2         0
3:  3         2

Note that this is not the most efficient method, but unless you're working with millions of rows of data, it should run almost instantaneously.
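
If speed does become a concern, one alternative is the merge suggested in the comments: compute the filtered sums first, then join them back onto the full list of ids and fill the gaps with 0. The sketch below uses the same dt as above; the intermediate names sums and res are my own, and I have not benchmarked it, so please time it on your own data before relying on it.

```r
library(data.table)

dt <- data.table(
  id   = c(1, 1, 2, 2, 3, 3),
  meal = c(1, 5, 3, 2, 10, 1),
  time = c(2, 10, 15, 5, 5, 2)
)

# Step 1: grouped sum over only the rows with meal == 1 (ids without such rows are missing here)
sums <- dt[meal == 1, .(Meal1Time = sum(time)), by = id]

# Step 2: join onto every distinct id; ids with no match get NA
res <- sums[unique(dt[, .(id)]), on = "id"]

# Step 3: replace the NAs from unmatched ids with 0
res[is.na(Meal1Time), Meal1Time := 0]
print(res)
```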

Ben
  • Please don't forget to mark the solution as "correct" if it's the one you want. – Ben Oct 13 '17 at 23:02