I have a data.table that lists the user id, the week number, whether the user did something (Processed, either 0 or 1), and a helper column, HowMany, that is always 1 and that I use only to count rows:
library(data.table)
data <- data.table(WeekNumber=c(33,33,33,34,34,33,33,34,34),
                   User=c(1,1,1,1,1,2,2,2,2),
                   Processed=c(1,1,0,0,1,0,1,0,1),
                   HowMany=c(1,1,1,1,1,1,1,1,1))
I want to find, for each week, the number of things done and not done, so I do something like this:
> dcast(setDT(data), WeekNumber~Processed, value.var="HowMany", sum)
   WeekNumber 0 1
1:         33 2 3
2:         34 2 2
Now I'd like to find the average number of things done and not done per user for each week. For that I have to aggregate by user first, but this is the step where I fail:
> dcast(setDT(data), WeekNumber~Processed+User, value.var="HowMany", mean)
   WeekNumber 0_1 0_2 1_1 1_2
1:         33   1   1   1   1
2:         34   1   1   1   1
whereas my desired result would be:
WeekNumber 0   1
        33 1 1.5
        34 1   1
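To make the intended computation concrete, here is a minimal sketch of the two-step aggregation described above: first sum HowMany per week, user and Processed value, then average those per-user totals within each week. The names per_user, Total and result are only illustrative, and the sketch assumes that a user with no rows for a given week and Processed value should be left out of the average rather than counted as 0.

```r
library(data.table)

data <- data.table(WeekNumber = c(33,33,33,34,34,33,33,34,34),
                   User       = c(1,1,1,1,1,2,2,2,2),
                   Processed  = c(1,1,0,0,1,0,1,0,1),
                   HowMany    = c(1,1,1,1,1,1,1,1,1))

# Step 1: total count per week, user and Processed value
# (per_user and Total are illustrative names)
per_user <- data[, .(Total = sum(HowMany)), by = .(WeekNumber, User, Processed)]

# Step 2: average the per-user totals within each week,
# spreading the Processed values into columns "0" and "1"
result <- dcast(per_user, WeekNumber ~ Processed,
                value.var = "Total", fun.aggregate = mean)
print(result)
```

On the sample data this yields 1 and 1.5 for week 33 and 1 and 1 for week 34, which matches the desired table. If missing user and week combinations should count as 0 instead, the per-user table would need to be completed with zeros before averaging.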