
Here is an example data frame:

test <- data.frame(employee = rep(c("John", "Brent", "Mike", "Sophia", "Michelle"), 20),
               location = rep(c("home", "away"), 50),
               weekday = sample(1:5, 100, replace = TRUE),
               time1 = sample(20:40, 100, replace = TRUE),
               time2 = sample(10:30, 100, replace = TRUE))

For each unique combination of "employee", "location", and "weekday", I'd like to run a function on the two time variables like the example below (my real function is much more complex):

# Placeholder for the real, more complex function
your_function <- function(time1, time2) {
  sum(time1) - sum(time2) * mean(time1)
}

Additionally, I'd like the results returned in a single data frame that looks something like this (in my real problem, not every "employee" will have data on every "weekday"):

   weekday location employee function_output
1        1     away    Brent                
2        2     away    Brent                
3        3     away    Brent                
4        4     away    Brent                
5        5     away    Brent                
6        1     home    Brent                
7        2     home    Brent                
8        3     home    Brent                
9        4     home    Brent                
10       5     home    Brent                
11       1     away     John                
12       2     away     John                
13       3     away     John                
14       4     away     John                
15       5     away     John                
16       1     home     John                
17       2     home     John                
18       4     home     John                
19       5     home     John                
20       1     away Michelle                
21       2     away Michelle                
22       3     away Michelle                
23       5     away Michelle                
24       1     home Michelle                
25       2     home Michelle                
26       3     home Michelle                
27       4     home Michelle                
28       5     home Michelle                
29       1     away     Mike                
30       2     away     Mike                
31       3     away     Mike                
32       4     away     Mike                
33       5     away     Mike                
34       1     home     Mike                
35       3     home     Mike                
36       4     home     Mike                
37       5     home     Mike                
38       1     away   Sophia                
39       4     away   Sophia                
40       5     away   Sophia                
41       1     home   Sophia                
42       2     home   Sophia                
43       3     home   Sophia                
44       4     home   Sophia                
45       5     home   Sophia 

I'd be OK using apply if I only had a single grouping variable, but I can't figure out how to do this when the data need to be subset on more than one variable.

Thanks in advance.

bshelt141
  • Replace `sum` with `your_function` in any answer in the duplicate, e.g., with `dplyr`: `group_by(test, employee, location, weekday) %>% mutate(output = your_function(time1, time2))`. – Gregor Thomas Dec 04 '15 at 23:38
  • Use `summarize` instead of `mutate` if you want the result set collapsed to each unique employee, location, weekday combination. – Gregor Thomas Dec 04 '15 at 23:40
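The two comments above can be put together into a short, runnable sketch. It assumes the `dplyr` package is installed, fixes a seed so the sampled data are reproducible (the seed is not part of the question), and reads the undefined `mean(time)` term in the example function as `mean(time1)`, which is a guess. `your_function` is only a stand-in for the real, more complex function.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the example is reproducible

test <- data.frame(employee = rep(c("John", "Brent", "Mike", "Sophia", "Michelle"), 20),
                   location = rep(c("home", "away"), 50),
                   weekday = sample(1:5, 100, replace = TRUE),
                   time1 = sample(20:40, 100, replace = TRUE),
                   time2 = sample(10:30, 100, replace = TRUE))

# Stand-in for the real function; mean(time1) is an assumed reading of mean(time)
your_function <- function(time1, time2) {
  sum(time1) - sum(time2) * mean(time1)
}

# One output row per employee/location/weekday combination that occurs in the data
result <- test %>%
  group_by(employee, location, weekday) %>%
  summarize(function_output = your_function(time1, time2), .groups = "drop")

print(result)
```

Because `summarize` only produces rows for groups that actually occur, combinations with no data (an employee with nothing recorded on a given weekday, say) are simply absent from `result`, which matches the desired output shown in the question. Swapping `summarize` for `mutate` would instead keep all 100 rows and repeat the group value on each.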
