I would like to sum durations in R to get a total per person.

My data frame (simplified) looks like this:

Name  Job #  Job duration
John  1      10
John  2      5
Mary  1      30
Bob   1      7
Bob   2      7
Bob   3      7

In this case, I would like to create a new variable such as data$job_duration_total. For this variable, John would have job_duration_total = 15, Mary = 30, and Bob = 21.

I tried using the sum function, without success.

Any help in understanding this is greatly appreciated!

  • If you want to keep the same number of rows, use ```data$job_duration_total <- ave(data$`Job duration`, data$Name, FUN = sum)```. If you want a summary with one row per name, then ```aggregate(`Job duration` ~ Name, data = data, FUN = sum)``` should do it. You may find `dplyr` easier, though: ```library(dplyr); data %>% group_by(Name) %>% mutate(job_duration_total = sum(`Job duration`))``` preserves the rows, and ```library(dplyr); data %>% group_by(Name) %>% summarise(job_duration_total = sum(`Job duration`))``` gives a summary data frame. – Allan Cameron Aug 10 '23 at 18:07
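
The base R approach from the comment above can be sketched on the sample data as follows. This is a minimal sketch, not a definitive implementation: the column name `Job_duration` (with an underscore instead of a space) is an assumption made to avoid backticks, and `totals` is a hypothetical variable name.

```r
# Sample data from the question; Job_duration is an assumed column name
data <- data.frame(
  Name = c("John", "John", "Mary", "Bob", "Bob", "Bob"),
  Job = c(1, 2, 1, 1, 2, 3),
  Job_duration = c(10, 5, 30, 7, 7, 7)
)

# ave() applies sum() within each Name group and returns one value per row,
# so the result can be stored as a new column with the row count unchanged
data$job_duration_total <- ave(data$Job_duration, data$Name, FUN = sum)

# aggregate() collapses the data to one row per Name instead
totals <- aggregate(Job_duration ~ Name, data = data, FUN = sum)

print(data)
print(totals)
```

`ave()` matches the request for a new column on the original data frame, while `aggregate()` is the better fit when only one total per name is needed.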

1 Answer

Here is a method that also preserves the original order of the names:

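library(tidyverse)

data.frame(Name = c("John", "John", "Mary", "Bob", "Bob", "Bob"),
           Job = c(1, 2, 1, 1, 2, 3),
           Duration = c(10, 5, 30, 7, 7, 7)) %>%
  # solution starts here
  # as_factor() keeps the names in order of first appearance;
  # skip this line if the order is not important
  mutate(Name = as_factor(Name)) %>%
  group_by(Name) %>%
  # name the result column explicitly instead of the default `sum(Duration)`
  summarize(job_duration_total = sum(Duration))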
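
If the total should instead be added as a new column on every row, as the question describes, `mutate()` can be used in place of `summarize()`. The following is a sketch under the same assumed column names (`Name`, `Job`, `Duration`); `result` is a hypothetical variable name.

```r
library(dplyr)

# Same sample data as above
data <- data.frame(
  Name = c("John", "John", "Mary", "Bob", "Bob", "Bob"),
  Job = c(1, 2, 1, 1, 2, 3),
  Duration = c(10, 5, 30, 7, 7, 7)
)

result <- data %>%
  group_by(Name) %>%
  # mutate() keeps every row and repeats the group total on each one
  mutate(job_duration_total = sum(Duration)) %>%
  # drop the grouping so later operations act on the whole data frame
  ungroup()

print(result)
```

Because `mutate()` does not reorder rows, the original row order is preserved without converting `Name` to a factor.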

Next time, if the question gets complicated, it will be helpful to also provide a sample dataset.

William Wong