
I have columns of times that I want to convert to seconds. The transformation function works on a single value, but when I use mutate_at to apply it to multiple columns, it doesn't work as I expect. I don't know what I am missing in the mutate_at syntax.

I have this:

catalog
# A tibble: 4 x 3
#  file                              start end  
#  <chr>                             <chr> <chr>
#1 20190506_205959-20190506_210459   1:58  3:00 
#2 20190506_210507-20190506_211007   0     0:32 
#3 20190506_205959-20190506_210459_2 0     3:18 
#4 20190506_220712-20190506_221210   0     5  

transform_time_to_seconds <- function(x) {
    x %>% 
        str_split(":", simplify = TRUE) %>% 
        as.numeric() %>% 
        {.[1] * 60 + 
         ifelse(is.na(.[2]), 0, .[2])}
}

Then I apply mutate_at:

catalog %>%
    mutate_at(vars(start, end), transform_time_to_seconds)
# A tibble: 4 x 3
#  file                              start   end
#  <chr>                             <dbl> <dbl>
#1 20190506_205959-20190506_210459      60   180
#2 20190506_210507-20190506_211007      60   180
#3 20190506_205959-20190506_210459_2    60   180
#4 20190506_220712-20190506_221210      60   180

But what I expect is this:

catalog %>%
    mutate(start = map_dbl(start, transform_time_to_seconds),
           end   = map_dbl(end, transform_time_to_seconds))
# A tibble: 4 x 3
#  file                              start   end
#  <chr>                             <dbl> <dbl>
#1 20190506_205959-20190506_210459     118   180
#2 20190506_210507-20190506_211007       0    32
#3 20190506_205959-20190506_210459_2     0   198
#4 20190506_220712-20190506_221210       0   300

Any suggestions?


catalog data:

structure(list(file = c("20190506_205959-20190506_210459", "20190506_210507-20190506_211007", 
"20190506_205959-20190506_210459_2", "20190506_220712-20190506_221210"
), start = c("1:58", "0", "0", "0"), end = c("3:00", "0:32", 
"3:18", "5")), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), spec = structure(list(cols = list(
    file = structure(list(), class = c("collector_character", 
    "collector")), start = structure(list(), class = c("collector_character", 
    "collector")), end = structure(list(), class = c("collector_character", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1), class = "col_spec"))
Rafael Toledo
  • It seems that the issue is in working from a matrix, as is returned by your `str_split`. See this by adding `rowwise` before `mutate_at`: that gets you your expected output. I *think* it's coming from the fact that `mutate_*` functions expect to work along an entire vector at once, which is thrown off by the matrix. – camille May 07 '19 at 14:20
  • Duplicate of [Applying a function to every row of a table using dplyr?](https://stackoverflow.com/questions/21818181/applying-a-function-to-every-row-of-a-table-using-dplyr) – M-- May 07 '19 at 14:23
  • `catalog %>% group_by(1:n()) %>% mutate_at(vars(start, end), transform_time_to_seconds) %>% select(-4)` – M-- May 07 '19 at 14:31

2 Answers

You could also vectorize your function, so that it is applied to each element of the column in turn:

transform_time_to_seconds <- Vectorize(transform_time_to_seconds)
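
As an alternative to wrapping with Vectorize(), the function could be rewritten to handle a whole column at once. This is only a sketch, not part of the original answer: the name transform_time_to_seconds_vec is hypothetical, and str_split_fixed() is swapped in for str_split(simplify = TRUE) so that the result always has exactly two columns.

```r
library(stringr)

# Hypothetical vectorized variant: takes a character vector such as
# c("1:58", "0") and returns one number of seconds per element.
transform_time_to_seconds_vec <- function(x) {
  # n x 2 character matrix; a missing seconds part is padded with ""
  parts <- str_split_fixed(x, ":", 2)
  minutes <- as.numeric(parts[, 1])
  seconds <- as.numeric(parts[, 2])  # "" becomes NA
  # Index by matrix column, so every row keeps its own value
  minutes * 60 + ifelse(is.na(seconds), 0, seconds)
}

transform_time_to_seconds_vec(c("3:00", "0:32", "3:18", "5"))
```

Because this version returns one value per input element, it can be passed to mutate_at(vars(start, end), ...) directly, without rowwise() or map_dbl().
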
Daniel

Your function expects one value at a time whereas you are passing an entire column.
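
To see why every row ends up with the same number, it may help to run the steps of the function by hand on the whole start column. The sketch below assumes stringr is loaded and copies the start values from the question; the intermediate names parts and flat are only for illustration.

```r
library(stringr)

start <- c("1:58", "0", "0", "0")

# With a whole column, simplify = TRUE gives a 4 x 2 character matrix
parts <- str_split(start, ":", simplify = TRUE)
dim(parts)

# as.numeric() drops the dimensions and flattens column by column,
# so the first four elements are the minutes of all four rows
flat <- as.numeric(parts)
flat

# The function only looks at elements 1 and 2 of that flat vector,
# i.e. the minutes of row 1 and the minutes of row 2
single <- flat[1] * 60 + ifelse(is.na(flat[2]), 0, flat[2])
single
```

The result is a single number built from the first two rows, which mutate_at then recycles to every row of the column. That is where the repeated 60 in start comes from, and the same steps on end give the repeated 180.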

Adding rowwise() might help, since it makes mutate_at call the function once per row:

library(dplyr)

catalog %>%
  rowwise() %>%
  mutate_at(vars(start, end), transform_time_to_seconds)

# A tibble: 4 x 3
#  file                              start   end
#  <chr>                             <dbl> <dbl>
#1 20190506_205959-20190506_210459     118   180
#2 20190506_210507-20190506_211007       0    32
#3 20190506_205959-20190506_210459_2     0   198
#4 20190506_220712-20190506_221210       0   300
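
Another option, sketched here as an assumption rather than as part of the original answer, is to keep the one-value-at-a-time function and do the per-element iteration inside mutate_at with a purrr-style lambda. The file names are shortened to placeholders; the start and end values are the ones from the question.

```r
library(dplyr)
library(purrr)
library(stringr)

# Function from the question, unchanged (expects a single value)
transform_time_to_seconds <- function(x) {
  x %>%
    str_split(":", simplify = TRUE) %>%
    as.numeric() %>%
    {.[1] * 60 +
     ifelse(is.na(.[2]), 0, .[2])}
}

# Placeholder file names; start/end copied from the question's data
catalog <- tibble(
  file  = c("a", "b", "c", "d"),
  start = c("1:58", "0", "0", "0"),
  end   = c("3:00", "0:32", "3:18", "5")
)

# map_dbl() calls the function once per element of each selected column
result <- catalog %>%
  mutate_at(vars(start, end), ~ map_dbl(.x, transform_time_to_seconds))
result
```

This combines the map_dbl() call from the question's expected-output snippet with mutate_at, so the column list only has to be written once.
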
Ronak Shah