1

I'm working in the tidyverse and I have a four column tibble, which looks like this:

+-------------------+--------------------+--------------------+-------------+
| application <chr> |      start<chr>    |       end<chr>     |  usage<chr> |
+-------------------+--------------------+--------------------+-------------+
| reddit is fun     | 01-Mar-19 17:37:26 | 01-Mar-19 17:37:36 | 10 sec      |
| Maps              | 01-Mar-19 17:37:38 | 01-Mar-19 17:41:1  | 3 min       |
| Clock             | 01-Mar-19 17:41:10 | 01-Mar-19 17:41:21 | 11 sec      |
+-------------------+--------------------+--------------------+-------------+

My intent is to convert any value listed in minutes in the usage column, to seconds. I'm able to do this in excel, though I'd rather keep excel out of the equation!

My thought is that I may need to use the extract function, using regex for the "# min" values, separate the "#" and "min", do the conversion to seconds and then combine the two columns, both now in seconds, with paste.

Am I on the right track? Thank you!

Vaulto
  • 17
  • 8
  • 1
    Hi, and welcome to SO! Could you provide your table in a reproducible format, perhaps by using `dput`? See [how to make a reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Calum You Nov 12 '19 at 02:59
  • Additionally, it may make more sense to use the `start` and `end` columns to get the usage time instead of parsing the `usage` column, since that will round all values given in minutes. – Calum You Nov 12 '19 at 03:00
  • Here is the dput output structure(list(application = c("Clock", "reddit is fun", "reddit is fun", "Clock"), start = c("01-Mar-19 7:30:03", "01-Mar-19 7:30:11", "01-Mar-19 7:35:01", "01-Mar-19 7:35:03"), end = c("01-Mar-19 7:30:11", "01-Mar-19 7:30:23", "01-Mar-19 7:35:03", "01-Mar-19 7:35:08" ), usage = c("8 sec", "12 sec", "2 sec", "5 sec")), row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame")) – Vaulto Nov 12 '19 at 03:09
  • Thank you for your insight and for the welcoming! – Vaulto Nov 12 '19 at 03:29

2 Answers2

1

If you want to convert the usage column to seconds we can use grepl to find out values which has "min" in it and multiply them with 60.

df$seconds <- with(df, ifelse(grepl('min', usage), 
       as.integer(gsub('\\D', '', usage)) * 60, as.integer(gsub('\\D', '', usage))))

df
#    application              start                end  usage seconds
#1 reddit is fun 01-Mar-19 17:37:26 01-Mar-19 17:37:36 10 sec      10
#2          Maps 01-Mar-19 17:37:38  01-Mar-19 17:41:1  3 min     180
#3         Clock 01-Mar-19 17:41:10 01-Mar-19 17:41:21 11 sec      11

However, I agree with @Calum You to use start and end columns to get usage time in seconds

library(dplyr)
library(lubridate)

df %>%
  mutate_at(vars(start, end), dmy_hms) %>%
  mutate(seconds = as.integer(end - start))

data

df <- structure(list(application = structure(3:1, .Label = c("Clock", 
"Maps", "reddit is fun"), class = "factor"), start = structure(1:3, 
.Label = c("01-Mar-19 17:37:26", "01-Mar-19 17:37:38", "01-Mar-19 17:41:10"), 
class = "factor"), end = structure(1:3, .Label = c("01-Mar-19 17:37:36", 
"01-Mar-19 17:41:1", "01-Mar-19 17:41:21"), class = "factor"), usage =
structure(c(1L,3L, 2L), .Label = c("10 sec", "11 sec", "3 min"), 
class = "factor")), row.names = c(NA, -3L), class = "data.frame")
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • I already had lubridate and dplyr libraries loaded, convenient! Using start and end worked beautifully! Thank you! I'm new to R, this solution is simple and wonderful. – Vaulto Nov 12 '19 at 03:16
1

Hi and welcome to SO !

To provide an alternative (maybe not the best) to the answer proposed by Ronak, you can do:

sapply(usage, function(x){
  if(length(x[grep("min",x)]) != 0) 
    {
    x[grep("min",x)] = as.character(paste0(as.numeric(gsub(" min","",x))*60," sec"))
  }
  else{x = x}
})

With your example, it will give the following output:

usage = c("10 sec","3 min","11 sec")

> sapply(usage, function(x){
+   if(length(x[grep("min",x)]) != 0) 
+     { x[grep("min",x)] = as.character(paste0(as.numeric(gsub(" min","",x))*60," sec"))}
+   else{x = x}
+ })
   10 sec     3 min    11 sec 
 "10 sec" "180 sec"  "11 sec" 

Then, you can use this output to replace your usage column.

dc37
  • 15,840
  • 4
  • 15
  • 32
  • Thank you dc37! I'm unfamiliar with sapply, I'll put it into my research list. – Vaulto Nov 12 '19 at 03:17
  • You're welcome ! It's mainly because I'm not that familiar with `dplyr` that I have to find tricks to do almost the same thing. – dc37 Nov 12 '19 at 03:18