I have a data frame with a runtime column whose values look like "70 min", "1 hr 30 Min", or N/A. I want to convert these values to numeric minutes: "70 min" should become 70 and "1 hr 30 Min" should become 90. I also want to keep N/A as it is.

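a <- c("70 min", "1 hr 30 Min")
typeof(a)  # "character"

# The attempt that fails: the strings are not plain numbers, so every element becomes NA
a <- as.numeric(a)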

When I tried as.numeric, it converted everything to NA, and some experiments with the lubridate package were also disappointing. Any expert advice, please?

1 Answer

The duplicate link did not look particularly appetizing to me, so I will offer the following regex-based solution. Assuming your nonstandard timestamps are in a fixed and known format, we can use a regex to extract the various portions. Under the assumption that you only have hour and minute information, you can try:

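a <- c("70 min", "1 hr 30 Min", "Blah")

# Extract the hour count. Strings without an hour part do not match, so they
# coerce to NA (a coercion warning is expected here) and are then set to 0.
hrs <- as.numeric(gsub(".*?(\\d+) [Hh]rs?.*", "\\1", a, perl = TRUE))
hrs[is.na(hrs)] <- 0

# Same idea for the minutes (named mins to avoid masking base::min).
mins <- as.numeric(gsub(".*?(\\d+) [Mm]in.*", "\\1", a, perl = TRUE))
mins[is.na(mins)] <- 0

total <- hrs*60 + mins

Output:

> mins
[1] 70 30  0
> hrs
[1] 0 1 0
> total
[1] 70 90  0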
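
The approach above turns unparseable entries into 0, whereas the question asks to keep N/A as it is. A minimal sketch of a variant that returns NA for such entries is below. The function name `runtime_to_minutes` is hypothetical, and the patterns assume the only units are hours (spelled "h", "hr", "hrs") and minutes ("min", "Min"), each written after its number.

```r
# Hypothetical helper: convert "70 min" / "1 hr 30 Min" / "1 h 30 min" to minutes,
# returning NA for anything with neither an hour nor a minute part.
runtime_to_minutes <- function(x) {
  x <- as.character(x)

  # Return the number preceding a unit, or NA where the unit is absent.
  grab <- function(pattern) {
    hit <- grepl(pattern, x, ignore.case = TRUE, perl = TRUE)  # FALSE for NA input
    out <- rep(NA_real_, length(x))
    out[hit] <- as.numeric(sub(pattern, "\\1", x[hit], ignore.case = TRUE, perl = TRUE))
    out
  }

  hrs  <- grab("^.*?(\\d+)\\s*h.*$")  # assumed: unit starts with "h"
  mins <- grab("^.*?(\\d+)\\s*m.*$")  # assumed: unit starts with "m"

  # Treat a missing part as 0 only when the other part is present.
  total <- ifelse(is.na(hrs), 0, hrs) * 60 + ifelse(is.na(mins), 0, mins)
  total[is.na(hrs) & is.na(mins)] <- NA_real_
  total
}

runtime_to_minutes(c("70 min", "1 hr 30 Min", "1 h 30 min", "N/A", NA))
```

Because `grepl` is checked before `as.numeric` is called, this version should not emit the coercion warning. Applied to a data frame it would be something like `df$runtime_min <- runtime_to_minutes(df$runtime)`, where both column names are placeholders.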
Tim Biegeleisen
  • Hi Tim, yes, my data has three formats: the first is like 1 h 30 min, the second is like 40 min, and the third is N/A. – Ravindra Kumar Mar 03 '17 at 05:06
  • But your solution returns the warning "NAs introduced by coercion" and converts everything to NA_real_. – Ravindra Kumar Mar 03 '17 at 05:07
  • Use `is.na()` to replace NA values with zero. Ideally, all of your nonstandard timestamps would follow a single format. – Tim Biegeleisen Mar 03 '17 at 05:24
  • Time <- c("70 min", "1 h 30 min")
    Change_in_mimnutes <- function(hms_char) {
      # split string
      v <- strsplit(hms_char, " ")[[1]]
      # get numbers
      idx <- seq(1, by = 2, length = length(v)/2)
      nums <- as.list(v[idx])
      # get units and use them as names
      names(nums) <- v[-idx]
      # apply functions, sum and convert to days
      duration <- do.call(period, nums)
      minutes <- period_to_seconds(duration)/60
      return(minutes)
    }
    sapply(Time, Change_in_mimnutes)
    – Ravindra Kumar Mar 03 '17 at 06:08
  • @RavindraKumar I have updated my answer to handle the NA problem. – Tim Biegeleisen Mar 03 '17 at 06:34
  • Thanks a lot Tim :) – Ravindra Kumar Mar 03 '17 at 13:59
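
The split-on-spaces idea from the comment thread can also be sketched in base R, without lubridate. This is only an illustration under stated assumptions: the function name `tokens_to_minutes` is hypothetical, each value is assumed to alternate number and unit separated by whitespace, and only the unit spellings listed in the code are recognised.

```r
# Hypothetical helper: split one string into number/unit pairs and sum them in minutes.
tokens_to_minutes <- function(s) {
  if (is.na(s)) return(NA_real_)
  v <- strsplit(trimws(s), "\\s+")[[1]]

  # Expect alternating number, unit, number, unit, ...; otherwise give up with NA.
  if (length(v) < 2 || length(v) %% 2 != 0) return(NA_real_)

  nums  <- suppressWarnings(as.numeric(v[c(TRUE, FALSE)]))  # odd positions
  units <- tolower(v[c(FALSE, TRUE)])                       # even positions

  # Assumed unit spellings; any other unit makes the result NA.
  per_unit <- ifelse(units %in% c("h", "hr", "hrs"), 60,
              ifelse(units %in% c("min", "mins"), 1, NA_real_))
  sum(nums * per_unit)
}

Time <- c("70 min", "1 h 30 min", "N/A")
vapply(Time, tokens_to_minutes, numeric(1))
```

`vapply` is used instead of `sapply` so the result is always a numeric vector, named by the input strings.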