I am trying to convert a data.frame of times in hours:minutes format to decimal hours.

I found this post useful and like its simple approach of using the POSIXlt class:

R: Convert hours:minutes:seconds

However, each column represents a month's worth of days, so the columns are uneven. When I try the code below, following several other SO posts, I get zeros in the one column with fewer values.

The code is below. When run, it returns all zeros for feb, which has fewer values than the other columns.

rDF <- data.frame(jan=c("9:59","10:02","10:04"),
              feb=c("9:59","10:02",""),
              mar=c("9:59","10:02","10:04"),stringsAsFactors = FALSE)

for (i in 1:3) {
  Res <- as.POSIXlt(paste(Sys.Date(), rDF[,i]))
  rDF[,i] <- Res$hour + Res$min/60
}

Thank you for any suggestions to fix this issue. I'm open to a more efficient approach as well. Best, Leah

Leah Wasser
  • Please notice that it's considered good practice to use `NA` instead of `""` for your feb missing value (for _any_ missing value!) – PavoDive Oct 10 '15 at 03:04
  • Hi there @PavoDive, thank you. The data were downloaded with blanks. So you are suggesting that I add NA to create the final output? I will convert to CSV. Will the NA result in the columns being converted to string when imported from CSV? Thank you. – Leah Wasser Oct 12 '15 at 14:41
  • I think it's a good idea to properly identify your not available data with `NA` as part of your getting and/or cleaning of the data. This way, you'll avoid a lot of otherwise unidentifiable errors. – PavoDive Oct 12 '15 at 15:43
  • Thank you! I will do that. Much appreciated. – Leah Wasser Oct 13 '15 at 02:15
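As a rough illustration of the NA suggestion in the comments above, blanks can be marked as missing at import time. This is a minimal sketch that assumes the data arrive as CSV; the inline `csv_text` is made up to mirror the example data and stands in for a real file.

```r
# Hypothetical CSV content mirroring the question's data; feb has a blank cell.
csv_text <- "jan,feb,mar
9:59,9:59,9:59
10:02,10:02,10:02
10:04,,10:04"

# na.strings = c("", "NA") asks read.csv to treat empty fields as NA.
rDF <- read.csv(text = csv_text,
                na.strings = c("", "NA"),
                stringsAsFactors = FALSE)

# The time columns are still character; only the blank has become NA.
is.na(rDF$feb)
```

The columns are read as strings either way, since values like "9:59" are not numeric; the NA only replaces the blank entry.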

2 Answers

You could try using the package lubridate. Here we convert your data column by column to an hours-minutes Period (using hm), then extract the hours and add the minutes divided by 60. The blank entry fails to parse and comes back as NA:

library(lubridate)

# Parse each column once, then combine hours and minutes into decimal hours
rDF[] <- lapply(rDF, function(x) {
  p <- hm(x)
  p$hour + p$minute / 60
})

        jan       feb       mar
1  9.983333  9.983333  9.983333
2 10.033333 10.033333 10.033333
3 10.066667        NA 10.066667
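For comparison, here is a base R sketch of the same conversion. A likely cause of the zeros in the question is that, with no format given, as.POSIXlt settles on one format for the whole column, and the blank entry only fits a date-only format, which leaves hour and minute at 0. Passing the format explicitly should avoid that guess. The helper name `to_decimal_hours` is made up for this example.

```r
rDF <- data.frame(jan = c("9:59", "10:02", "10:04"),
                  feb = c("9:59", "10:02", ""),
                  mar = c("9:59", "10:02", "10:04"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: parse "H:M" strings with an explicit format.
# Strings that do not match the format (such as "") come back as NA.
to_decimal_hours <- function(x) {
  t <- strptime(x, format = "%H:%M", tz = "UTC")
  t$hour + t$min / 60
}

# rDF[] keeps the data.frame structure while replacing every column.
rDF[] <- lapply(rDF, to_decimal_hours)
rDF
```

Using `rDF[] <- lapply(...)` also replaces the explicit loop over column indices from the question.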
jeremycg

This can be done with hm() from the lubridate package, which parses each column into a Period (hours and minutes) rather than a decimal number:

library(lubridate)
# Parse every column with hm(), then rebuild the data.frame from the results
temp <- lapply(rDF, hm)
NewDF <- data.frame(jan = temp[[1]], feb = temp[[2]], mar = temp[[3]])
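If decimal hours are the goal and lubridate is not available, one possible alternative is to split each string on the colon and do the arithmetic directly. This is only a sketch: `hm_to_hours` is a made-up helper name, and it assumes every valid entry has exactly the form "H:M".

```r
rDF <- data.frame(jan = c("9:59", "10:02", "10:04"),
                  feb = c("9:59", "10:02", ""),
                  mar = c("9:59", "10:02", "10:04"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: "H:M" -> decimal hours, NA for blank or missing entries.
hm_to_hours <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) != 2L) return(NA_real_)      # "" or NA does not split in two
    as.numeric(p[1]) + as.numeric(p[2]) / 60
  }, numeric(1))
}

# lapply() keeps the column names, so they need not be typed out.
NewDF <- as.data.frame(lapply(rDF, hm_to_hours))
NewDF
```

Because lapply() carries the column names through, this also works unchanged for a full twelve-month table.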
PavoDive