I'm playing around with functions in R and want to create a function that takes a character variable and converts it to a POSIXct.

The time variable currently looks like this:

"2020-01-01T05:00:00.283236Z"

I've successfully converted the time variable in my janviews dataset with the following code:

janviews$time <- gsub('T',' ',janviews$time)
janviews$time <- as.POSIXct(janviews$time, format = "%Y-%m-%d %H:%M:%S", tz = Sys.timezone())

Since I have to do this for multiple datasets, I want to wrap the conversion in a function. I wrote the following function, but it doesn't seem to work and I'm not sure why:

set.time <- function(dat, variable.name){
  dat$variable.name <- gsub('T', ' ', dat$variable.name)
  dat$variable.name <- as.POSIXct(dat$variable.name, format = "%Y-%m-%d %H:%M:%S", tz = Sys.timezone())
}

Here are the first four rows of the janviews dataset:

structure(list(customer_id = c("S4PpjV8AgTBx", "p5bpA9itlILN", 
"nujcp24ULuxD", "cFV46KwexXoE"), product_id = c("kq4dNGB9NzwbwmiE", 
"FQjLaJ4B76h0l1dM", "pCl1B4XF0iRBUuGt", "e5DN2VOdpiH1Cqg3"), 
    time = c("2020-01-01T05:00:00.283236Z", "2020-01-01T05:00:00.895876Z", 
    "2020-01-01T05:00:01.362329Z", "2020-01-01T05:00:01.873054Z"
    )), row.names = c(NA, -4L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x1488180e0>)

Also, if there is a better way to convert my time variable, I am open to changing my method!

amatof
  • `as.POSIXct("2020-01-01T05:00:00.283236Z", format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")` works, there should be no need for `gsub`. – r2evans Sep 22 '22 at 17:19
  • So perhaps just `janviews[, time := as.POSIXct(time, format = "%Y-%m-%dT%H:%M:%OSZ", tz="UTC")]` (adjusting `tz=` as desired). – r2evans Sep 22 '22 at 17:20
  • If you need to adjust the timezone, then you should _keep_ `tz="UTC"` for parsing it (because of the trailing `Z`), and then subsequently change the timezone with `[, time := \`attr<-\`(time, "tzone", Sys.timezone())]`. (This is all assuming `data.table`, based on the `.internal.selfref` in your sample data.) – r2evans Sep 22 '22 at 17:22
  • @r2evans, that works, thank you! out of curiosity, do you happen to know why the function wasn't working? – amatof Sep 22 '22 at 17:33
  • I have a strong idea, though I don't know how you're calling `set.time` to produce a failure: if you call it as `set.time(janviews, "time")`, then `dat$variable.name` is looking for a column named `"variable.name"`; you would need `gsub("T", " ", dat[[variable.name]])` instead. (`dat$variable.name` should be returning `NULL` internally. It always helps to `debug(set.time)` and test it in real-time.) – r2evans Sep 22 '22 at 17:36
  • See https://stackoverflow.com/q/1169456/3358272 for more about the difference between `[`, `[[`, and `$`. (It would seem that this question is really an [XY Problem](https://meta.stackexchange.com/a/66378/300391), where you thought the string wasn't being parsed correctly when in fact it was that no string was being supplied at all, just `NULL`.) – r2evans Sep 22 '22 at 17:38
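Putting the comments above together, a corrected version of `set.time` might look like the sketch below. It assumes `variable.name` is passed as a string, uses the format string suggested in the comments, and runs on a small hypothetical data.frame that only mirrors the `time` column from the question. It also returns `dat` explicitly, since for a plain data.frame the change made inside the function is not visible to the caller otherwise.

```r
# Sketch of a corrected set.time(); assumes `dat` is a data.frame and
# `variable.name` is the column name given as a character string.
set.time <- function(dat, variable.name) {
  # [[ ]] looks up the column whose name is stored in variable.name;
  # dat$variable.name would look for a column literally named "variable.name".
  dat[[variable.name]] <- as.POSIXct(
    dat[[variable.name]],
    format = "%Y-%m-%dT%H:%M:%OSZ",  # parses the "T" and trailing "Z" directly
    tz = "UTC"                       # the trailing "Z" means UTC
  )
  # Return the modified copy so the caller can assign it back.
  dat
}

# Hypothetical sample data mirroring the time column in the question.
janviews <- data.frame(
  time = c("2020-01-01T05:00:00.283236Z", "2020-01-01T05:00:00.895876Z"),
  stringsAsFactors = FALSE
)

janviews <- set.time(janviews, "time")
class(janviews$time)
```

The result has to be assigned back (`janviews <- set.time(janviews, "time")`); calling the function without the assignment leaves the original data unchanged.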

1 Answer

I would use the lubridate package and the as_datetime() function.

lubridate::as_datetime("2020-01-01T05:00:00.283236Z")

Returns "2020-01-01 05:00:00 UTC"

Lubridate Info
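As a rough sketch of how this could be applied to a whole column rather than a single string: the example below assumes lubridate is installed, uses a small hypothetical data.frame that only mirrors the `time` column from the question, and picks `"America/New_York"` purely as an example display time zone.

```r
library(lubridate)  # assumes the lubridate package is installed

# Hypothetical sample data mirroring the time column in the question.
janviews <- data.frame(
  time = c("2020-01-01T05:00:00.283236Z", "2020-01-01T05:00:00.895876Z"),
  stringsAsFactors = FALSE
)

# as_datetime() is vectorised, so it converts the whole column at once.
janviews$time <- as_datetime(janviews$time)

# with_tz() changes the time zone used for display, not the instant itself.
# "America/New_York" is only an example zone.
local_time <- with_tz(janviews$time, tzone = "America/New_York")
```

Parsing as UTC first and converting afterwards keeps the meaning of the trailing `Z` intact, which matches the advice given in the comments on the question.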

Dphill