10

This should be quick - we are parsing the following format in R:

2013-04-05T07:49:54-07:00

My current approach is

require(stringr) 
timenoT <- str_replace_all("2013-04-05T07:49:54-07:00", "T", " ") 
timep <- strptime(timenoT, "%Y-%m-%d %H:%M:%S%z", tz="UTC")

but it gives NA.

nneonneo
Rico

3 Answers

17

%z is the signed offset from UTC in hours and minutes, written as hhmm (for example -0700), not hh:mm. Here's one way to remove the last colon:

newstring <- gsub("(.*).(..)$","\\1\\2","2013-04-05T07:49:54-07:00")
(timep <- strptime(newstring, "%Y-%m-%dT%H:%M:%S%z", tz="UTC"))
# [1] "2013-04-05 14:49:54 UTC"

Also note that you don't have to remove the "T".
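As a minimal sketch of the same idea with a narrower pattern (in the spirit of the comment below), the colon can be dropped only when it sits inside a trailing offset. The helper name `parse_iso_offset` is made up for illustration, and the sketch assumes every input ends in a `+hh:mm` or `-hh:mm` offset.

```r
# Hypothetical helper (not from the original answer): parse timestamps
# whose UTC offset is written as hh:mm.
parse_iso_offset <- function(x) {
  # Remove the colon only inside a trailing "+hh:mm" / "-hh:mm" offset.
  fixed <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)
  # %z reads the resulting "+hhmm" offset; the result is expressed in UTC.
  as.POSIXct(strptime(fixed, "%Y-%m-%dT%H:%M:%S%z", tz = "UTC"))
}

parsed <- parse_iso_offset(c("2013-04-05T07:49:54-07:00",
                             "2013-04-05T07:49:54+02:00"))
format(parsed, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# [1] "2013-04-05 14:49:54" "2013-04-05 05:49:54"
```

Anchoring the pattern with `$` keeps it from touching the colons in the time of day.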

Joshua Ulrich
  • 1
    That seems like an overly general pattern. I would think something like this would be less prone to over-correct: `gsub("([+-]\\d\\d)(:)", "\\1", argvec)` – IRTFM Apr 05 '13 at 16:45
  • I need to do exactly the opposite! How can I add `:` in the last part? For me, it always shows `+0200`, but I need `+02:00`. – PM0087 Apr 23 '20 at 10:02
  • 1
    @PeyM87: something like this should be close: `gsub("(.*)([+-][[:digit:]]{2})([[:digit:]]{2})$", "\\1\\2:\\3", x)` – Joshua Ulrich Apr 23 '20 at 10:39
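For the reverse direction asked about in the comments, a short sketch along the same lines: format with `%z`, then insert the colon into the trailing offset. The timestamp and variable names here are arbitrary examples, not taken from the thread.

```r
# Arbitrary example time in UTC (an assumption for illustration).
stamp_time <- as.POSIXct("2020-04-23 10:39:00", tz = "UTC")

# %z on output yields the offset as +hhmm.
stamp <- format(stamp_time, "%Y-%m-%dT%H:%M:%S%z")

# Insert a colon between the hours and minutes of the trailing offset.
with_colon <- sub("([+-][0-9]{2})([0-9]{2})$", "\\1:\\2", stamp)
with_colon
# [1] "2020-04-23T10:39:00+00:00"
```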
1

You don't need the string replacement.

NA just means that the format as a whole did not match, so build your expression up piece by piece:

R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%d") 
[1] "2013-04-05"
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%dT%H:%M") 
[1] "2013-04-05 07:49:00"
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%dT%H:%M:%S")
[1] "2013-04-05 07:49:54" 
R>

Also, for reasons I never fully understood, but which probably reside with the C library function underlying it, %z only works on output, not input. So your NA most likely comes from your use of %z.
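The piece-by-piece check can also be scripted. The following is only a sketch: it tries progressively longer formats against the string from the question and reports which ones fail to parse. Whether the last format fails may depend on the R version and platform, so no output is shown.

```r
x <- "2013-04-05T07:49:54-07:00"

# Progressively longer candidate formats; the last one adds %z.
formats <- c("%Y-%m-%d",
             "%Y-%m-%dT%H:%M",
             "%Y-%m-%dT%H:%M:%S",
             "%Y-%m-%dT%H:%M:%S%z")

# TRUE where strptime() returns a non-NA result for that format.
parsed_ok <- vapply(formats,
                    function(f) !is.na(strptime(x, f, tz = "UTC")),
                    logical(1))

# Formats that did not parse; the first one listed is where to look.
names(parsed_ok)[!parsed_ok]
```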

Dirk Eddelbuettel
0

strptime("2013-04-05 07:49:54-07:00", "%Y-%m-%d %H:%M:%S", tz="UTC") gives 2013-04-05 07:49:54 UTC

Try

timep <- strptime(timenoT, "%Y-%m-%d %H:%M:%S", tz="UTC")
rongenre
  • 2
    That's incorrect, because the time isn't 7:49:54 UTC, it's 7 hours behind UTC. So if you express it _in_ UTC, it will be 7 hours ahead of 7:49:54. – Joshua Ulrich Apr 05 '13 at 16:23
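To make the 7-hour gap concrete, here is a small sketch comparing the two readings of the same string. The variable names `naive` and `aware` are illustrative only.

```r
x <- "2013-04-05 07:49:54-07:00"

# Without %z the trailing offset is ignored and the clock time is labelled UTC.
naive <- as.POSIXct(strptime(x, "%Y-%m-%d %H:%M:%S", tz = "UTC"))

# With the offset's colon removed, %z applies the -0700 offset.
no_colon <- sub(":([0-9]{2})$", "\\1", x)
aware <- as.POSIXct(strptime(no_colon, "%Y-%m-%d %H:%M:%S%z", tz = "UTC"))

difftime(aware, naive, units = "hours")
# Time difference of 7 hours
```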