Possible Duplicate:
How R formats POSIXct with fractional seconds

I'm familiar with this question about "How R formats POSIXct with fractional seconds". An argument follows there, regarding whether POSIXct has numeric errors or not when dealing with micro-seconds.

Before I re-implement a whole set of xts functionality that can handle micro-seconds without errors (nothing wrong with xts, it just requires POSIXct), I wanted to make sure:

Why is the output of the following line 4.577894?

as.POSIXlt(as.POSIXct(sprintf("%s",(format(as.POSIXct("2012-12-14 15:42:04.577895 EDT"), "%Y-%m-%d %H:%M:%OS6")))))$sec

Thanks a lot!

EDIT

The rationale behind this is the following: if I'm reading a time entry from a file, doing some processing, writing to file again, reading again etc., I get accumulated errors. So this is not a 'trick' question, but actually comes after hours of debugging.
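
To make the round trip concrete, here is a minimal sketch of the write/read cycle described above. It assumes a fixed `UTC` time zone and keeps everything in memory instead of using a real file; `stamps` is a name introduced only for illustration. Whether and how quickly the last digit drifts may depend on the R version and platform, so no output is shown.

```r
# Hypothetical round trip: format with %OS6, parse the string back, repeat.
x <- as.POSIXct("2012-12-14 15:42:04.577895", tz = "UTC")

stamps <- character(5)
for (i in seq_along(stamps)) {
  # "write": render the time with six fractional digits
  stamps[i] <- format(x, "%Y-%m-%d %H:%M:%OS6")
  # "read": parse the rendered string back into a POSIXct
  x <- as.POSIXct(stamps[i], tz = "UTC")
}

# Compare the last digit across passes to see whether the value drifts.
print(stamps)
```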

zuuz
  • The output of your line of code is `4`. Did you mean `as.POSIXlt(as.POSIXct(sprintf("%s",(format(as.POSIXct("2012-12-14 15:42:04.577895 EDT"), "%Y-%m-%d %H:%M:%OS6")))))`? – plannapus Feb 04 '13 at 12:03
  • @plannapus: to underline the problem, I explicitly look at the sec field of the POSIXlt object, which gives the fractional seconds. When I run this line as-is I still get 4.577894.... some answers about printing fractional seconds say the problem is with printing, and not with the actual value. Here the problem is with the actual value. – zuuz Feb 04 '13 at 12:11
  • `as.POSIXct("2012-12-14 15:42:04.577895 EDT")` returns `"2012-12-14 15:42:04 CET"`, which you then pass directly to `sprintf`: at this point you have already lost your microseconds info. Anything you do after that will only give you `4` as the seconds, and neither `4.577894` nor `4.577895`. Hence my first comment, since your question only makes sense if you pass `sprintf` a character string that shows the microseconds. I understand perfectly that your desired final object has to be a `POSIXlt` object. – plannapus Feb 04 '13 at 12:18
  • My first comment wasn't meant to be an answer, I just wanted to point out that you probably made a mistake when writing your question. – plannapus Feb 04 '13 at 12:25
  • @plannapus: perhaps I have a different setup here: did you run my command line? Is 4 what you actually get? If so, I'll edit my question to use your version – zuuz Feb 04 '13 at 12:28
  • Yes I ran the command line in your question and it outputs `4`. – plannapus Feb 04 '13 at 12:29
  • If there is no typo in the command line you gave, then what is the meaning of the double set of brackets around `as.POSIXct("2012-12-14 15:42:04.577895 EDT")`? – plannapus Feb 04 '13 at 12:32
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/23903/discussion-between-zorbar-and-plannapus) – zuuz Feb 04 '13 at 13:10
  • For more weirdness: The conversion in the year 1972 is OK (presumably because the number of seconds is lower and therefore allows more precision in the mantissa). For years 2038 and upwards, an extra 2 seconds appears. – James Feb 04 '13 at 14:07
  • The answer to this question is appropriate here: http://stackoverflow.com/questions/7726034/how-r-formats-posixct-with-fractional-seconds. I see that the OP links to this question. Why ignore the answer? – Matthew Lundberg Feb 04 '13 at 19:13
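
The precision point raised in the comments can be checked with a little arithmetic. The sketch below assumes IEEE 754 doubles (53-bit significand) and positive inputs; `spacing`, `t1972`, and `t2012` are names introduced only for illustration.

```r
# Gap between adjacent doubles near x (assumes IEEE 754 doubles and x > 0).
spacing <- function(x) 2^(floor(log2(x)) - 52)

t1972 <- 93000000     # seconds since the epoch, roughly December 1972
t2012 <- 1355521324   # seconds since the epoch, December 2012

spacing(t1972)   # 2^-26, about 1.5e-08 seconds
spacing(t2012)   # 2^-22, about 2.4e-07 seconds
```

With a gap of roughly a quarter of a microsecond in 2012, most microsecond values cannot be stored exactly, so the nearest double may sit slightly below the intended value.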

2 Answers

This question has gone on far too long -- especially considering that the question itself links to the answer.

This is an output representation problem, and is well-known. Rounding time values has implications. You cannot just round the subsecond values without getting incorrect results in some cases.

The solution is here (as linked in the question): How R formats POSIXct with fractional seconds

Stealing directly from Aaron's excellent answer:

myformat.POSIXct <- function(x, digits=0) {
  x2 <- round(unclass(x), digits)
  attributes(x2) <- attributes(x)
  x <- as.POSIXlt(x2)
  x$sec <- round(x$sec, digits)
  format.POSIXlt(x, paste("%Y-%m-%d %H:%M:%OS",digits,sep=""))
}

x <- as.POSIXct("2012-12-14 15:42:04.577895 EDT")

What you are attempting:

as.POSIXlt(as.POSIXct(sprintf("%s",(format(x, "%Y-%m-%d %H:%M:%OS6")))))$sec
## [1] 4.577894

Aaron's code:

myformat.POSIXct(x, 6)
## [1] "2012-12-14 15:42:04.577895"

One might think to use sprintf with a format string of %.06f to format the subsecond value, but this will fail if the value is slightly below an integer:

sprintf('%.06f', .9999999)
## [1] "1.000000"

Converting to a POSIXlt causes the proper rollover into seconds, minutes, etc., if the subsecond value is rounded up.
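
A minimal sketch of that pitfall, using made-up values (`secs`, `frac`, `naive`, and `carried` are illustrative names, not part of Aaron's answer). Formatting the fraction on its own silently drops the carry, while rounding the combined value keeps it:

```r
secs <- 59
frac <- 0.9999999

# Naive: format the fraction separately and glue it to the whole seconds.
# sprintf rounds the fraction up to "1.000000", and the carried 1 is lost.
naive <- paste0(secs, substring(sprintf("%.6f", frac), 2))
naive
## [1] "59.000000"

# Rounding the combined value first lets the carry reach the whole seconds.
carried <- sprintf("%.6f", round(secs + frac, 6))
carried
## [1] "60.000000"
```

A seconds field of 60 still has to roll over into the minutes, which is the part the `POSIXlt` conversion is there to handle.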

To show you that this is not a data problem:

y <- as.numeric(x)
y
## [1] 1355521324.5778949261
sprintf('%.06f', y - floor(y))
## [1] "0.577895"
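
The difference between the two results comes down to truncation versus rounding of the stored value. A minimal sketch, assuming IEEE 754 doubles and using the numeric value directly instead of a `POSIXct` object (`frac` is an illustrative name):

```r
y <- 1355521324.577895   # nearest double is slightly below ...577895
frac <- y - floor(y)     # stored fractional part, about 0.5778949261

# Truncating to whole microseconds loses the last digit.
floor(frac * 1e6)
## [1] 577894

# Rounding to whole microseconds recovers the intended value.
round(frac * 1e6)
## [1] 577895
```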
Matthew Lundberg
  • I linked to what seemed to be an argument, not an answer. I wasn't looking for who's right; just for a solution. So, thanks a lot for re-writing it into the **myformat.POSIXct** function, and providing a good example. Cheers! – zuuz Feb 06 '13 at 07:21
  • @Matthew: but still, you added another answer to the question I linked to at the top. So, is the solution you just suggested correct or not? – zuuz Feb 06 '13 at 07:43
  • @zorbar I did add another answer there instead of here, but that is because the other question is clearer (less localized). And indeed, the behavior that you see is important for me as I work with subsecond time series very often. The solution with `myformat.POSIXct` seems to work fine for dates prior to 2038 (and is much simpler than my answer to the other question). The real answer is to find the problem in the code. I downloaded the source for investigation. – Matthew Lundberg Feb 06 '13 at 14:07
  • @zorbar I found the bug in the source code. It does not affect dates between 1970 and 2038, so you're OK with the implementation of `myformat.POSIXct` until 2038, and I'm sure this bug will be fixed long before then (I'm looking into a patch, barring that, a bug report will be filed). – Matthew Lundberg Feb 07 '13 at 03:28

I think the answer to your problem is noted in the comments of the SO question you referenced. The problem is that the conversion to POSIXct or POSIXlt contains rounding. Simply add a small offset to your value, and the conversion rights itself.

# Original value
format(as.POSIXct("2012-12-14 15:42:04.577895 EDT"), "%Y-%m-%d %H:%M:%OS6")
[1] "2012-12-14 15:42:04.577894"

# Original value + offset
format(as.POSIXlt("2012-12-14 15:42:04.5778951 EDT"), "%Y-%m-%d %H:%M:%OS6")
[1] "2012-12-14 15:42:04.577895"

I recommend using a regular expression to add the offset, like so:

gsub(" (\\w+)$","1 \\1","2012-12-14 15:42:04.577895 EDT")
[1] "2012-12-14 15:42:04.5778951 EDT"
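
One caveat with this pattern: it appends the '1' to whatever precedes the time zone, so a timestamp without a fractional part ends up with a corrupted seconds field. Below is a minimal sketch of a guarded variant that leaves such strings alone; `add_offset` is a hypothetical helper name, and it assumes the same "timestamp, space, zone" layout as above.

```r
# Append the offset digit only when a fractional part precedes the time zone.
add_offset <- function(x) {
  has_fraction <- grepl("\\.[0-9]+ \\w+$", x)
  ifelse(has_fraction, gsub(" (\\w+)$", "1 \\1", x), x)
}

add_offset("2012-12-14 15:42:04.577895 EDT")
## [1] "2012-12-14 15:42:04.5778951 EDT"

add_offset("2012-12-14 15:42:04 EDT")
## [1] "2012-12-14 15:42:04 EDT"

# Unguarded pattern on the same input, for comparison:
gsub(" (\\w+)$", "1 \\1", "2012-12-14 15:42:04 EDT")
## [1] "2012-12-14 15:42:041 EDT"
```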
Dinre
  • Won't that cause errors for other values? – Matthew Lundberg Feb 04 '13 at 13:26
  • I did a little checking before I uploaded the answer, and there appear to be no other rounding errors. The extra '1' is not enough to cause an unwanted round up, but it is enough to prevent an unwanted round down. It's then stripped out immediately after this statement, since we have the '%OS6' flag set. Of course, I'm assuming the value is using all 6 digits before adding the '1'. – Dinre Feb 04 '13 at 13:30
  • Interestingly enough, you can still see rounding with the conversion to numeric. For instance, `as.numeric(as.POSIXct("2012-12-14 15:42:04.5778951 EDT"))%%1` produces '0.5778952' instead of '0.5778951'. Still, it doesn't really matter where the rounding occurs as long as you get accurate I/O from your function in the end. In this specific case, adding an extra '1' seems to correct the function. – Dinre Feb 04 '13 at 14:03
  • It will go horribly wrong if the fractional part is missing. It will produce wrong results if time deltas are taken (rather than just carrying a value through from beginning to end). Besides, there are better ways to correct the output (and this is an output representation problem). For example: http://stackoverflow.com/questions/7726034/how-r-formats-posixct-with-fractional-seconds – Matthew Lundberg Feb 04 '13 at 19:11
  • @MatthewLundberg: an output representation problem, meaning that it's the call to 'sprintf' that ruins me? so what is the method to solve this? how do I print the time to file, so that I read it correctly later? – zuuz Feb 05 '13 at 10:14
  • @zorbar See the answer to the question that you linked (I linked to it also). There is a function called `myformat.POSIXct` defined there. It does exactly what you need. – Matthew Lundberg Feb 05 '13 at 14:25