
I am seeing some strange behaviour from as.POSIXlt that I am unable to explain, and I am hoping someone else can. While investigating this question I found that the fractional part of a second is sometimes rounded incorrectly.

For example, each number below encodes a time as microseconds since the epoch: the last 6 digits are the fractional part of the second, so the fraction of a second for the first number should be .645990.

# Generate sequence of integers to represent date/times
times <- seq( 1366039619645990 , length.out = 11 )
options(scipen=20)
times
 [1] 1366039619645990 1366039619645991 1366039619645992 1366039619645993 1366039619645994 1366039619645995
 [7] 1366039619645996 1366039619645997 1366039619645998 1366039619645999 1366039619646000
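
As a quick check of the encoding described above, the whole seconds and the microsecond remainder can be separated with integer division and modulo. This is only a sketch; the variable names are illustrative and not part of the original code.

```r
# Split integer microseconds since the epoch into whole seconds and the
# microsecond remainder (variable names are illustrative).
us <- 1366039619645990
whole_secs <- us %/% 1e6   # 1366039619
frac_us    <- us %% 1e6    # 645990
sprintf("%.0f seconds + %06.0f microseconds", whole_secs, frac_us)
```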

# Convert to date/time with microseconds 
options(digits.secs = 6 )
as.POSIXlt( times/1e6, tz="EST", origin="1970-01-01") + 5e-7
 [1] "2013-04-15 10:26:59.645990 EST" "2013-04-15 10:26:59.645991 EST" "2013-04-15 10:26:59.645992 EST"
 [4] "2013-04-15 10:26:59.645993 EST" "2013-04-15 10:26:59.645994 EST" "2013-04-15 10:26:59.645995 EST"
 [7] "2013-04-15 10:26:59.645996 EST" "2013-04-15 10:26:59.645997 EST" "2013-04-15 10:26:59.645998 EST"
[10] "2013-04-15 10:26:59.645999 EST" "2013-04-15 10:26:59.646000 EST"

I found that I have to add a small increment, equal to half the smallest time step (5e-7 s), to get the fractional part of the second displayed correctly; otherwise rounding errors occur. This works fine when I run as.POSIXlt on a sequence of numbers, as above. However, if I convert a single number, namely the one that should end in .645999, the result is truncated to .645 and I do not know why!

# Now just convert the date/time that should end in .645999
as.POSIXlt( times[10]/1e6, tz="EST", origin="1970-01-01") + 5e-7
[1] "2013-04-15 10:26:59.645 EST"

Compare the 10th element in the vector returned by as.POSIXlt with the single element equivalent above. What is happening?

Session info:

R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] raster_2.0-41 sp_1.0-5     

loaded via a namespace (and not attached):
[1] grid_2.15.2     lattice_0.20-13 tools_2.15.2
Simon O'Hanlon
  • On my machine it drops all the decimals of `times` when doing `times/1e6`. In fact I get `2013-04-15 10:26:59 EST` 11 times... – Michele Apr 24 '13 at 09:23
  • @Michele thanks! - you have shown me I am missing one vital piece of code in my example - adding now!! – Simon O'Hanlon Apr 24 '13 at 09:24
  • @Michele please can you try again? Run `options( digits.secs = 6 )` first. – Simon O'Hanlon Apr 24 '13 at 09:25
  • This is a formatting issue, most thoroughly discussed [here](http://stackoverflow.com/q/7726034/271616). – Joshua Ulrich Apr 24 '13 at 10:41
  • @JoshuaUlrich thanks for the link - extremely informative. Shall I close this as a dupe then? I was looking at POSIXlt which is why I didn't see this title with POSIXct – Simon O'Hanlon Apr 24 '13 at 10:45
  • @JoshuaUlrich I think that *this* is a rounding issue. `format.POSIXlt` doesn't work as expected because of inaccuracy in the way the threshold for rounding a number is calculated. If we run the command `sprintf( "%.20f" , abs(secs[2] - round(secs,5)))` it gives `[1] "0.00000099999999991773"` which is < 0.000001 (see my answer below for how that subsequently affects printing). Does that make sense? – Simon O'Hanlon Apr 24 '13 at 12:54
  • @SimonO101: there's no "threshold" for rounding a number. What you're seeing is the effect of floating point precision: `sprintf("%22.20f",secs)`. – Joshua Ulrich Apr 24 '13 at 17:41
  • @JoshuaUlrich I know - see my answer below! But there is a *threshold* (I don't explain myself very well, I admit) - in format.POSIXlt, a number is rounded to a maximum of 6 digits, no matter how many decimal places you give it (also see below) – Simon O'Hanlon Apr 24 '13 at 17:42
  • @SimonO101: Oh, I see; you're referring to the threshold in `format.POSIXlt`. That's what I meant by it being a formatting issue. The actual underlying number is correct, it just does not print correctly. – Joshua Ulrich Apr 24 '13 at 17:53

2 Answers


This seems to be a rounding issue, whereby significant digits of the fractional second are discarded. The offending(?) code is in the format method for objects of class POSIXlt, namely format.POSIXlt, which is used by print.POSIXlt.

Taking the two values below as an example: format.POSIXlt uses the following expression, which I have wrapped in sapply to show the absolute difference between the fractional seconds and their value rounded to successively more digits.

np <- 6L   # the value of getOption("digits.secs") in this session
secs <- c( 59.645998 , 59.645999 )
sapply( seq_len(np) - 1L , function(x) abs(secs - round(secs, x)) )
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 0.354002 0.045998 0.004002 0.000002 0.000002 0.000002
[2,] 0.354001 0.045999 0.004001 0.000001 0.000001 0.000001

As you can see when the seconds are .xxx999 any rounding to 3 or more digits gives 0.000001 which affects the printing thus:

# the number of digits used for the fractional seconds is gotten here
np <- getOption("digits.secs")

# and the length of digits to be printed is controlled in this loop
for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs, 
                i)) < 0.000001)) {
                np <- i
                break
            }

This is because the difference that prints as 0.000001 above is actually slightly smaller than 0.000001:

sprintf( "%.20f" , abs(secs[2] - round(secs[2],5)))
[1] "0.00000099999999991773"

# In turn this is used to control the printing of the fractional seconds            
if (np == 0L) 
            "%Y-%m-%d %H:%M:%S"
        else paste0("%Y-%m-%d %H:%M:%OS", np) 

So the fractional seconds get truncated to only 3 decimal places because of the test used in rounding. I think if the test value in the for loop was set to 5e-7 this issue would disappear.

When the result is a vector, the same method is used, but the all() in the test above means np is only reduced when every element passes. Elements such as .645990 do not, so all six digits are printed.
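
To see how the digit-selection loop quoted above behaves for a single value compared with a vector, it can be wrapped in a small function. This is a minimal sketch: pick_digits and its threshold argument are hypothetical names introduced here for illustration, the loop body is copied from the excerpt above rather than from any particular R release, and secs_all is a stand-in for the seconds component of the times in the question (including the 5e-7 offset).

```r
# Hypothetical helper that reproduces the loop from the format.POSIXlt
# excerpt above; 'threshold' is an illustrative parameter, not part of R.
pick_digits <- function(secs, max_digits = 6L, threshold = 1e-6) {
  np <- max_digits
  for (i in seq_len(np) - 1L) {
    # np is lowered only if EVERY element is within the threshold
    if (all(abs(secs - round(secs, i)) < threshold)) {
      np <- i
      break
    }
  }
  np
}

# Stand-in for the seconds of the 11 times in the question, plus 5e-7
secs_all <- 59 + (645990:646000) / 1e6 + 5e-7

pick_digits(secs_all[10])   # single value: 3 digits
pick_digits(secs_all)       # whole vector: 6 digits

# The same loop with the smaller test value suggested above
pick_digits(59.645999, threshold = 5e-7)   # 6 digits
```

With a single value that sits within 0.000001 of a 3-digit number, the loop stops at 3 digits; with the whole vector, the other elements keep the test from passing, so 6 digits are retained.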

Simon O'Hanlon

I haven't got a proper answer (I'll keep looking into it), but I thought this was interesting:

times <- seq( 1366039619645990 , length.out = 11 )
# Convert to date/time with microseconds
options(digits.secs = 6 )

test <- as.POSIXlt( times/1e6, tz="EST", origin="1970-01-01") + 5e-7

test1 <- test   # pre-allocate with the same class and time zone; every element is overwritten below
for(i in 1:11)
  test1[i] <- as.POSIXlt(times[i]/1e6, tz="EST", origin="1970-01-01") + 5e-7

> identical(test, test1)
[1] TRUE

BTW, in single statements I got the same result as you...

> test
 [1] "2013-04-15 10:26:59.645990 EST" "2013-04-15 10:26:59.645991 EST" "2013-04-15 10:26:59.645992 EST"
 [4] "2013-04-15 10:26:59.645993 EST" "2013-04-15 10:26:59.645994 EST" "2013-04-15 10:26:59.645995 EST"
 [7] "2013-04-15 10:26:59.645996 EST" "2013-04-15 10:26:59.645997 EST" "2013-04-15 10:26:59.645998 EST"
[10] "2013-04-15 10:26:59.645999 EST" "2013-04-15 10:26:59.646000 EST"
> test[10]
[1] "2013-04-15 10:26:59.645 EST"
> as.POSIXlt( times[10]/1e6, tz="EST", origin="1970-01-01") + 5e-7
[1] "2013-04-15 10:26:59.645 EST"

Looking at the last two statements, it seems that this issue is mainly related to displaying a single value rather than a vector. But even in this case it would be a truncation, probably via floor, not a rounding.
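
To illustrate the difference between truncation and rounding at 3 decimal places, here is a small sketch using plain numbers. It only shows that flooring reproduces the .645 seen above while rounding would give .646; whether the formatting code actually calls floor internally is an assumption I have not verified.

```r
# Seconds value that should display as 59.645999
x <- 59.645999

floor(x * 1e3) / 1e3   # truncation to 3 digits: 59.645
round(x, 3)            # rounding to 3 digits:   59.646
```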

Michele