I am building a time course dataframe. The time information is in seconds and created using seq(start, end, 0.001) and the start and end time points are based on values from a csv file.

Start and end time points are sometimes of the form "578.1799999999999", but sometimes the equivalent value is "578.180" or "578.18000000000001". This creates problems down the line when I try to merge several time courses, since for R these can all be different time points. Here is a comparison of two time courses in which each row should represent the same time point; R nevertheless treats some of them as different (df$test = df$t1 == df$t2):

[Image: table comparing df$t1 and df$t2 row by row, with df$test FALSE for some rows]

To solve this problem, I tried to round all start and end time points to 3 decimal places, corresponding to milliseconds. However, this does not work:

> round(578.1799999999999,3)
[1] 578.1799999999999

Does anyone have an idea how I can "force" R to round to 3 digits and forget the rest?

Max
  • Does this answer your question? [Controlling number of decimal digits in print output in R](https://stackoverflow.com/questions/2287616/controlling-number-of-decimal-digits-in-print-output-in-r) – Julian Aug 30 '22 at 12:41
  • Unfortunately not, because it is not a problem of printing or displaying of the digits - when comparing both columns, R really sees t1 and t2 as different. – Max Aug 30 '22 at 12:44
  • `trunc(578.1799999999999*1e3)*1e-3` maybe. – jay.sf Aug 30 '22 at 12:47
  • Maybe [Is there a datatype "Decimal" in R?](https://stackoverflow.com/questions/16960167/is-there-a-datatype-decimal-in-r). – user2974951 Aug 30 '22 at 12:48
  • Definitely read this post: https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal – jay.sf Aug 30 '22 at 12:49
  • `test <- round(578.1799999999999,3)` is `[1] 578.18` and is also saved as such on my end? And `identical(round(578.18000000000001, 3), round(578.1799999999999, 3))` is `[1] TRUE` (as is `identical(5.174, round(5.17399999999999, 3))` as your example). We need a reproducible example to help you out, please see: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – harre Aug 30 '22 at 12:54

1 Answer

I have used the following approach a few times in the past, though perhaps not on this site (otherwise we could flag this as a duplicate question). The approach is directly suggested by the question: what is the precision limit? We turn that around into: what is the smallest power of ten at which R still sees a difference?

Code

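now <- Sys.time()
# add ever smaller increments (10^-1 ... 10^-7 seconds) to the current time
for (p in seq(1, 7)) {
    then <- now + 10^(-p)
    # if the increment is below the double resolution, 'then' equals 'now'
    if (then == now)
        cat("** Indifferent at 10^-p for p=", p, "\n", sep="")
    else
        cat("Different at 10^-p for p=", p, "\n", sep="")
}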

Output (under Linux)

$ Rscript answer.R 
Different at 10^-p for p=1
Different at 10^-p for p=2
Different at 10^-p for p=3
Different at 10^-p for p=4
Different at 10^-p for p=5
Different at 10^-p for p=6
** Indifferent at 10^-p for p=7
$ 

I think I convinced myself earlier with a similar analysis that the resolution of POSIXct (stored as a double) is actually somewhere around a microsecond; the loop above puts the limit between 10^-6 and 10^-7 seconds. On Windows, as I recall, you only get millisecond time from the system, so you cannot run it the same way.
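
As a rough sketch of where that limit comes from: a double has a 53-bit significand, so the gap between neighbouring doubles depends on the magnitude of the value. The helper `ulp()` and the example timestamp `t0` below are illustrative assumptions, not part of base R or of the code above.

```r
# Gap between adjacent doubles at the magnitude of x.
# ulp() is a hypothetical helper, not a base R function.
ulp <- function(x) 2^(floor(log2(abs(x))) - 52)

# Seconds since the epoch, roughly 2022-08-30 (assumed example value)
t0 <- 1661861193

# For values between 2^30 and 2^31 the gap is 2^-22 seconds
spacing <- ulp(t0)
spacing

# An increment of several gaps changes the stored value ...
(t0 + 1e-6) != t0
# ... while one below half a gap is rounded away again
(t0 + 1e-7) == t0
```

This is consistent with the loop output: a step of 10^-6 seconds still registers, a step of 10^-7 does not.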

Edit: And as you changed your question into "how can I round, please": That is a different issue, related to R FAQ 7.31 discussed here a thousand times in different guises. Your best bet is to (explicitly) truncate or round as shown in the comments, and to format explicitly setting the desired precision.

> now <- Sys.time()
> format(now)   # six digits is my local default
[1] "2022-08-30 08:06:33.788717"
> options(digits.secs=3)
> format(now)   # set to three for this answer
[1] "2022-08-30 08:06:33.788"
> 
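
For the merging problem in the question, one possible sketch (an assumption on my part, not something the question or comments confirm as the cause) is to avoid fractional seconds as merge keys altogether and store whole milliseconds as integers. The names `start_ms`, `end_ms` and `t_ms`, and the end value 579.0, are made up for illustration.

```r
# Two spellings of the same nominal start time, as in the question.
# Whether the raw literals compare equal is deliberately not assumed here.
a <- 578.1799999999999
b <- 578.18000000000001

# Scale to milliseconds, round to the nearest whole number, store as integer
start_ms <- as.integer(round(a * 1000))
same_key <- start_ms == as.integer(round(b * 1000))

# Build the time course on the integer grid, one step per millisecond
end_ms <- as.integer(round(579.0 * 1000))
t_ms <- start_ms:end_ms

# Convert back to seconds only for display or plotting, not for merging
t_sec <- t_ms / 1000
```

Integer keys compare exactly, so two time courses built this way can be merged on `t_ms` without any tolerance handling.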

Note that display precision is not the same as compute precision, for which the double resolution (and its limits, see R FAQ 7.31 linked above) still applies.
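
A minimal illustration of that distinction, using the classic R FAQ 7.31 example rather than the values from the question:

```r
x <- 0.1 + 0.2

# With the default number of significant digits, x is displayed as 0.3 ...
shown <- format(x)

# ... but the stored double is not the same as the literal 0.3
exact_equal <- x == 0.3

# sprintf with 17 significant digits reveals what the default display hides
full <- sprintf("%.17g", x)

# A tolerant comparison is the usual way to test doubles for near-equality
near_equal <- isTRUE(all.equal(x, 0.3))
```

Here `exact_equal` is FALSE while `near_equal` is TRUE, even though `shown` is "0.3".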

Dirk Eddelbuettel