I asked a related question previously (see: Sequence of Timestamps with Milliseconds); however, when my start time is 00:00:00, my code does not work.

I would like to generate a sequence of timestamps at 10 Hz (one every 0.1 s) from one time to another. However, this code gives me:

1   2018-06-01 00:00:00.000
2   2018-06-01 00:00:00.101
3   2018-06-01 00:00:00.202
4   2018-06-01 00:00:00.303
5   2018-06-01 00:00:00.404

When I need:

1   2018-06-01 00:00:00.000
2   2018-06-01 00:00:00.100
3   2018-06-01 00:00:00.200
4   2018-06-01 00:00:00.300
5   2018-06-01 00:00:00.400

Code:

options(digits.secs=3)
Time1 ="2018-06-01 00:00:00"
Time2 ="2018-06-01 00:00:10"
Time1 =as.POSIXct(Time1, format="%Y-%m-%d %H:%M:%OS", tz='UTC')
Time2 =as.POSIXct(Time2, format="%Y-%m-%d %H:%M:%OS", tz='UTC')
library(stringr)
dif_T2_T1 <- difftime(Time1, Time2, units = 'secs')
pattern <- '(\\d)+'
n <- as.numeric(str_extract(dif_T2_T1, pattern = pattern)) * 10
df_blank  <- data.frame(Timestamp = as.character(seq.POSIXt(Time1, Time2, units = 'seconds', length.out = n)))
megmac
  • I think you want `length.out=n+1`. – Edward Jun 14 '20 at 00:22
  • how is the timestamp string matching going, from database, experimental data? – Chris Jun 16 '20 at 14:59
  • I got it to work, I essentially have wind measurements that occasionally have missing times which need to be filled with NA's. The most reliable way I have found is to make a list of times from the first to the last then merge the data sets. – megmac Jun 17 '20 at 14:11

1 Answer

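R truncates (rounds down) fractional seconds when it prints a time, and values such as 0.1 s cannot be stored exactly in floating point (see this post), so you need to add a tiny fractional amount to the vector. The length.out should also be n+1, not n, because n intervals of 0.1 s have n+1 endpoints.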

df_blank  <- data.frame(Timestamp = seq.POSIXt(Time1, Time2, length.out=n+1) + 0.0001)
head(df_blank)
#                Timestamp
#1 2018-06-01 00:00:00.000
#2 2018-06-01 00:00:00.100
#3 2018-06-01 00:00:00.200
#4 2018-06-01 00:00:00.300
#5 2018-06-01 00:00:00.400
#6 2018-06-01 00:00:00.500

Without the addition of the tiny amount, you can see the problem.

df_blank  <- data.frame(Timestamp = seq.POSIXt(Time1, Time2, length.out=n+1))

head(format(df_blank, "%Y-%m-%d %H:%M:%OS6"))
#                   Timestamp
#1 2018-06-01 00:00:00.000000
#2 2018-06-01 00:00:00.099999
#3 2018-06-01 00:00:00.200000
#4 2018-06-01 00:00:00.299999
#5 2018-06-01 00:00:00.400000
#6 2018-06-01 00:00:00.500000

And without the formatting, you see what appears to be a very strange sequence.

head(df_blank)
#              Timestamp
#1 2018-06-01 00:00:00.0
#2 2018-06-01 00:00:00.0
#3 2018-06-01 00:00:00.2
#4 2018-06-01 00:00:00.2
#5 2018-06-01 00:00:00.4
#6 2018-06-01 00:00:00.5
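
The `n` versus `n+1` point can be checked by looking at the spacing that `seq` actually produces. The following is only a sketch using the endpoints from the question; computing `n` straight from the numeric difference instead of via a regular expression is a substitution made here for brevity, and the names `step_n` and `step_n_plus` are introduced purely for illustration.

```r
# Same endpoints as in the question.
Time1 <- as.POSIXct("2018-06-01 00:00:00", tz = "UTC")
Time2 <- as.POSIXct("2018-06-01 00:00:10", tz = "UTC")

# Number of 0.1 s intervals, taken directly from the numeric difference
# (no string matching needed).
n <- as.numeric(difftime(Time2, Time1, units = "secs")) * 10

# Spacing between the first two points for each choice of length.out.
step_n      <- diff(as.numeric(seq(Time1, Time2, length.out = n)))[1]
step_n_plus <- diff(as.numeric(seq(Time1, Time2, length.out = n + 1)))[1]

round(step_n, 4)       # about 0.101, i.e. 10 / 99
round(step_n_plus, 4)  # 0.1, i.e. 10 / 100
```

With `length.out = n` the 10 seconds are split into 99 intervals, which is where the .101, .202, ... values in the question come from.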
Edward
  • This seemed to work but if you go out to 6 decimal places the times are still not right. – megmac Jun 14 '20 at 02:52
  • That's just a formatting issue. This is a long-standing floating-point problem that will _bug_ us until the end of time. See https://stackoverflow.com/questions/8889554/milliseconds-puzzle-when-calling-strptime-in-r and https://stackoverflow.com/questions/9787509/r-xts-001-millisecond-in-index and https://stackoverflow.com/questions/15383057/accurately-converting-from-character-posixct-character-with-sub-millisecond-da – Edward Jun 14 '20 at 02:56
  • okay still vaguely confused on how to fix this but I think this helped put me in the right direction. The issue with it not being a flat millisecond is that I have to match to other timestamps. – megmac Jun 14 '20 at 03:26
  • In that case converting them to character and truncating may do the trick. I didn't quite understand why you included `as.character` in your code, but your last comment has enlightened me. Perhaps you should have mentioned that vital bit of information in your question. – Edward Jun 14 '20 at 03:38
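
For the matching problem raised in these comments, one possible alternative to truncated character strings is to compare whole-millisecond counts, which sidesteps the sub-millisecond noise. The sketch below assumes every timestamp is meant to fall on a whole millisecond; `to_ms`, `grid`, `obs`, `wind` and `filled` are hypothetical names, and the observations are made up for illustration.

```r
# Hypothetical helper: whole milliseconds since the epoch, rounded to nearest,
# so that 00:00:00.099999 and 00:00:00.100000 map to the same key.
to_ms <- function(x) round(as.numeric(x) * 1000)

Time1 <- as.POSIXct("2018-06-01 00:00:00", tz = "UTC")
Time2 <- as.POSIXct("2018-06-01 00:00:10", tz = "UTC")

# Full 10 Hz grid (101 points over 10 s).
grid <- data.frame(Timestamp = seq(Time1, Time2, length.out = 101))
grid$key <- to_ms(grid$Timestamp)

# Made-up observations with a gap at .200, for illustration only.
obs <- data.frame(
  key  = to_ms(as.POSIXct(c("2018-06-01 00:00:00.0",
                            "2018-06-01 00:00:00.1",
                            "2018-06-01 00:00:00.3"), tz = "UTC")),
  wind = c(3.2, 3.5, 2.9)
)

# Left join: grid rows with no matching observation get NA in `wind`.
filled <- merge(grid, obs, by = "key", all.x = TRUE)
head(filled, 4)
```

This follows the approach described in the comments under the question: build the complete list of times first, then merge the measured data onto it so that missing times show up as NA.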