On Windows 7 and Mac OS 10.12.2 (with R 3.3.2), it appears that file.mtime() severely rounds or truncates timestamps. I verified that file.create("my_file.txt"); print(as.numeric(file.mtime("my_file.txt")), digits = 22) prints out several digits past the decimal on Linux, but everything past the decimal disappears on Windows 7 for the same my_file.txt. The behavior for Mac OS 10.12.2 is similar to that of Windows 7. Is there a platform-independent way to get precise file timestamps in R?

landau

2 Answers

You can wait about 2 weeks, at which point R 3.3.3 will solve this problem (at least for Windows). From the NEWS file:

(Windows only.) file.info() now returns file timestamps including fractions of seconds; it has done so on other platforms since R 2.14.0. (NB: some filesystems do not record modification and access timestamps to sub-second resolution.)
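
To check what a given platform and filesystem actually report, something like the following minimal sketch may help. It assumes only base R; the scratch file and the variable names are made up for illustration, and a fractional part of zero can also occur by chance or because the filesystem stores whole seconds only.

```r
# Create a scratch file and look at its modification time.
f <- tempfile(fileext = ".txt")
file.create(f)

mt <- file.info(f)$mtime            # POSIXct timestamp
frac <- as.numeric(mt) %% 1         # fractional part of the epoch seconds

# Print the timestamp with up to six decimal places on the seconds.
print(format(mt, "%Y-%m-%d %H:%M:%OS6"))

# TRUE suggests sub-second resolution is being reported;
# FALSE is inconclusive (coarse filesystem, old R, or coincidence).
has_subsecond <- frac > 0
print(has_subsecond)

unlink(f)
```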

Hong Ooi
  • Neat! That's great for Windows. I'm probably out of luck for [Mac OS](http://stackoverflow.com/questions/18403588/how-to-return-millisecond-information-for-file-access-on-mac-os-x-in-java), though. – landau Feb 11 '17 at 18:44

I think the new file.info is likely the best way to go. If R-3.3.3 does not bring what you need (or in the interim, if it will), you can try to side-step it by capitalizing on the fact that stat is likely installed in the base OS (I have not tested on a Mac):

as.POSIXct(system2("stat", args = c("-c", "%y", "my_file.txt"), stdout = TRUE))
# [1] "2017-02-15 11:24:13 PST"

This can be formalized in a function that does a skosh more for you:

# Return file timestamps (as POSIXct) by shelling out to GNU 'stat'.
my_mtime <- function(filenames, stat = c("modified", "birth", "access", "status"),
                     exe = Sys.which("stat")) {
  if (! nzchar(exe)) stop("'stat' not found")
  # map the requested timestamp to the matching GNU stat format sequence
  stat <- switch(match.arg(stat), birth = "%w", access = "%x", modified = "%y", status = "%z")
  filenames <- Sys.glob(filenames) # expand wildcards, remove missing files
  if (length(filenames)) {
    # one line of output per file, named by the file it describes
    outs <- setNames(system2(exe, args = c("-c", stat, shQuote(filenames)), stdout = TRUE),
                     nm = filenames)
    as.POSIXct(outs)
  }
}

my_mtime("[bh]*")
#                  b-file.R                  h-file.R 
# "2017-02-14 05:46:34 PST" "2017-02-14 05:46:34 PST"

Since you asked for file.mtime, I'm assuming "modified" is the most interesting to you, but it's easy enough to include some other file timestamps:

my_mtime("[bh]*", stat="birth")
#                  b-file.R                  h-file.R 
# "2017-02-13 22:04:01 PST" "2017-02-13 22:04:01 PST" 
my_mtime("[bh]*", stat="status")
#                  b-file.R                  h-file.R 
# "2017-02-14 05:46:34 PST" "2017-02-14 05:46:34 PST" 

Note that the lack of fractional seconds is an artifact of printing (as you stated); this can be remedied:

x <- my_mtime("[bh]*", stat="status")
x
#                  b-file.R                  h-file.R 
# "2017-02-14 05:46:34 PST" "2017-02-14 05:46:34 PST" 
options(digits.secs = 6)
x
#                         b-file.R                         h-file.R 
# "2017-02-14 05:46:34.307046 PST" "2017-02-14 05:46:34.313038 PST" 
class(x)
# [1] "POSIXct" "POSIXt" 
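
If changing a global option is undesirable, the fractional seconds can also be requested per call through the `%OSn` format specifier. This is a small sketch with a made-up timestamp rather than output from the files above; `%OSn` truncates to n decimal places rather than rounding.

```r
# A hypothetical timestamp with microseconds, parsed in UTC for reproducibility.
x <- as.POSIXct("2017-02-14 05:46:34.307046", tz = "UTC")

# Default formatting follows getOption("digits.secs"); whole seconds if unset.
print(format(x))

# Ask for millisecond precision in this call only, leaving options() untouched.
print(format(x, "%Y-%m-%d %H:%M:%OS3"))

# Just the seconds field, with three decimal places.
secs <- format(x, "%OS3")
print(secs)
```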

Update: after testing on a Mac, I confirmed a couple of things (thanks to @HongOoi for the prod): (1) stat is indeed different, not supporting the same command-line options, so this script would need to be updated; and (2) this answer suggests that the filesystem is not even storing sub-second resolution on the file times. If your filesystem type is HFS+, I think there may be nothing to be done here. If the underlying filesystem is different, you may have better results.
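
As a rough sketch of what "updated" could look like, the arguments can be chosen by platform. The helper name `stat_mtime_args` is hypothetical, and the BSD/macOS flags (`-f "%Sm"` with `-t` for the time format) are an assumption taken from the BSD stat man page, not something tested here. On macOS this yields whole seconds only, which is consistent with the filesystem limitation described above.

```r
# Sketch only: build 'stat' arguments for the modification time by platform.
# The Darwin branch is an untested assumption based on the BSD stat man page.
stat_mtime_args <- function(filenames, sysname = Sys.info()[["sysname"]]) {
  if (identical(sysname, "Darwin")) {
    # BSD stat: -f gives the output format, -t the strftime pattern used by %Sm
    c("-f", "%Sm", "-t", shQuote("%Y-%m-%d %H:%M:%S"), shQuote(filenames))
  } else {
    # GNU stat: %y is the human-readable modification time (with fractions)
    c("-c", "%y", shQuote(filenames))
  }
}

# Show the arguments that would be passed to system2() on each platform.
print(stat_mtime_args("my_file.txt", sysname = "Linux"))
print(stat_mtime_args("my_file.txt", sysname = "Darwin"))
```

The result would be passed along the lines of `as.POSIXct(system2("stat", args = stat_mtime_args("my_file.txt"), stdout = TRUE))`.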

It's true that Windows does not come with a stat executable. However, Git for Windows (which some argue is a necessity in an analyst/dev toolkit) does, under /Program Files/Git/usr/bin/stat.exe. (In fact, my hack above was written on Windows, tested second on Ubuntu.)

Bottom line: unfortunately, you may not get what you want/need on macOS, depending on your filesystem type. I could not get the installed stat to give sub-second resolution (even with its different arguments), which suggests that the situation described in the 4-year-old answer I referenced has not changed.

r2evans
  • stat is a Unix/Linux utility, not present on Windows. On Mac, the filesystem itself only stores times with 1-second resolution (until APFS arrives). This basically doesn't do anything more than R's `file.info` does already. – Hong Ooi Feb 17 '17 at 15:16
  • Note that Visual Studio comes with git integration built in, so you don't even need a standalone git install necessarily. – Hong Ooi Feb 17 '17 at 16:28
  • I didn't know that, thanks. Does VS come with other unix-y executables like `stat`? (Didn't think to check VS since it wasn't in the question, perhaps the OP can use this.) – r2evans Feb 17 '17 at 16:31
  • No, but on Windows 10, you can use stat itself [inside a bash shell.](https://msdn.microsoft.com/en-au/commandline/wsl/about) – Hong Ooi Feb 17 '17 at 16:32
  • Sure it is, and you can even "see" the executable at `%HOME%\AppData\Local\lxss\rootfs\usr\bin\stat`. To use that, you would then need to prepend `/mnt/c` to any full path you want stat-ed (relative paths would need to be converted). So you are right, this can work on Windows without GfW installed, good work. (I have been avoiding the bash-subsystem due to the difficulty of integrating it with non-bash-subsystem tools.) – r2evans Feb 17 '17 at 16:51
  • As a (late) follow-up: for R to be able to use the WSL-based utilities, you would need to be running R within the WSL itself. As it stands, R-in-windows can execute windows binaries; R-in-wsl can execute the linux binaries; neither system can execute the other binary format. So though @HongOoi's comment about being able to use the WSL shell is correct, it is not without limitation or consequence. – r2evans Mar 13 '17 at 19:41
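
The path conversion mentioned in the comments above (prepending `/mnt/c` to a full Windows path) might be sketched as follows. The helper name `to_wsl_path` is hypothetical, it handles only absolute paths with a drive letter, and it is plain string manipulation: it does not by itself make WSL binaries callable from R on Windows.

```r
# Sketch only: map "C:\dir\file" to "/mnt/c/dir/file".
# Relative paths are returned with slashes flipped but otherwise unchanged.
to_wsl_path <- function(path) {
  path <- gsub("\\", "/", path, fixed = TRUE)            # backslashes to slashes
  sub("^([A-Za-z]):", "/mnt/\\L\\1", path, perl = TRUE)  # "C:" becomes "/mnt/c"
}

converted <- to_wsl_path("C:\\Users\\me\\my_file.txt")
print(converted)
```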