I have a non-Java program writing files to a directory. My Java program then copies these files to a different directory. Is there a way for Java to check whether the last file has been fully written before copying it? At the moment, the last file is often still being written by the other program when I copy it, so the Java copy ends up containing only part of the file.

I have looked at doing something like this (taken from the answer here: https://stackoverflow.com/a/17603619/19297684):

boolean success = potentiallyIncompleteFile.renameTo(stagingAreaFile);

where the program attempts to rename the last file. However, this is not platform agnostic.

Konrad Rudolph
MKRabbit
  • No, there's no way to check if a different program is writing to a file. You can skip processing the file with the latest creation date. – Gilbert Le Blanc Apr 28 '23 at 15:35
  • You can avoid much of the conflict by having the producer of files write to "xyz.abc.tmp" and rename after completion to "xyz.abc", then make your Java app scan only for ".abc" and ignore ".tmp" files. – DuncG Apr 28 '23 at 15:41
  • You *can* check whether a file is still opened by another process. However, as you say yourself there is no platform agnostic way of doing this, and you probably have to drop into JNI to accomplish this (at least on some platforms; on Linux you might get away with procfs, ignoring potential race conditions resulting from this). – Konrad Rudolph Apr 28 '23 at 15:43
  • The easiest way would be to use a tool like [`flock`](https://www.man7.org/linux/man-pages/man2/flock.2.html) to communicate between the two processes, but that requires the writing process to collaborate with you and agree to hold the lock. Do you have any control over that process? – Silvio Mayolo Apr 28 '23 at 15:44
  • @SilvioMayolo Unfortunately, I do not have any control over the producer. – MKRabbit May 02 '23 at 09:20
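
Since the producer cannot be changed, the first comment's suggestion (skip the newest file) is the one that needs no cooperation from it. A minimal sketch of that idea follows, assuming the newest file can be identified by its last-modified time rather than its creation date (creation time is not available on every file system). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SettledFiles {

    /**
     * Returns the regular files in dir, oldest first, leaving out the most
     * recently modified one, which may still be open in the producer.
     */
    static List<Path> allButNewest(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            List<Path> files = entries
                    .filter(Files::isRegularFile)
                    .sorted(Comparator.comparingLong(SettledFiles::lastModifiedMillis))
                    .collect(Collectors.toList());
            // The last element is the newest file; drop it from the result.
            return files.isEmpty() ? files : files.subList(0, files.size() - 1);
        }
    }

    private static long lastModifiedMillis(Path p) {
        // File.lastModified() returns 0 on error instead of throwing,
        // which keeps it usable inside a comparator.
        return p.toFile().lastModified();
    }
}
```

Calling `SettledFiles.allButNewest(sourceDir)` on each polling pass would copy a file only once a newer one has appeared. The obvious cost is that the final file of a batch is held back until the producer writes something else.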

2 Answers

There is no reliable, OS-independent way to solve this from Java alone.

Other frameworks that need to solve this problem (e.g. Apache Camel) use various strategies (some of them already mentioned in the comments to your question):

  • rename the file after it was completely written
  • create the file in a different folder and move it to the folder that Java is watching
  • create an additional trigger file once the original file is ready to be processed, and have Java watch for that trigger file
  • process on a time-based schedule, e.g. if your file is always created at 0:00 and 12:00 and its creation takes at most one hour, you can have Java do the processing at 2:00 and 14:00
  • set up other means of process synchronization
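
As an illustration of the trigger-file strategy, here is a minimal sketch of the consumer side. It assumes the producer (or a wrapper script around it) creates an empty marker named `<file>.done` next to each finished file. That suffix, and the class and method names, are hypothetical and not something your producer does today.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class MarkerFiles {

    // Hypothetical convention: "data.csv" is complete once "data.csv.done" exists.
    static final String MARKER_SUFFIX = ".done";

    /** Returns the data files in dir whose marker file is present, sorted by path. */
    static List<Path> readyFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    // The markers themselves are not data files.
                    .filter(p -> !p.getFileName().toString().endsWith(MARKER_SUFFIX))
                    // Keep a file only if its sibling marker exists.
                    .filter(p -> Files.exists(p.resolveSibling(p.getFileName() + MARKER_SUFFIX)))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

The rename strategy looks almost the same on the Java side: instead of checking for a marker, the filter would simply ignore names that end in the temporary suffix.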

There is no general solution that fits all individual problems.

cyberbrain

After looking around for a bit, I believe I found a workaround for my situation. I tested the copyFile method from org.apache.commons.io.FileUtils. This method throws an IOException if the length of the copied file differs from the length of the original file. Source: https://commons.apache.org/proper/commons-io/javadocs/api-2.5/org/apache/commons/io/FileUtils.html#copyFile(java.io.File,%20java.io.File)

So something like this:

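try {
  // copyFile takes java.io.File arguments, not path strings
  org.apache.commons.io.FileUtils.copyFile(new java.io.File("srcFile.txt"), new java.io.File("destFile.txt"));
} catch (IOException e) {
  LOGGER.error("IOException: ", e);

  // the source was probably still being written, so try again
  retryThisMethod();
}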

If an IOException is thrown, I initiate a retry for the copy.
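
For reference, the retry could be bounded along these lines. This is only a sketch: it swaps Commons IO for plain `java.nio.file.Files.copy` and does the length comparison by hand, and the class name, method name and parameters are made up for the example. It also shares the weakness of any size-based check: if the producer pauses between writes, the sizes can match even though the file is not finished.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RetryingCopy {

    /**
     * Copies src to dest and retries while the source size changes during the
     * copy. Returns true once a copy matches the source size, or false after
     * maxAttempts unsuccessful attempts.
     */
    static boolean copyWhenStable(Path src, Path dest, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long sizeBefore = Files.size(src);
            Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
            // A source that grew during the copy, or a shorter copy, counts as partial.
            if (Files.size(src) == sizeBefore && Files.size(dest) == sizeBefore) {
                return true;
            }
            // Give the producer some time before the next attempt.
            Thread.sleep(pauseMillis);
        }
        return false;
    }
}
```

A call such as `RetryingCopy.copyWhenStable(src, dest, 5, 2000)` gives up after five attempts instead of retrying forever. When it returns false, the destination may hold a partial copy and should be deleted or overwritten later.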

MKRabbit