Suppose there are 2 processes P1 and P2, and they access a shared file Foo.txt.

Suppose P2 is reading from Foo.txt. I don't want P1 to write to Foo.txt while P2 is reading it.

So I thought I could make P1 write to Foo.tmp and, as a last step, rename Foo.tmp to Foo.txt. My programming language is Java.

So my question is, would this ensure that P2 reads the correct data from Foo.txt? Would the rename operation be committed once P2 completes reading the file?

EDIT

I tried to recreate this scenario as follows:

My P1 code is something like this:

File tempFile = new File(path1);
File realFile = new File(path2);
BufferedWriter writer = new BufferedWriter(new FileWriter(tempFile));
for(int i=0;i<10000;i++)
    writer.write("Hello World\n");
writer.flush();
writer.close();
tempFile.renameTo(realFile);

and my P2 code is :

BufferedReader br = new BufferedReader(new FileReader(file)); 
String line = null;
while(true) {
  while((line=br.readLine())!=null){
      System.out.println(line);
      Thread.sleep(1000);
  }
  br.close();
}

My Sample shared File:

Test Input
Test Input
Test Input   

I'm starting P1 and P2 almost simultaneously (P2 starting first).

So according to my understanding, even though P1 has written a new Foo.txt, since P2 is already reading it, it should read the old Foo.txt content until it re-opens a BufferedReader to Foo.txt.

But what actually happens is P2 reads Test Input thrice, as is expected from the input, but after that it reads the new content which was written by P1.

Output from P2:

Test Input
Test Input
Test Input 
Hello World
Hello World
Hello World
 .
 .
 .

So it doesn't work as it should. Am I testing this scenario wrong? I feel like there's something I'm missing out.

Chaos
  • Have you considered using a database for something like this? You'd get a history, and your second process can just `select text from foo where insertstamp = max(insertstamp)` – corsiKa Sep 09 '13 at 20:36
  • @corsiKa, using a database isn't possible. The shared file in question interacts with a lot of modules which would have to be changed. – Chaos Sep 09 '13 at 20:47
  • This is indeed a common idiom in Unix (atomic-rename). Though in C, there is usually a `fsync()` before the `rename()`, to provide ordering guarantees even when there are OS/hardware crashes. – ninjalj Sep 09 '13 at 22:25
  • Your P2 code looks like it closes the file on each pass of the outer loop, so the next iteration would have to reopen the file (which will get the new file if it has since been changed). With no explicit reopen, I would expect it to fail, but I don't know Java, so I guess it is implicitly reopening the file? – Chris Dodd Sep 10 '13 at 02:10
  • I don't think it is re-opening the file at every iteration. The reason I say this is that I'm running P1 after 1 iteration of P2 (when it has printed `Test Input` once). So if it were reopening the file at every iteration, it should print the new data at the 2nd iteration. But it prints `Test Input` thrice, which means it isn't closing the file. – Chaos Sep 10 '13 at 03:46

3 Answers

A UNIX rename operation is atomic (see rename(2)). The UNIX mv command uses rename if the source and target path are on the same physical device. If the target path is on a different device, the rename will fail, and mv will copy the file (which is not atomic).

If the target file path exists, the rename will atomically remove it from the file system and replace it with the new file. The file won't actually be deleted until its reference count drops to zero, so if another process is currently reading the file, it will keep reading the old file. Once all processes have closed the old file, its reference count will drop to zero and the file storage space will be reclaimed.
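In Java, the write-then-rename idiom described above might look roughly like the sketch below. It uses `Files.move` with `StandardCopyOption.ATOMIC_MOVE` rather than `File.renameTo`, so a failure surfaces as an exception instead of an ignored `false` return value. The class name `AtomicReplace`, the helper `replaceAtomically`, and the `.tmp` sibling naming are assumptions made for illustration. Note that the `Files.move` documentation leaves it implementation-specific whether an existing target is replaced under `ATOMIC_MOVE`; on typical UNIX file systems it maps to `rename(2)` and does replace it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

public class AtomicReplace {

    // Hypothetical helper: writes the new content to a sibling temp file,
    // then renames it over the target in a single file system operation.
    static void replaceAtomically(Path target, List<String> lines) throws IOException {
        // Keep the temp file in the same directory so that both paths are on
        // the same file system; otherwise an atomic move is not possible.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            Files.write(tmp, lines, StandardCharsets.UTF_8);
            // Throws AtomicMoveNotSupportedException if the move cannot be atomic.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if something failed.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempDirectory("atomic-demo").resolve("Foo.txt");
        Files.write(target, Arrays.asList("Test Input"), StandardCharsets.UTF_8);
        replaceAtomically(target, Arrays.asList("Hello World"));
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

A reader that opens Foo.txt at any moment sees either the complete old file or the complete new one, never a half-written mix.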

Chris Dodd
  • Are disks in RAID setup also seen as the same physical device? – koenmetsu May 12 '16 at 05:41
  • This is a great answer. FWIW, here are 2 relevant pages for mv and rename that outline why this is true. https://man7.org/linux/man-pages/man2/rename.2.html https://man7.org/linux/man-pages/man1/mv.1p.html – The Dude Aug 29 '22 at 19:52

Why not use FileChannel.lock?

Here is an example:

http://examples.javacodegeeks.com/core-java/nio/filelock/create-shared-file-lock-on-file/
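In case the link goes stale, here is a minimal sketch of the idea, assuming both processes cooperate: the writer takes an exclusive lock and the reader takes a shared lock on the same file. The class name `LockedAccess` and its two helpers are made up for illustration. Keep in mind that, per the `FileLock` documentation, whether a lock actually blocks a non-cooperating program is system-dependent, and locks are held on behalf of the whole JVM, so this coordinates separate processes rather than threads.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LockedAccess {

    // Writer side (P1): blocks until it holds an exclusive lock, then rewrites the file.
    static void writeLocked(Path file, String text) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = ch.lock()) {            // exclusive lock on the whole file
            ch.truncate(0);                          // truncate only once the lock is held
            ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }                                            // lock is released, then the channel is closed
    }

    // Reader side (P2): a shared lock allows other readers but excludes a locking writer.
    static String readLocked(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
             FileLock lock = ch.lock(0, Long.MAX_VALUE, true)) {   // true = shared
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or end of file is reached
            }
            return new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("lock-demo").resolve("Foo.txt");
        writeLocked(file, "Hello World\n");
        System.out.print(readLocked(file));
    }
}
```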

Kent
  • Yes, I ended up using this after I posted my question, but I'm still curious about how this would work – Chaos Sep 09 '13 at 20:34

  1. Move (rename) is atomic if done on the same device (device = same disk/partition).
  2. If Foo.txt exists, moving Foo.tmp to Foo.txt may fail on some platforms, because Java's File.renameTo does not guarantee that an existing destination is replaced. (If you first delete Foo.txt and then move, it should work.) A file is not physically deleted until all file handles to it are closed (that is, until no process is using it). Also, after renaming Foo.tmp to Foo.txt you will effectively have 2 Foo.txt files: one that is deleted but still open (it no longer has a directory entry on disk) and one that actually resides on disk.
  3. But after the move, the second process needs to reopen the file.
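To illustrate points 2 and 3, here is a rough single-process sketch, assuming a POSIX-style file system where an open handle keeps referring to the old file after the name has been replaced. The class name `ReopenAfterRename` and the method `demo` are invented for this example. The files are deliberately much larger than `BufferedReader`'s internal buffer, so the old lines cannot simply be coming from data that was buffered before the rename.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Collections;

public class ReopenAfterRename {

    // Returns { last line read through the handle opened before the rename,
    //           first line read after reopening Foo.txt }.
    static String[] demo(Path dir) throws IOException {
        Path real = dir.resolve("Foo.txt");
        Path temp = dir.resolve("Foo.tmp");
        Files.write(real, Collections.nCopies(10000, "Test Input"));
        Files.write(temp, Collections.nCopies(10000, "Hello World"));

        String lastFromOldHandle;
        try (BufferedReader old = Files.newBufferedReader(real)) {
            lastFromOldHandle = old.readLine();      // start reading the old file
            // Replace Foo.txt while the old handle is still open.
            Files.move(temp, real, StandardCopyOption.ATOMIC_MOVE);
            String line;
            while ((line = old.readLine()) != null) {
                lastFromOldHandle = line;            // still served from the old file
            }
        }

        String firstAfterReopen;
        try (BufferedReader fresh = Files.newBufferedReader(real)) {
            firstAfterReopen = fresh.readLine();     // a new handle sees the new file
        }
        return new String[] { lastFromOldHandle, firstAfterReopen };
    }

    public static void main(String[] args) throws IOException {
        String[] result = demo(Files.createTempDirectory("reopen-demo"));
        System.out.println("old handle, last line: " + result[0]);
        System.out.println("after reopen, first line: " + result[1]);
    }
}
```

If the reader in your test starts printing the new content without reopening, it is worth checking whether the writer really went through a temp file and a rename, or wrote to Foo.txt in place.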

Let me know if we are on the same page with #1.

Claudiu