3

An official Microsoft recommendation for writing a file that represents a particular state in a consistent way is to write the state into a temporary file and then call ReplaceFile.

But if we consider a slightly higher-level task - making a change to the state represented in the file - things get more problematic.

To make a change to a state within a file, you need to read the state from the file, make the change, and write it back. While the "writing" portion may be considered covered by the ReplaceFile function, the fact that the file could have changed since we read it is not.

In other words, we may need to check that the file is still the same and has not been updated since we read it, before the ReplaceFile call. If we are talking about text editors here, a modification time check before the call should be enough. But if we want something more robust, we should acknowledge the possibility of the file changing after the modification time check but before the call.
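
The modification-time check can be sketched as follows. This is a minimal POSIX sketch using stat() (on Windows the equivalent would be GetFileTime on an open handle); the helper names are illustrative, not from any library, and as the paragraph above notes, the file can still change between this check and the subsequent replace:

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Record the file's modification time. Returns 0 on success, -1 on error. */
static int get_mtime(const char *path, time_t *out) {
    struct stat st;
    if (stat(path, &st) != 0) return -1;
    *out = st.st_mtime;
    return 0;
}

/* Returns 1 if the file's mtime still matches `seen`, 0 if it has
 * changed, -1 on error. */
static int mtime_unchanged(const char *path, time_t seen) {
    time_t now;
    if (get_mtime(path, &now) != 0) return -1;
    return now == seen ? 1 : 0;
}
```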

The naive approach would be to implement a CompareAndReplaceFile call that locks the original file, checks that it's the same file, and then replicates what ReplaceFile does. Not only is this a somewhat hacky solution (copy-pasting the logic of a system function is not good practice), but it also implies a longer lock period.

For instance, on Linux the same effect could be achieved by using fcntl(2)'s file leasing (F_SETLEASE) to get a chance to abort your operation once someone else opens the file for writing, prior to rename(2), which is atomic and does not open the file, so you can hold a lease across it.
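
A minimal sketch of that Linux approach (error handling and the SIGIO lease-break handler are elided; the function name is illustrative). The read lease makes the kernel notify us if another process opens the file for writing, and rename() proceeds without breaking it:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Take a read lease on `orig`, then atomically rename `tmp` over it.
 * F_SETLEASE fails if someone else already has the file open for
 * writing; a later conflicting open would raise SIGIO (the default
 * lease-break signal), giving us a chance to abort. */
static int replace_with_lease(const char *orig, const char *tmp) {
    int fd = open(orig, O_RDONLY);
    if (fd < 0) return -1;
    if (fcntl(fd, F_SETLEASE, F_RDLCK) != 0) {
        close(fd);
        return -1;
    }
    /* ... re-check the state here; handle SIGIO to abort on a break ... */
    int rc = rename(tmp, orig);      /* atomic, does not open the file */
    fcntl(fd, F_SETLEASE, F_UNLCK);  /* release the lease */
    close(fd);
    return rc;
}
```

Note that leases only work on regular files the caller owns (or with CAP_LEASE), and some filesystems do not support them.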

Are there ways to implement a transactional file change on Windows, aside from the hacky solution discussed above?

Andrian Nord
    Can you re-phrase your question in a more salient way? Include what you have tried while you are at it. – ryyker Sep 11 '15 at 23:14
  • You can lock only the part of the file you're working on, if that's any help. See [LockFile](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365202(v=vs.85).aspx). – Harry Johnston Sep 12 '15 at 01:05

2 Answers

1

When you open a file using CreateFile, you set the sharing mode. If you don't specify FILE_SHARE_WRITE, no one can open the file for write access until you close the handle (and if the file is already open for write access, your attempt fails with a sharing violation).

Because ReplaceFile performs its operation with GENERIC_READ, DELETE, and SYNCHRONIZE flags and a FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE sharing mode, you can open your write handle using a sharing mode of FILE_SHARE_READ | FILE_SHARE_DELETE and keep it open until after the call to ReplaceFile, thus excluding the race condition.

If you're holding the content in memory (the text editor case), then when saving you would:

  • Reopen the file using GENERIC_WRITE and sharing mode FILE_SHARE_READ | FILE_SHARE_DELETE (if the original handle included FILE_SHARE_WRITE, didn't include GENERIC_WRITE, or has been closed after reading into the working buffer)
  • Perform the modification time check.
  • Write the changes to a new temporary file.
  • Call ReplaceFile
  • Close the handle to the replaced file.

If step one fails with a sharing violation, or step two reveals another change, you'll need to read the changed content, do a three-way merge, and start the process over.
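
The save sequence above can be sketched like this (Windows-only; a sketch, not a definitive implementation - the temp-file-writing and merge steps are elided, error handling is minimal, and the function name is illustrative):

```c
#include <windows.h>

/* Sketch of the save sequence: reopen without FILE_SHARE_WRITE, check
 * the modification time, then swap in the already-written temp file.
 * Returns 0 on success, -1 on a sharing violation or stale mtime. */
static int save_with_replace(const wchar_t *path, const wchar_t *tmp,
                             FILETIME expected_mtime) {
    /* Step 1: no FILE_SHARE_WRITE, so no other writer can slip in. */
    HANDLE h = CreateFileW(path, GENERIC_WRITE,
                           FILE_SHARE_READ | FILE_SHARE_DELETE,
                           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;                      /* likely ERROR_SHARING_VIOLATION */

    /* Step 2: modification time check. */
    FILETIME mtime;
    if (!GetFileTime(h, NULL, NULL, &mtime) ||
        CompareFileTime(&mtime, &expected_mtime) != 0) {
        CloseHandle(h);
        return -1;                      /* changed since read: merge and retry */
    }

    /* Step 3 happens elsewhere: write the new content into `tmp`. */

    /* Step 4: atomically swap the temporary file into place. */
    BOOL ok = ReplaceFileW(path, tmp, NULL, 0, NULL, NULL);

    /* Step 5: close the handle to the replaced file. */
    CloseHandle(h);
    return ok ? 0 : -1;
}
```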

Ben Voigt
  • And what happens if someone else tries to do the same trick (opens the file with `FILE_SHARE_DELETE | FILE_SHARE_READ`, checks mtime, calls `ReplaceFile`) at the same time? In my understanding, `ReplaceFile` = `DeleteFile` + `CopyFile` (ignoring the backup part). `DeleteFile` won't actually remove a file until all descriptors are closed. Until then you won't be able to open the file unless you also specify `FILE_SHARE_DELETE`. So why wouldn't it just fail if you have an open descriptor (can't copy new to original)? And if it doesn't, what prevents a concurrent `ReplaceFile` call from failing too? – Andrian Nord Oct 18 '15 at 20:46
  • @AndrianNord: That's why step #1 is opening a handle with `GENERIC_WRITE` and not `FILE_SHARE_WRITE` and hold it throughout... only one of the competitors can get that access at once. – Ben Voigt Oct 18 '15 at 21:10
  • @AndrianNord: Also, `ReplaceFile` = 2x`MoveFile` – Ben Voigt Oct 18 '15 at 21:11
  • 1
    Well, ok, but what if someone else simply calls `ReplaceFile` without opening the file for writing first? As `ReplaceFile` opens the file with the `FILE_SHARE_WRITE` sharing mode, it shouldn't prevent another `ReplaceFile` from slipping in. And what happens in that case? If A moves the original file to a backup, then B should fail, as there is no original file. May we assume that the `MoveFile` in a `ReplaceFile` call is always atomic and that only one of two simultaneous `ReplaceFile` calls may succeed? – Andrian Nord Nov 08 '15 at 10:51
0

Usually a file is first locked (LockFile on Windows or flock for POSIX OSes) before being replaced/updated. You can take a shared (read-only) lock, an exclusive (read/write) lock, or both in sequence (first a shared lock for reading, and only later an exclusive lock when you are ready to replace/overwrite the file).
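
The shared-then-exclusive sequence can be sketched with POSIX flock (the function name is illustrative; LockFileEx on Windows is analogous). Note the caveat, not stated above, that the upgrade is not atomic - another process may grab the exclusive lock in between, so the state should be re-checked after upgrading:

```c
#include <stdio.h>
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

/* Take a shared lock to read the state, then upgrade to an exclusive
 * lock before rewriting. Returns 0 on success, -1 on error. */
static int lock_shared_then_exclusive(const char *path) {
    int fd = open(path, O_RDWR);
    if (fd < 0) return -1;
    if (flock(fd, LOCK_SH) != 0) { close(fd); return -1; }
    /* ... read the current state through fd ... */
    if (flock(fd, LOCK_EX) != 0) { close(fd); return -1; }
    /* ... re-check the state, then replace/overwrite the file ... */
    flock(fd, LOCK_UN);
    close(fd);
    return 0;
}
```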

You can even check the modification timestamp of the file and override/ignore the lock if the owner process is not releasing the lock for whatever reason.

Radu
  • The `LockFile` approach only works for record-oriented files edited in place. If the change may change the size and move other data around, or you want to write the new file completely and atomically swap to prevent partial changes from being left after a crash, record locking doesn't help. – Ben Voigt Oct 15 '15 at 16:13
  • I don't believe that is true. LockFile/LockFileEx can be used for **any** atomic operation on the file, including full overwrites and file moves. There are multiple approaches: (a) lock the byte range up to UINT_MAX -- this guarantees that no other process can modify the file; (b) always lock the first byte of the file -- treat the first byte as a lock on the whole file that guarantees exclusive access. Replacing the file also works in both cases, as both styles of lock guarantee exclusive access. Good luck! – Radu Oct 16 '15 at 11:35
  • Actually, there is a potentially cleaner third option that allows atomic changes to multiple files, which might come in handy: create a new file that protects all the files/directories you want to change, acquire an exclusive lock on this file, generate a new set of files, replace the original files, and finally release the lock. – Radu Oct 16 '15 at 11:44
  • I disagree that it is cleaner, now you need a whole bunch of extra logic to handle an application crashing while it held the lock and not getting a chance to remove the extra file. – Ben Voigt Oct 16 '15 at 19:34
  • Locking a file only works with cooperative processes - i.e. when the file can only be read or written by processes taking the same locks. In the general case it's not much different from the CreateFile sharing mode (on Windows), and not of much use at all (on Unix). – Andrian Nord Oct 18 '15 at 20:34
  • **My previous reply was to the comment implying that file locking does not work. The last comments are on a different topic -- they relate to the programming model.** I agree that one always needs to implement cooperation between processes, irrespective of the synchronization primitives used. It is the same problem encountered when writing multi-threaded applications that use shared data structures. I personally don't see any way around it. I would be curious to see a synchronization method that is not vulnerable to race conditions or application hangs and does not require cooperation. – Radu Oct 19 '15 at 15:01
  • @Radu, I'd say that the answer below is pretty damn close to one. Obviously, if you have only cooperating processes accessing a resource you may do the same thing better/cleaner/easier, but for the sake of robustness even in that case you should always assume that some non-cooperative process is also present. Like an antivirus, for instance. – Andrian Nord Nov 08 '15 at 10:44
  • I still do not see how *any* solution can guard against a malicious process, a virus or attacker as you say. If a process has write access to the file it can do anything: keep the file open indefinitely (and prevent access), delete the file, corrupt the file, etc. *Cooperation between processes or non-maliciousness if you like is a must in all circumstances.* The only thing one can do is: a) prevent accidental corruption and b) prevent accidental race conditions. The two approaches both achieve this. I'm also curious to understand why you think file locking is useless or less easy/clean... – Radu Nov 10 '15 at 14:27