
I started a thread over here to ask about "concurrent" writes to an XML file, and it was flagged as a duplicate and referenced here, to a thread that suggests creating a lock file in the same folder as the target file as a means of handling the situation. That seems inelegant to me: writing a hidden file to the network, especially when we have the ability to lock a file, just not (it seems) the ability to lock a file and then, you know, actually do something with it. So my thought is to take a different approach:

1. Lock the file with `$file = [IO.File]::Open($path, 'Open', 'ReadWrite', 'None')`. I have verified that I can't lock it twice, so only one instance of my code can hold a lock at any one time.
2. `Copy-Item` the file to the local temp folder.
3. Read that copy and append data as needed.
4. Save back over the temp file.
5. Remove the lock with `$file.Close()`.
6. Immediately `Copy-Item` the temp file back over the original.

The risk seems to be between steps 5 and 6: another instance could set a lock after the first instance removes the lock, but before it overwrites the file with the revised temp file.
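Here is a rough sketch of the sequence I have in mind; `$path` and the XML manipulation are placeholders. Note that with share mode `'None'` nothing else can open the file, not even my own `Copy-Item`, so the local copy has to be made by reading through the very handle that holds the lock:

```powershell
$path = '\\server\share\data.xml'                            # placeholder network path
$temp = Join-Path ([IO.Path]::GetTempPath()) 'data.xml'

$file = [IO.File]::Open($path, 'Open', 'ReadWrite', 'None')  # 1: take the exclusive lock
try {
  # 2: make the local copy by reading through the locked handle,
  #    since share mode 'None' blocks every other open - even Copy-Item.
  $reader = [IO.StreamReader]::new($file)
  Set-Content -LiteralPath $temp -Value $reader.ReadToEnd()
  [xml] $xml = Get-Content -Raw -LiteralPath $temp           # 3: read the copy...
  # ... append data to $xml as needed ...
  $xml.Save($temp)                                           # 4: save over the temp file
} finally {
  $file.Close()                                              # 5: release the lock
}
Copy-Item -LiteralPath $temp -Destination $path              # 6: the race window sits between 5 and 6
```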

Is that risk the reason for the separate lock file approach? Because then the "lock" stays in place until after the revisions are saved?

It all seems like so much nasty kludge for something that I would think .NET or PowerShell should handle. I mean, a StreamReaderWriter that has a -lock parameter and lets you pull the file in, mess with it, and save it seems so basic and fundamental that I can't believe it isn't built in.

Gordon

2 Answers


A practice that is often used – by mutual cooperation of all applications – is a "sentinel file" or "lock file." Sometimes the mere presence of the file is enough; sometimes it becomes "the file that you lock."

All of the applications must understand your convention and must respect it. This will allow you to manipulate the XML file without interference by other cooperating applications.
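For example, here is a minimal sketch of the convention in PowerShell; the path and retry timing are illustrative. The key point is that FileMode `'CreateNew'` fails if the file already exists, so creating the lock file is itself the atomic "claim":

```powershell
$path     = '\\server\share\data.xml'   # illustrative target file
$lockPath = "$path.lock"                # the sentinel / lock file

# 'CreateNew' throws if the file already exists, so whoever creates it wins;
# everyone else sleeps and retries.
while ($true) {
  try {
    $lock = [IO.File]::Open($lockPath, 'CreateNew', 'Write', 'None')
    break                               # this process now owns the claim
  } catch [System.IO.IOException] {
    Start-Sleep -Milliseconds 100
  }
}
try {
  # ... read, modify, and rewrite the XML file here, without interference ...
} finally {
  $lock.Close()
  Remove-Item -LiteralPath $lockPath    # release the claim for the next process
}
```

Again: this only works if every application that touches the XML file follows the same convention.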

Mike Robinson

Mike Robinson's helpful answer sums up best practices well.

As for your question:

> The risk seems to be between steps 5 and 6: another instance could set a lock after the first instance removes the lock, but before it overwrites the file with the revised temp file.
>
> Is that risk the reason for the separate lock file approach? Because then the "lock" stays in place until after the revisions are saved?

Yes, a separate lock file that cooperating processes respect will prevent such race conditions.

However, it is possible to solve this problem without a lock file, albeit at the expense of how long updates take:

  • A lock file allows you to "claim" the file without yet locking it exclusively, so you can prepare the update while other processes can still read the file. You can then limit exclusive locking to the act of rewriting / replacing the file using previously prepared content.

    • In other words: by convention, the lock file guarantees the atomicity of the update operation, but minimizes its duration by limiting exclusive locking to just the act of rewriting / replacing (excluding the time spent on reading and update preparation).
  • Alternatively, you can guarantee atomicity by opening the file with an exclusive lock that you don't release until you've read, modified, and saved back.

    • In other words: The implementation becomes simpler (no lock file), but updates take longer; see the sketch after this list.

      • This answer to your other question demonstrates the technique.
    • Even then, however, you need cooperation from the other processes: that is, both readers and writers need to be prepared to retry in a loop if opening the file for reading fails (temporarily) during an ongoing update operation.
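Here is a minimal sketch of that exclusive-lock approach, including the retry loop; `$path` and the XML modification are placeholders, and the answer linked above shows a fuller implementation:

```powershell
$path = '\\server\share\data.xml'          # placeholder
while ($true) {
  try {
    # Exclusive lock, held for the entire read-modify-write.
    $file = [IO.File]::Open($path, 'Open', 'ReadWrite', 'None')
  } catch [System.IO.IOException] {
    Start-Sleep -Milliseconds 200          # an update is in progress elsewhere; retry
    continue
  }
  try {
    $reader = [IO.StreamReader]::new($file)
    [xml] $xml = $reader.ReadToEnd()       # read via the locked handle
    # ... modify $xml as needed, then rewrite in place:
    $file.SetLength(0)                     # truncate; the stream position resets to 0
    $writer = [IO.StreamWriter]::new($file)
    $xml.Save($writer)
    $writer.Flush()
  } finally {
    $file.Close()                          # the lock is released only after the save
  }
  break
}
```

Note that no other process, not even a reader, can open the file between the Open() and Close() calls. That is exactly what makes the update atomic, and also why updates take longer than with the lock-file approach.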

mklement0
  • I think I disagree with your "longer than necessary" remark. This would make the transaction truly atomic and would be no different than using `select .. for update` in sql. If the update that OP wants to make to the file depends on the current contents of the file, then "longer than necessary" is actually necessary. Other than that, this is a good answer. – mhhollomon Dec 16 '18 at 10:58
  • @mhhollomon: It's the presence of a _lock file_ that would guarantee atomicity (see [this answer](https://stackoverflow.com/a/52840732/45375) linked to from the question post). And that lock file would be created just before _reading_ the document, yet the file wouldn't be locked to other _readers_, only to would-be _writers_. The exclusive lock is only obtained when it comes time to actually _replace_ / _rewrite_ the file, _after_ reading and modifying (of an in-memory / temporary file copy) has completed. That way, other _readers_ aren't locked out while the update is being _prepared_. – mklement0 Dec 16 '18 at 13:48
  • Ah, educational thread, at least for me. So, my thinking was that the [IO.File]::Open approach is nice because the OS enforces the lock, so I don't have to depend on all code respecting my sentinel file. Given that Notepad++, Notepad, and any other simple text editor wouldn't, that seems valuable. And given that lockout time is probably not a concern, I am curious how I could go about using $file to edit and save the XML. I have yet to make that work; the lock blocks all my attempts at getting at the XML. – Gordon Dec 16 '18 at 13:49
  • @Gordon: Please see the updated answer and the [answer to your previous question](https://stackoverflow.com/a/53805653/45375) that I've just added. – mklement0 Dec 16 '18 at 20:03