13

I have a data file, and from time to time I need to write a change to it. The change consists of updating information in more than one place: for example, changing some data near the end of the file and also changing some information near the start. I want the two separate writes to either both succeed or both fail; otherwise the file is left in an uncertain state and is effectively corrupted. Is there any built-in support for this scenario in .NET or in general?

If not, how do others solve this issue? How does a database on Windows solve it?

UPDATE: I do not want to use the Transactional NTFS capability, because it is not available on older versions of Windows such as XP and it is slow in the file overwrite scenario described above.

Phil Wright
  • Check this link: http://stackoverflow.com/questions/3267595/how-do-i-open-a-transacted-file-in-c – fatty Sep 04 '11 at 07:34
  • What filesystem(s) are you going to be running on? – Oded Sep 04 '11 at 07:35
  • Looking at Transactional NTFS, the documentation states that the performance of overwrites within a file is very slow. This is my exact scenario, so it looks like a very slow approach. How does a database manage the same situation without the use of TxNTFS? – Phil Wright Sep 04 '11 at 07:48
  • @PhilWright: How databases work is the subject of a huge body of literature (books, academic papers, ...) over pretty much the whole history of IT. Far more than can be covered in a SO answer. A quick web search should find some introductory content (if you can avoid the noise). Wikipedia should get you [started](http://en.wikipedia.org/wiki/ACID#Implementation). – Richard Sep 04 '11 at 07:50
  • I agree with the OP that this question is not a duplicate with the linked question. This question here is more general than the other, which is specific to Windows 7. – stakx - no longer contributing Sep 04 '11 at 08:36
  • @Richard, I think he means how would a database allow you to write **two different OS files** in a transaction? – Pacerier Jan 27 '17 at 18:05

2 Answers

5

Databases basically use a journal concept (at least the ones I'm aware of). The idea is that a write operation is recorded in a journal until the writer commits the transaction. (Of course this is only a basic description; in practice it is not that simple.)
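As a rough illustration of the journal idea applied to a single data file, here is a minimal Python sketch (the thread contains no code, so the language and every name here, such as `transactional_write`, `recover_journal` and the `.journal` suffix, are assumptions made for the example). It records the intended writes in a side file first, applies them in place, and replays the journal at startup if a crash interrupted the apply step. It assumes a single writer and leaves out details a real database handles, such as checksums on journal records and syncing the containing directory.

```python
import json
import os


def _apply_patches(data_path, patches):
    """Overwrite bytes in place; patches is a list of (offset, bytes)."""
    with open(data_path, "r+b") as f:
        for offset, payload in patches:
            f.seek(offset)
            f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # force the data to disk


def transactional_write(data_path, patches):
    """Apply all patches or (after recovery) none of them."""
    journal_path = data_path + ".journal"
    tmp_path = journal_path + ".tmp"
    record = json.dumps([[off, payload.hex()] for off, payload in patches])

    # 1. Write the journal under a temporary name and sync it, so a
    #    half-written journal is never mistaken for a committed one.
    with open(tmp_path, "wb") as f:
        f.write(record.encode("utf-8"))
        f.flush()
        os.fsync(f.fileno())

    # 2. Commit point: once the journal has its final name, the change
    #    will be completed even if the process dies right after.
    os.replace(tmp_path, journal_path)

    # 3. Apply the writes to the data file, then discard the journal.
    _apply_patches(data_path, patches)
    os.remove(journal_path)


def recover_journal(data_path):
    """Call at startup. Replays a leftover journal; returns True if it did."""
    journal_path = data_path + ".journal"
    if not os.path.exists(journal_path):
        return False
    with open(journal_path, "rb") as f:
        entries = json.loads(f.read().decode("utf-8"))
    # Replaying is safe to repeat: each patch rewrites fixed offsets.
    _apply_patches(data_path, [(off, bytes.fromhex(h)) for off, h in entries])
    os.remove(journal_path)
    return True
```

This is a redo-style journal: a crash before step 2 leaves the data file untouched, and a crash after it is repaired by `recover_journal` finishing the writes.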

In your case, the journal could be a copy of your file: you write the data to the copy and, if everything finishes successfully, substitute the original file with the copy.

The substitution is: rename the original file to mark it as old, then rename the copy to the original name.

If the substitution fails, this is a critical error that the application should handle via its fault-tolerance strategy. For example, it could inform the user about the failed save operation and try to recover. Note that at any moment you have both versions of your file: the one from when the write operation started and the one from when it finished.

We used this technique on past projects, on VS-IDE-like systems for industrial control, with pretty good success.
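The copy-and-rename approach can be sketched as follows, in Python for illustration only. The function names and the `.new` / `.old` suffixes are hypothetical choices for this example, and `os.replace` is used for the renames on the assumption that a rename within one volume is effectively atomic on the target filesystem.

```python
import os
import shutil


def save_with_swap(path, modify):
    """Run modify(file_object) on a copy of path, then swap the copy in."""
    work, backup = path + ".new", path + ".old"
    shutil.copyfile(path, work)
    try:
        with open(work, "r+b") as f:
            modify(f)             # all writes go to the copy
            f.flush()
            os.fsync(f.fileno())  # make sure the copy is on disk
    except BaseException:
        os.remove(work)           # failed write: original is untouched
        raise
    # Substitution: original becomes the old file, copy becomes the original.
    os.replace(path, backup)
    os.replace(work, path)


def recover_swap(path):
    """Call at startup to clean up after a crash during save_with_swap."""
    work, backup = path + ".new", path + ".old"
    # Crash between the two renames: roll back to the old version.
    if not os.path.exists(path) and os.path.exists(backup):
        os.replace(backup, path)
    # Any leftover working copy may be incomplete, so discard it.
    if os.path.exists(work):
        os.remove(work)
```

The cost is a full copy of the file on every save, which is the trade-off against the in-place overwrite; in exchange, a reader never sees a file with only one of the two changes.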

Tigran
  • Pretty cool, you might want to flesh-out this answer a bit more. For example, is there any particular thing to watch out for? – Pacerier Jan 27 '17 at 18:09
  • I found a related article: https://dutherenverseauborddelatable.wordpress.com/2014/02/05/is-my-data-on-the-disk-safety-properties-of-os-file-writeatomic/ – Pacerier Jan 30 '17 at 12:22
4

If you are using Windows version 6.0 or later (Vista/7/2008/2008R2), the NTFS filesystem supports transactions (including within a distributed transaction), but you will need to use P/Invoke to call the Win32 APIs (see this question).

If you need to run on older versions of Windows, or on non-NTFS partitions, you would need to implement the transactions yourself. This is decidedly non-trivial: you would need full ACID functionality while handling multiple processes (including remote access via shares) and surviving process and system crashes, even under the assumption that only your access methods will be used (some other process using the normal Win32 APIs would of course break things).

In this case a database will almost certainly be easier: there are a number of in-process databases (SQL Server Compact Edition, SQLite, ...), so a database doesn't require a server process.
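To illustrate the in-process database route, here is a small sketch using SQLite through Python's standard `sqlite3` module. The table layout (`records` with `name` and `value` columns) and the function names are assumptions invented for the example; the point is only that both updates sit inside one transaction, so either both are committed or both are rolled back.

```python
import sqlite3


def init_store(db_path):
    """Create a hypothetical two-row store standing in for the data file."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS records ("
                "name TEXT PRIMARY KEY, value TEXT NOT NULL)"
            )
            conn.execute("INSERT OR IGNORE INTO records VALUES ('header', 'h0')")
            conn.execute("INSERT OR IGNORE INTO records VALUES ('tail', 't0')")
    finally:
        conn.close()


def update_both(db_path, header_value, tail_value):
    """Change both records atomically."""
    conn = sqlite3.connect(db_path)
    try:
        # "with conn" commits if the block succeeds and rolls back
        # if any statement inside it raises.
        with conn:
            conn.execute(
                "UPDATE records SET value = ? WHERE name = 'header'",
                (header_value,),
            )
            conn.execute(
                "UPDATE records SET value = ? WHERE name = 'tail'",
                (tail_value,),
            )
    finally:
        conn.close()


def read_all(db_path):
    """Return the stored records as a dict of name -> value."""
    conn = sqlite3.connect(db_path)
    try:
        return dict(conn.execute("SELECT name, value FROM records"))
    finally:
        conn.close()
```

If the second `UPDATE` fails (for instance because it violates the `NOT NULL` constraint), the first one is undone as well, which is the both-or-neither behaviour the question asks for.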

Richard
  • It's worth noting that Microsoft is considering [deprecating TxF](http://stackoverflow.com/q/13420643/69809) in future Windows versions. – vgru Nov 13 '14 at 08:59