12

I'm writing software for Linux that actively works with the user's files in the background, concurrently with other applications that I don't control. I want to make sure my background application does not overwrite changes made by those other applications. The problem is that, unlike Windows, Linux doesn't provide a mandatory file locking capability, which makes it possible to ruin the user's work through race conditions. I'd like to avoid that.

So I wonder: are there file systems available on Linux that provide some kind of synchronization mechanism, such as a compare-and-swap operation, all-or-nothing transactions, or mandatory file locking (like in Windows)?

Gill Bates
  • You missed your problem statement. What exactly are you trying to do? – Maxim Egorushkin Sep 25 '18 at 11:59
  • @MaximEgorushkin, _"I want to make it not overwrite changes made by user"_ - isn't that enough? – Gill Bates Sep 25 '18 at 13:01
  • This problem statement has a simple solution: write into new files. – Maxim Egorushkin Sep 25 '18 at 13:57
  • The user processes may be using buffered I/O (e.g. stdio functions), so that user writes may be pending in the user-space buffers. But a file observer cannot possibly know that. Your problem as stated doesn't have a solution. – Maxim Egorushkin Sep 25 '18 at 13:58
  • See [Mandatory file lock on Linux](https://stackoverflow.com/questions/12062466/mandatory-file-lock-on-linux) – Barmar Sep 25 '18 at 21:08
  • @MaximEgorushkin, I worry more about the scenario where the user flushes the buffer and closes the file, but my application then overwrites it, so the user's changes are irreversibly lost. The situation you described concerns me less because most text editors can detect when an open file has changed on disk. – Gill Bates Sep 26 '18 at 08:07

4 Answers

4

I believe there are three possible solutions:

1) Make all programs use a custom file I/O library that implements the features you need. This may not be feasible if you do not have access to their source code. You may also consider using mmap so that changes are written to memory, with a background process that synchronizes dirty pages to existing or new files.

2) Replace the standard C/C++ libraries (such as libc.so) that the affected programs use. You can use ldd to find their library dependencies. You would need to modify the source code of the standard C/C++ libraries to implement the features you need. This may be too difficult for most people.

3) Create your own file system. You may refer to the many articles on the internet, such as https://kukuruku.co/post/writing-a-file-system-in-linux-kernel/. This is the best and cleanest solution.

Hope it helps.

yoonghm
  • I think I'll go with option 3): write a wrapper over a regular file system with FUSE that adds mandatory locking functionality or a CAS operation. – Gill Bates Sep 29 '18 at 12:46
3

Rename is atomic. It is up to your application to compare "eTags" of source and destination (possibly under appropriate locks) before deciding on calling rename().
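A minimal Python sketch of that idea, under stated assumptions: the "eTag" is taken to be the tuple (inode, size, mtime in nanoseconds), the lock is an advisory `flock` on a sidecar `.lock` file, and the names `file_tag` and `replace_if_unchanged` are hypothetical helpers, not an existing API. Processes that ignore the lock can still modify the file between the comparison and the rename, so this narrows the race rather than closing it.

```python
import fcntl
import os
import tempfile


def file_tag(path):
    """Return a cheap change tag (inode, size, mtime in ns), or None if path is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def replace_if_unchanged(path, expected_tag, new_data):
    """Replace path with new_data only if its tag still equals expected_tag.

    Returns True on success, False if the file changed after expected_tag was taken.
    """
    directory = os.path.dirname(os.path.abspath(path))
    lock_path = path + ".lock"  # hypothetical sidecar lock file
    with open(lock_path, "w") as lock_file:
        # Advisory lock: only cooperating processes honor it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        try:
            # Write the new content to a temporary file in the same directory.
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(new_data)
                tmp.flush()
                os.fsync(tmp.fileno())
            # Compare step: give up if someone changed the destination.
            if file_tag(path) != expected_tag:
                os.unlink(tmp_path)
                return False
            # Swap step: rename(2) replaces the destination atomically.
            os.replace(tmp_path, path)
            return True
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise
```

Typical use would be to call `file_tag(path)` when the file is first read, do the background work, and then call `replace_if_unchanged(path, tag, data)`; a `False` result means the work should be redone against the newer content.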

itisravi
  • The key thing for me is `compare-and-swap` capability. – Gill Bates Sep 21 '18 at 06:38
  • File systems don't provide compare and swap based on file attributes. Your program must do it. Also see https://stackoverflow.com/questions/28417765/compare-and-swap-over-posix-compliant-filesystem-objects – itisravi Sep 21 '18 at 11:39
  • _Your program must do it._ I need synchronization not only within the application but across all the applications that access the filesystem. – Gill Bates Sep 21 '18 at 12:21
  • mhmm, I'm not sure that is possible. Hopefully someone has an answer. – itisravi Sep 22 '18 at 06:39
  • I'm upvoting this answer because your question has potentially enormously complex use cases and a lot of corners, but the key to answering it lies in making the final write-to-file be atomic, and no one else stated that. Given an atomic method to replace, you can write to temporary locations, compare timestamps and collate changes in your program; you don't need to worry about acquiring a write lock or corrupting the definitive file. If you need read-for-update locking, you need a database. – joshstrike Sep 30 '18 at 05:32
3

mmap seems to have the kind of protection you're looking for: https://www.kernel.org/doc/html/v4.13/media/uapi/v4l/func-mmap.html

prot: The prot argument describes the desired memory protection. Regardless of the device type and the direction of data exchange it should be set to PROT_READ | PROT_WRITE, permitting read and write access to image buffers. Drivers should support at least this combination of flags.
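For reference, a small Python sketch of mapping an ordinary file with `PROT_READ | PROT_WRITE`. The scratch file and its contents are made up for illustration. Note that `prot` only sets the access rights of this mapping's pages; it does not lock the file against other processes.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (hypothetical example data).
fd, path = tempfile.mkstemp()
os.write(fd, b"hello world\n")

# Map the whole file, shared and read/write.
# prot controls what this mapping may do with its pages, nothing more.
mapping = mmap.mmap(fd, 0, flags=mmap.MAP_SHARED,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE)
mapping[0:5] = b"HELLO"  # modifies the shared pages in memory
mapping.flush()          # msync(2): push the dirty pages to the file
mapping.close()
os.close(fd)

# Other processes were free to open and rewrite the file the whole time.
with open(path, "rb") as f:
    content = f.read()
os.unlink(path)
```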

Yuri Yaryshev
2

Shared resources require protection from concurrent access because if multiple threads of execution access and manipulate the data at the same time, the threads may overwrite each other's changes or access data while it is in an inconsistent state. Concurrent access of shared data is a recipe for instability that often proves very hard to track down and debug—getting it right off the bat is important [1]

Threads can use the following two tools to synchronize their actions: mutexes and condition variables [2]

Mutexes (short for mutual exclusion) allow threads to synchronize their use of a shared resource, so that, for example, one thread doesn’t try to access a shared variable at the same time as another thread is modifying it.

Condition variables perform a complementary task: they allow threads to inform each other that a shared variable (or other shared resource) has changed state.
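A minimal sketch of both tools using Python's `threading` module (the variable and function names are illustrative). The mutex guards a shared list, and the condition variable lets the producer tell the consumer that the list changed. These primitives coordinate threads inside a single process.

```python
import threading

items = []                             # shared resource
lock = threading.Lock()                # mutex guarding `items`
not_empty = threading.Condition(lock)  # condition variable tied to the same mutex
received = []


def producer():
    for i in range(3):
        with not_empty:         # acquires the underlying mutex
            items.append(i)
            not_empty.notify()  # tell a waiting thread the state changed


def consumer():
    for _ in range(3):
        with not_empty:
            while not items:    # re-check the predicate after every wakeup
                not_empty.wait()
            received.append(items.pop(0))


threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The `while not items` loop matters: a woken thread must re-check the shared state because another thread may have consumed the item first, or the wakeup may be spurious.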

Adapted from:

[1] Love, R. (2005). Linux Kernel Development, Second Edition.

[2] Kerrisk, M. (2010). The Linux Programming Interface.

Tiago Martins Peres