
This is a UNIX-focused follow-up to my previous question here.

I was wondering whether a file descriptor opened by a process can safely be used in forked processes.

I've run a few tests with several hundred processes running at the same time, all writing continuously to the same file descriptor. I found that:

  • when fwrite() calls are up to 8192 bytes, all calls are perfectly serialized and the file is OK.
  • when fwrite() calls are more than 8192 bytes, the string is split into 8192 byte chunks that get randomly written to the file, which ends up corrupted.

I tried to use flock(), without success: every process tries to lock/unlock the same file descriptor, which does not make sense. The outcome is the same.

Is there a way to safely share the file descriptor between all the processes, and get all fwrite() calls properly serialized?
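
For reference, here is a minimal sketch of the kind of test I ran (the file name, process count, and message size are illustrative):

#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROC  100      /* number of writer processes */
#define MSGLEN 16384    /* > 8192, to trigger the interleaving */

int main(void)
{
    FILE *f = fopen("out.log", "w");  /* one descriptor, inherited by every child */
    if (!f)
        return 1;

    for (int i = 0; i < NPROC; i++) {
        if (fork() == 0) {
            char buf[MSGLEN + 1];
            memset(buf, 'A' + i % 26, MSGLEN);  /* each child writes its own letter */
            buf[MSGLEN] = '\n';
            for (int j = 0; j < 100; j++)
                fwrite(buf, 1, sizeof buf, f);
            fflush(f);
            _exit(0);
        }
    }
    while (wait(NULL) > 0)
        ;   /* reap all children */
    fclose(f);
    return 0;
}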

BenMorel
  • Related: http://stackoverflow.com/questions/14417806/are-posix-read-and-write-system-calls-atomic – Celada Jul 04 '13 at 16:47

2 Answers


The first thing you need to pay attention to is stdio buffers. Because you are using stdio (fwrite()) and not system calls directly (write()), you don't know when the data will actually get flushed to the file. To bypass this issue, you will have to flush stdio buffers inside your critical section, each time before you release the lock:

/* take the lock */
fwrite(foo, ...);
fflush(thefile);   /* flush before releasing the lock */
/* release the lock */

...or you can switch to using write() directly.

Now, on to the main issue: how to lock the file so that only one process at a time has exclusive access to the file.

You may or may not be able to use flock(). It depends on how the different processes obtained file descriptors to the same file. flock() locks are associated with an open file table entry. Because fork() and dup() create new file descriptors that refer to the same file table entry, they are the same object from flock()'s point of view and so you can't use flock() in this case. If, on the other hand, each process opened its own copy of the file with open() directly, then you can use flock().
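
For example, if each process opens the file itself, a sketch might look like this (locked_write() is an illustrative name, and error handling is omitted):

#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Only valid if this process open()ed fd itself, so the flock() lock
   lives on this process's own open file table entry. */
void locked_write(int fd, const char *msg)
{
    flock(fd, LOCK_EX);           /* block until we hold the exclusive lock */
    write(fd, msg, strlen(msg));
    flock(fd, LOCK_UN);
}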

fcntl()-style locking does not suffer from this problem (it suffers from a different type of problem instead!). fcntl() locks are per-process, so it doesn't matter how the processes obtained file descriptors to the same file.

So I suggest you try with fcntl()-style locking:

#include <fcntl.h>

struct flock ll;

/* lock */
ll.l_start = ll.l_len = 0;   /* offset 0, length 0: the whole file */
ll.l_whence = SEEK_SET;
ll.l_type = F_WRLCK;         /* exclusive (write) lock */
fcntl(fd, F_SETLKW /* or F_SETLK */, &ll);  /* F_SETLKW waits; F_SETLK fails immediately */

/* unlock */
ll.l_type = F_UNLCK;
fcntl(fd, F_SETLKW /* or F_SETLK */, &ll);
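
Putting the locking together with the stdio flushing from earlier, the whole critical section might be wrapped up like this (a sketch; locked_fwrite() is an illustrative name and error checking is omitted):

#include <fcntl.h>
#include <stdio.h>

void locked_fwrite(FILE *thefile, const void *buf, size_t len)
{
    struct flock ll = {0};       /* l_whence = SEEK_SET, l_start = l_len = 0: whole file */
    int fd = fileno(thefile);    /* fcntl() wants the underlying descriptor */

    ll.l_type = F_WRLCK;
    fcntl(fd, F_SETLKW, &ll);    /* block until we hold the exclusive lock */

    fwrite(buf, 1, len, thefile);
    fflush(thefile);             /* flush while we still hold the lock */

    ll.l_type = F_UNLCK;
    fcntl(fd, F_SETLKW, &ll);
}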
Celada
  • Thanks, can you please expand on what problems `fcntl()` suffers from? – BenMorel Jul 04 '13 at 17:02
  • The main problem with `fcntl()`'s per-process locking is that if two different parts of the same process operate on the same file (say, 2 different libraries that the main application uses but which don't know about each other) then they will conflict with each other. Since the lock is per-process, unlocking the file using *one* file descriptor releases the lock from the point of view of all *other* file descriptors referring to the same file too. See for example http://www.perlmonks.org/bare/?node_id=392435 – Celada Jul 04 '13 at 17:10
  • Ok, that makes sense! Alternatively, is it possible to know on a given system the max length at which `fwrite()` (or `write()`) calls are atomic (1023 in my case)? This is only for a file logger, so writes are expected to be quite small anyway, but it would be nice to know programmatically whether a given `write()` will be atomic or not given its length. – BenMorel Jul 04 '13 at 17:16
  • See http://stackoverflow.com/questions/14417806/are-posix-read-and-write-system-calls-atomic for details on that. Summary: if you are writing to pipes, the max size for atomic writes is `PIPE_BUF`, and if you are writing to anything else than pipes, then the writes are NEVER atomic. – Celada Jul 04 '13 at 17:31
  • Yes I've read this link, but the weird thing is that although I'm not writing to pipes, my experience shows that the writes *are* atomic up to 1023 bytes. Any explanation welcome! – BenMorel Jul 04 '13 at 17:49
  • Sorry, it was actually `8192` bytes, not `1023`. That was a mistake in my code that checked the output file. I've updated my question to reflect that. – BenMorel Jul 04 '13 at 18:18
  • No, your experience shows that the writes happen to be atomic under the specific conditions you happen to be testing under (including system load, scheduler configuration, operating system version, C library version, phase of the moon) but this doesn't tell you anything about what the system guarantees will be atomic. – Celada Jul 05 '13 at 00:28

The way you are using the file descriptor is perfectly safe. The writes are not synchronized, so the output is perhaps not what you expect, but there is nothing "unsafe" about that. If you are asking "how do I synchronize writes to a common file descriptor", the answer is obvious: use a synchronization mechanism.

In the case you describe, in which one file descriptor is shared amongst multiple processes, perhaps the easiest thing to do is to pass a token on a second pipe. Have a pipe that is shared amongst all the processes and write a single character into it. When a process wants to write to the first fd, it attempts to read a character from the pipe. When the read succeeds, it proceeds with the write (making sure to flush) and then writes a character back into the pipe. While any process is in the critical section (i.e., it has read the token but not yet written it back), every other process blocks on the read until the writer completes, as long as you don't forget to flush!

This is a fairly expensive scheme, since it requires each process to keep two additional file descriptors open and the upper bound on the number of available file descriptors is fairly low (typically on the order of 1024), but it is often very effective.
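
A sketch of the scheme (names are illustrative; error handling omitted):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int tok[2];   /* the token pipe, shared by all processes */

void token_init(void)   /* call once, before fork()ing the writers */
{
    pipe(tok);
    write(tok[1], "x", 1);   /* exactly one token circulates */
}

void locked_fwrite(FILE *f, const char *msg)
{
    char c;
    read(tok[0], &c, 1);     /* take the token; blocks while another process holds it */
    fwrite(msg, 1, strlen(msg), f);
    fflush(f);               /* flush before giving the token back! */
    write(tok[1], &c, 1);    /* return the token */
}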

William Pursell
  • Very interesting explanation, thanks. That being said, I'm only using the shared file descriptor to write to a log, one line at a time. So it's very unlikely that I need a locking mechanism at all, provided that I stay under these 8192 bytes. Do you know where this number (that I got empirically) is coming from? – BenMorel Jul 04 '13 at 22:15
  • `write()`s are guaranteed to be atomic up to a certain value (`PIPE_BUF`), depending on the system. Very likely, in your case, that value is 8192. It is possible that the actual value is smaller (512 and 4096 are common). Check the value of `PIPE_BUF`. Search for "atomic" in http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html – William Pursell Jul 05 '13 at 00:32
  • In his answer and comments below, @Celada says that `PIPE_BUF` only applies to pipes. Why does it apply when I'm writing to a file then? – BenMorel Jul 05 '13 at 09:32
  • Celada is correct; the standard only guarantees atomicity for pipes. Very likely, the implementation uses similar mechanisms for regular files and you are getting atomic writes as a side effect, but this is not behavior that can be relied on. – William Pursell Jul 05 '13 at 13:43