We are writing a multithreaded application that does a bunch of bit twiddling and writes the binary data to disk. Is it possible to have each thread std::fopen the same file for writing at the same time? The reasoning is that each thread could do its work and have its own access to the writable file.
-
Far as I know, writing to a hard-disk is not really a threadable thing (as in, only one thread can do it at a time). I could be wrong on that, and others might be more able to shed light on this. It might also be an OS-specific thing. To closer address your question, I believe most OSs are going to lock out subsequent write access to the same file if a handle to that file is already open. – David Peterson Mar 18 '14 at 02:57
-
http://stackoverflow.com/questions/7842511/safe-to-have-multiple-processes-writing-to-the-same-file-at-the-same-time-cent – oakad Mar 18 '14 at 03:17
-
Concurrent access to a single file: solved problem. Long, long ago solved. – Carey Gregory Mar 18 '14 at 03:36
-
Well, there's not enough information to answer this question. Are all these threads appending data or writing to random locations? Without knowing the answer to that question nobody can answer your question. – Carey Gregory Mar 18 '14 at 05:46
4 Answers
std::fstream has functionality defined in terms of the C stdio library. I would be surprised if it were actually specified, but the most likely behavior from opening the same file twice is multiple internal buffers bound to the same file descriptor.
The usual way to simultaneously write to multiple points in the same file is POSIX pwrite or writev. This functionality is not wrapped by C stdio, and by extension not by C++ iostreams either. But having multiple descriptors to the same filesystem file might work too.
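To make the pwrite idea concrete, here is a minimal sketch, assuming a POSIX system and fixed-size chunks so each thread owns a disjoint byte range. The helper name `write_chunks_in_parallel` and its parameters are made up for illustration; only `open`, `pwrite`, and `close` are real POSIX calls. Because pwrite takes an explicit offset, the threads never touch a shared file position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>
#include <fcntl.h>    // open
#include <unistd.h>   // pwrite, read, close

// Hypothetical helper: thread i writes chunks[i] at offset i * chunk_size
// through one shared descriptor. Returns true if every chunk was written in full.
bool write_chunks_in_parallel(const std::string& path,
                              const std::vector<std::string>& chunks,
                              std::size_t chunk_size) {
    int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    // One status byte per thread (not vector<bool>, whose elements share storage).
    std::vector<char> ok(chunks.size(), 0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        assert(chunks[i].size() == chunk_size);  // assumption: equal-sized chunks
        workers.emplace_back([&, i] {
            const char* p = chunks[i].data();
            std::size_t left = chunk_size;
            off_t off = static_cast<off_t>(i * chunk_size);
            while (left > 0) {                   // pwrite may write fewer bytes than asked
                ssize_t n = ::pwrite(fd, p, left, off);
                if (n <= 0) return;              // treat any error as failure; ok[i] stays 0
                p += n;
                left -= static_cast<std::size_t>(n);
                off += n;
            }
            ok[i] = 1;
        });
    }
    for (auto& t : workers) t.join();
    ::close(fd);
    for (char c : ok) if (!c) return false;
    return true;
}
```

Calling it with the chunks "AAAA", "BBBB", "CCCC" and a chunk size of 4 should leave a 12-byte file with the chunks in index order, regardless of which thread runs first. This only works when the offsets can be computed up front; for variable-sized output you would need to reserve ranges some other way.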
Edit: POSIX open called twice on the same file in Mac OS X produces different file descriptors. So, it might work on your platform, but it's probably not portable.
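If you want to check this on your own platform, a small probe along these lines should do, assuming a POSIX system. The struct `TwoOpens` and the function `probe_two_opens` are hypothetical names; the sketch opens one path twice, writes through the first descriptor only, and reports both file offsets so you can see whether each open has its own position.

```cpp
#include <cassert>
#include <string>
#include <fcntl.h>    // open
#include <unistd.h>   // write, lseek, close

// Hypothetical result type: the two descriptors and their offsets after one write.
struct TwoOpens {
    int fd1;
    int fd2;
    off_t offset1;
    off_t offset2;
};

// Hypothetical probe: open `path` twice, write 4 bytes via the first descriptor,
// then query the current offset of each descriptor.
TwoOpens probe_two_opens(const std::string& path) {
    assert(!path.empty());
    TwoOpens r{-1, -1, -1, -1};
    r.fd1 = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    r.fd2 = ::open(path.c_str(), O_WRONLY);
    if (r.fd1 >= 0 && r.fd2 >= 0 && ::write(r.fd1, "abcd", 4) == 4) {
        r.offset1 = ::lseek(r.fd1, 0, SEEK_CUR);  // moved by the write
        r.offset2 = ::lseek(r.fd2, 0, SEEK_CUR);  // untouched by the write
    }
    if (r.fd1 >= 0) ::close(r.fd1);
    if (r.fd2 >= 0) ::close(r.fd2);
    return r;
}
```

On a POSIX-conforming system I would expect two distinct descriptors with offsets 4 and 0, since each open creates its own open file description. Note this says nothing about stdio or iostream buffering on top of those descriptors.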
A definitive answer would require connecting these dots:
- Where the C++ standard specifies that fstream works like a C (stdio) stream.
- Where the C standard defines when a stream is created (fopen is only defined to associate a stream with a newly-opened file).
- Where the POSIX standard defines its requirements for C streams.
- Where POSIX defines the case of opening the same file twice.
This is a bit more research than I'm up for at the moment, but I'm sure someone out there has done the legwork.

-
Ugh… someone had added well-researched comments here adding much more depth, but a moderator deleted them. I *seem to recall* that the OS X behavior is guaranteed by POSIX. – Potatoswatter Nov 09 '16 at 14:41
I've written some high-speed multithreaded data capture utilities, but the output went to separate files on separate hard drives, and the files were then post-processed.

I seem to recall that you can have fopen not lock the file so in theory that would allow different threads to all write to the same file with independent handles. In practice you're going to run into other issues, namely concurrency. Your threads are almost certainly going to step all over each other and scramble the results unless you implement some synchronization. And if you have to do that, why not just use one handle across all the threads?
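As a rough illustration of the "one handle across all the threads" option, here is a minimal sketch with a mutex around a single FILE*. The names `SharedFile` and `append_from_threads` are invented for this example, and it assumes every thread appends whole records rather than writing to arbitrary positions.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical wrapper: one FILE* shared by all threads, guarded by a mutex
// so that each record lands in the file as one contiguous piece.
class SharedFile {
public:
    explicit SharedFile(const std::string& path)
        : file_(std::fopen(path.c_str(), "wb")) {}
    ~SharedFile() { if (file_) std::fclose(file_); }
    SharedFile(const SharedFile&) = delete;
    SharedFile& operator=(const SharedFile&) = delete;

    bool is_open() const { return file_ != nullptr; }

    // Append one record; returns false if the file is not open or the write is short.
    bool append(const std::string& record) {
        std::lock_guard<std::mutex> lock(m_);
        if (!file_) return false;
        return std::fwrite(record.data(), 1, record.size(), file_) == record.size();
    }

private:
    std::FILE* file_;
    std::mutex m_;
};

// Hypothetical driver: n_threads threads each append `record` per_thread times.
bool append_from_threads(SharedFile& out, int n_threads, int per_thread,
                         const std::string& record) {
    assert(n_threads > 0 && per_thread >= 0);
    std::vector<char> ok(n_threads, 1);          // one status byte per thread
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&, t] {
            for (int k = 0; k < per_thread; ++k)
                if (!out.append(record)) ok[t] = 0;
        });
    }
    for (auto& w : workers) w.join();
    for (char c : ok) if (!c) return false;
    return true;
}
```

The order in which records from different threads appear is still whatever the scheduler produces; the lock only keeps individual records from being interleaved.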

I/O access is not a parallelizable task (it can't be: you simply can't send two or more data addresses over the device bus at the same time), so you'd better implement a queue to which many threads post their chunks of data and from which one single consumer actually writes them to disk.
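A minimal sketch of that queue, assuming chunks are simply appended in the order they are dequeued. The class name `WriterQueue` and its methods are hypothetical; it uses only the standard library (std::mutex, std::condition_variable, std::queue, and C stdio).

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class: any number of producer threads push chunks,
// one internal consumer thread is the only one that touches the file.
class WriterQueue {
public:
    explicit WriterQueue(const std::string& path)
        : file_(std::fopen(path.c_str(), "wb")),
          consumer_([this] { run(); }) {}
    ~WriterQueue() { close(); }
    WriterQueue(const WriterQueue&) = delete;
    WriterQueue& operator=(const WriterQueue&) = delete;

    // Called by producers; must not be called after close().
    void push(std::string chunk) {
        {
            std::lock_guard<std::mutex> lock(m_);
            assert(!done_ && "push after close");
            q_.push(std::move(chunk));
        }
        cv_.notify_one();
    }

    // Let the consumer drain what is queued, then stop it and close the file.
    // Call from one thread only, after all producers have finished.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            if (done_) return;
            done_ = true;
        }
        cv_.notify_one();
        consumer_.join();
        if (file_) std::fclose(file_);
        file_ = nullptr;
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(m_);
            cv_.wait(lock, [this] { return done_ || !q_.empty(); });
            if (q_.empty()) return;              // done_ is set and nothing is left
            std::string chunk = std::move(q_.front());
            q_.pop();
            lock.unlock();                       // do the slow write outside the lock
            if (file_) std::fwrite(chunk.data(), 1, chunk.size(), file_);
        }
    }

    std::FILE* file_;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool done_ = false;
    std::thread consumer_;                       // declared last so it starts after the rest
};
```

Producers just call `push` with their finished chunk and go back to work; the queue here is unbounded, so if the threads produce data faster than the disk can take it you would want to add a size limit.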

-
Sure you can't: "With TCQ, the drive can make its own decisions about how to order the requests". TCQ is just a technology which improves drive performance by sorting and packing the I/O requests in the most efficient way... there's no magic such as concurrent writes :-) – Gianluca Ghettini Mar 18 '14 at 07:25
-
… and the drive can't have multiple requests on hand to sort optimally unless you have some way to send it multiple requests simultaneously. For most programs, that would be done via multiple threads calling read() or write(). With the single I/O thread method you proposed, the drive has no way of knowing what the I/O thread has queued up to send it in the future, so it can't take advantage of that information to do TCQ. – Jeremy Friesner Mar 18 '14 at 17:47
-
Single-threaded I/O doesn't necessarily mean the drive cannot pack requests and sort them out... most of the time I/O reads and writes are buffered. Anyway, my statement continues to hold: you cannot have multiple I/O accesses at the very same moment. – Gianluca Ghettini Mar 19 '14 at 04:11