I need to sort a large binary file of size M using t threads. Records in the file are all of equal size. The task explicitly says that the amount of memory I can allocate is m, which is much smaller than M. The hard drive is also guaranteed to have at least 2 * M of free space. This calls for a merge sort, of course, but it turned out to be not so obvious. I see three different approaches here:
A. Map the files input, temp1 and temp2 into memory. Perform the merge sort input -> temp1 -> temp2 -> temp1 -> ... until one of the temp files comes out sorted. Threads only contend for selecting the next portion of work; there is no contention on reads/writes.
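Roughly what I mean for A, as a sketch (error handling is minimal, the file names and sizes are placeholders, and the real size would come from fstat()):

```c
/* Sketch of approach A: map input and both temp files, then let threads merge
 * disjoint ranges.  File names, sizes and the work scheduling are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static void *map_file(const char *path, size_t size, int writable)
{
    int fd = open(path, writable ? O_RDWR | O_CREAT : O_RDONLY, 0644);
    if (fd < 0) { perror("open"); exit(1); }
    if (writable && ftruncate(fd, (off_t)size) != 0) { perror("ftruncate"); exit(1); }
    void *p = mmap(NULL, size, writable ? PROT_READ | PROT_WRITE : PROT_READ,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd);                      /* the mapping stays valid after close */
    return p;
}

int main(void)
{
    size_t M = (size_t)1 << 30;     /* placeholder: the real size comes from fstat() */
    void *in   = map_file("input", M, 0);
    void *tmp1 = map_file("temp1", M, 1);
    void *tmp2 = map_file("temp2", M, 1);
    /* ... t threads repeatedly claim disjoint [off, off + chunk) ranges from a
     * shared counter and merge in -> tmp1 -> tmp2 -> tmp1 -> ... pass by pass ... */
    (void)in; (void)tmp1; (void)tmp2;
    return 0;
}
```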
B. fopen the 3 files t times each, so each thread gets 3 FILE pointers, one per file. Again, threads contend only for the next portion of work; reads and writes should proceed in parallel.
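What I have in mind for B, as a sketch (thread count, chunk size and file names are placeholders; the temp files are assumed to be pre-created at full size, and error handling is omitted):

```c
/* Sketch of approach B: every thread opens its own FILE* per file and works on
 * its own byte range.  No handle is ever shared between threads. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct worker {
    FILE *in, *t1, *t2;   /* private handles, never shared between threads */
    long  off, len;       /* this thread's byte range                      */
};

static void *run(void *arg)
{
    struct worker *w = arg;
    char *buf = malloc((size_t)w->len);      /* at most about m / t in reality */
    fseek(w->in, w->off, SEEK_SET);
    fread(buf, 1, (size_t)w->len, w->in);
    /* ... sort/merge buf ... */
    fseek(w->t1, w->off, SEEK_SET);
    fwrite(buf, 1, (size_t)w->len, w->t1);
    free(buf);
    return NULL;
}

int main(void)
{
    enum { T = 4 };                          /* placeholder thread count */
    long chunk = 1L << 20;                   /* placeholder range size   */
    pthread_t tid[T];
    struct worker w[T];
    for (int i = 0; i < T; i++) {
        w[i] = (struct worker){ fopen("input", "rb"), fopen("temp1", "r+b"),
                                fopen("temp2", "r+b"), i * chunk, chunk };
        pthread_create(&tid[i], NULL, run, &w[i]);
    }
    for (int i = 0; i < T; i++) pthread_join(tid[i], NULL);
    return 0;
}
```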
C. fopen the 3 files once each and keep them under mutexes; all threads work in parallel, but to grab more work, to read, or to write they lock the respective mutex.
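And a sketch of C (again, sizes and names are placeholders, temp2 would be handled the same way as temp1, error handling omitted):

```c
/* Sketch of approach C: shared FILE*s guarded by mutexes; only the in-memory
 * sort of a chunk actually runs in parallel. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define TOTAL ((long)1 << 30)     /* placeholder for the real file size M   */
#define CHUNK ((long)1 << 20)     /* placeholder, roughly m / thread count  */

static FILE *g_in, *g_out;        /* temp2 would be handled like temp1      */
static pthread_mutex_t m_in   = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t m_out  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t m_work = PTHREAD_MUTEX_INITIALIZER;
static long g_next;               /* next unclaimed offset                  */

static void *run(void *unused)
{
    (void)unused;
    char *buf = malloc(CHUNK);
    for (;;) {
        pthread_mutex_lock(&m_work);           /* grab the next portion of work */
        long off = g_next; g_next += CHUNK;
        pthread_mutex_unlock(&m_work);
        if (off >= TOTAL) break;

        pthread_mutex_lock(&m_in);             /* serialized read  */
        fseek(g_in, off, SEEK_SET);
        fread(buf, 1, CHUNK, g_in);
        pthread_mutex_unlock(&m_in);

        /* ... sort buf; this is the only part that runs in parallel ... */

        pthread_mutex_lock(&m_out);            /* serialized write */
        fseek(g_out, off, SEEK_SET);
        fwrite(buf, 1, CHUNK, g_out);
        pthread_mutex_unlock(&m_out);
    }
    free(buf);
    return NULL;
}

int main(void)
{
    g_in  = fopen("input", "rb");
    g_out = fopen("temp1", "r+b");             /* assumed pre-created at full size */
    pthread_t tid[4];
    for (int i = 0; i < 4; i++) pthread_create(&tid[i], NULL, run, NULL);
    for (int i = 0; i < 4; i++) pthread_join(tid[i], NULL);
    return 0;
}
```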
Notes:
In real life I would choose A for sure. But doesn't it defeat the whole purpose of having a limited buffer? (In other words, isn't it cheating?) With such an approach I could even radix sort the whole file in place without any extra buffer. Also, this solution is Linux-specific; I think Linux is implied by the conversation, but it's not stated explicitly in the task description.
Regarding B, I think it works on Linux but isn't portable; see the Linux note above.
Regarding C, it's portable, but I am not sure how to optimize it (e.g. 8 threads with a small enough m will just bump into each other waiting for their turn in the queue, then read/write a tiny portion of data, instantly sort it, and bump into each other again; IMO it's unlikely to work faster than 1 thread).
Questions:
- Which solution is a better match for the task?
- Which solution is a better design in real life (assuming Linux)?
- Does B work? In other words, is it legal to open a file multiple times and write to different parts of it in parallel?
- Any alternative approaches?