Bottom up merge sort can work in almost no memory, with the data being sorted kept on external devices (e.g. tapes). I think it's how it used to work in the 50s and 60s, with those big tall tape machines we see in the old movies.
In other words bottom-up merge is an on line algorithm. It can work with no random access at all.
We first process the input tape and write out sorted chunks of 2 elements each as we read the input tape, writing to two tapes, alternating between the two. Then we read by chunks of 2 from the two tapes we've just written to, and write out merged chunks of 4, while alternating between the output tapes. Then we switch the input and the output again and proceed by chunks of 8, etc. In the last run only one tape gets written to - that's the result.
This can be emulated on modern RAM hardware with only O(1) additional memory, with the merging done in-place. To handle the "short dangling tails" problem (for lengths like 2^n+k
, for small k
), the sweep along the input sequence can be made in a forward or backward direction alternatingly.